[python-hdf5storage] 52/84: Updated documentation for the 0.1.5 release.
Ghislain Vaillant
ghisvail-guest at moszumanska.debian.org
Mon Feb 29 08:25:03 UTC 2016
ghisvail-guest pushed a commit to annotated tag 0.1.10
in repository python-hdf5storage.
commit 97364e7b44a5bd8baa213df3e8f9884a67fed8ef
Author: Freja Nordsiek <fnordsie at gmail.com>
Date: Sun May 17 20:05:43 2015 -0400
Updated documentation for the 0.1.5 release.
---
README.rst | 98 +++++++++++++++++++++++-------------------
doc/source/storage_format.rst | 99 +++++++++++++++++++++++--------------------
2 files changed, 109 insertions(+), 88 deletions(-)
diff --git a/README.rst b/README.rst
index 1fe98c7..dbb3491 100644
--- a/README.rst
+++ b/README.rst
@@ -43,15 +43,13 @@ Python 2
This package was designed and written for Python 3, with Python 2.7 and
2.6 support added later. This does mean that a few things are a little
-clunky in Python 2. Examples include supporting ``unicode`` keys for
-dictionaries, not being able to import a structured ``numpy.ndarray`` if
-any of its fields contain characters outside of ASCII, the ``int`` and
-``long`` types both being mapped to the Python 3 ``int`` type, etc. The
-storage format's metadata looks more familiar from a Python 3 standpoint
-as well.
-
-All documentation and examples are written in terms of Python 3 syntax
-and types. Important Python 2 information beyond direct translations of
+clunky in Python 2. Examples include requiring ``unicode`` keys for
+dictionaries, the ``int`` and ``long`` types both being mapped to the
+Python 3 ``int`` type, etc. The storage format's metadata looks more
+familiar from a Python 3 standpoint as well.
+
+The documentation is written in terms of Python 3 syntax and types
+primarily. Important Python 2 information beyond direct translations of
syntax and types will be pointed out.
Hierarchal Data Format 5 (HDF5)
@@ -108,19 +106,19 @@ Type Version Converted to Class Version
=============== ======= ========================== =========== ==============
bool 0.1 np.bool\_ or np.uint8 logical 0.1 [1]_
None 0.1 ``np.float64([])`` ``[]`` 0.1
-int [2]_ 0.1 np.int64 [2]_ int64 0.1
-long [3]_ 0.1 np.int64 int64 0.1
+int [2]_ [3]_ 0.1 np.int64 [2]_ int64 0.1
+long [3]_ [4]_ 0.1 np.int64 int64 0.1
float 0.1 np.float64 double 0.1
complex 0.1 np.complex128 double 0.1
-str 0.1 np.uint32/16 char 0.1 [4]_
-bytes 0.1 np.bytes\_ or np.uint16 char 0.1 [5]_
-bytearray 0.1 np.bytes\_ or np.uint16 char 0.1 [5]_
+str 0.1 np.uint32/16 char 0.1 [5]_
+bytes 0.1 np.bytes\_ or np.uint16 char 0.1 [6]_
+bytearray 0.1 np.bytes\_ or np.uint16 char 0.1 [6]_
list 0.1 np.object\_ cell 0.1
tuple 0.1 np.object\_ cell 0.1
set 0.1 np.object\_ cell 0.1
frozenset 0.1 np.object\_ cell 0.1
cl.deque 0.1 np.object\_ cell 0.1
-dict 0.1 struct 0.1 [6]_
+dict 0.1 struct 0.1 [7]_
np.bool\_ 0.1 logical 0.1
np.void 0.1
np.uint8 0.1 uint8 0.1
@@ -131,51 +129,53 @@ np.uint8 0.1 int8 0.1
np.int16 0.1 int16 0.1
np.int32 0.1 int32 0.1
np.int64 0.1 int64 0.1
-np.float16 [7]_ 0.1
+np.float16 [8]_ 0.1
np.float32 0.1 single 0.1
np.float64 0.1 double 0.1
np.complex64 0.1 single 0.1
np.complex128 0.1 double 0.1
-np.str\_ 0.1 np.uint32/16 char/uint32 0.1 [4]_
-np.bytes\_ 0.1 np.bytes\_ or np.uint16 char 0.1 [5]_
+np.str\_ 0.1 np.uint32/16 char/uint32 0.1 [5]_
+np.bytes\_ 0.1 np.bytes\_ or np.uint16 char 0.1 [6]_
np.object\_ 0.1 cell 0.1
-np.ndarray 0.1 [8]_ [9]_ [8]_ [9]_ 0.1 [8]_ [10]_
-np.matrix 0.1 [8]_ [8]_ 0.1 [8]_
-np.chararray 0.1 [8]_ [8]_ 0.1 [8]_
-np.recarray 0.1 structured np.ndarray [8]_ [9]_ 0.1 [8]_
+np.ndarray 0.1 [9]_ [10]_ [9]_ [10]_ 0.1 [9]_ [11]_
+np.matrix 0.1 [9]_ [9]_ 0.1 [9]_
+np.chararray 0.1 [9]_ [9]_ 0.1 [9]_
+np.recarray 0.1 structured np.ndarray [9]_ [10]_ 0.1 [9]_
=============== ======= ========================== =========== ==============
.. [1] Depends on the selected options. Always ``np.uint8`` when doing
MATLAB compatibility, or if the option is explicitly set.
.. [2] In Python 2.x, it may be read back as a ``long`` if it can't fit
in the size of an ``int``.
-.. [3] Type only found in Python 2.x. Python 2.x's ``long`` and ``int``
+.. [3] Must be small enough to fit into an ``np.int64``.
+.. [4] Type found only in Python 2.x. Python 2.x's ``long`` and ``int``
are unified into a single ``int`` type in Python 3.x. Read as an
``int`` in Python 3.x.
-.. [4] Depends on the selected options and whether it can be converted
+.. [5] Depends on the selected options and whether it can be converted
to UTF-16 without using doublets. If the option is explicitly set
- (or implicitly through doing MATLAB compatibility) and it can be
+ (or implicitly when doing MATLAB compatibility) and it can be
converted to UTF-16 without losing any characters that can't be
represented in UTF-16 or using UTF-16 doublets (MATLAB doesn't
support them), then it is written as ``np.uint16`` in UTF-16
encoding. Otherwise, it is stored at ``np.uint32`` in UTF-32
encoding.
-.. [5] Depends on the selected options. If the option is explicitly set
- (or implicitly through doing MATLAB compatibility), it will be
- stored as ``np.uint16`` in UTF-16 encoding. Otherwise, it is just
- written as ``np.bytes_``.
-.. [6] All keys must be ``str`` in Python 3 or ``unicode`` in Python 2.
-.. [7] ``np.float16`` are not supported for h5py versions before
+.. [6] Depends on the selected options. If the option is explicitly set
+ (or implicitly when doing MATLAB compatibility), it will be
+ stored as ``np.uint16`` in UTF-16 encoding (unless it has
+ non-ASCII characters, in which case a ``NotImplementedError`` is
+ thrown). Otherwise, it is just written as ``np.bytes_``.
+.. [7] All keys must be ``str`` in Python 3 or ``unicode`` in Python 2.
+.. [8] ``np.float16`` are not supported for h5py versions before
``2.2``.
-.. [8] Container types are only supported if their underlying dtype is
+.. [9] Container types are only supported if their underlying dtype is
supported. Data conversions are done based on its dtype.
-.. [9] Structured ``np.ndarray`` s (have fields in their dtypes) can be
- written as an HDF5 COMPOUND type or as an HDF5 Group with Datasets
- holding its fields (either the values directly, or as an HDF5
- Reference array to the values for the different elements of the
- data). Can only be written as an HDF5 COMPOUND type if none of
- its field are of dtype ``'object'``.
-.. [10] Structured ``np.ndarray`` s with no elements, when written like a
+.. [10] Structured ``np.ndarray`` s (have fields in their dtypes) can be
+ written as an HDF5 COMPOUND type or as an HDF5 Group with
+ Datasets holding its fields (either the values directly, or as
+ an HDF5 Reference array to the values for the different elements
+ of the data). Can only be written as an HDF5 COMPOUND type if
+ none of its fields are of dtype ``'object'``.
+.. [11] Structured ``np.ndarray`` s with no elements, when written like a
structure, will not be read back with the right dtypes for their
fields (will all become 'object').
@@ -187,8 +187,8 @@ type they are read as.
MATLAB Class Version Python Type
=============== ======= =================================
logical 0.1 np.bool\_
-single 0.1 np.float32 or np.complex64 [11]_
-double 0.1 np.float64 or np.complex128 [11]_
+single 0.1 np.float32 or np.complex64 [12]_
+double 0.1 np.float64 or np.complex128 [12]_
uint8 0.1 np.uint8
uint16 0.1 np.uint16
uint32 0.1 np.uint32
@@ -203,13 +203,25 @@ cell 0.1 np.object\_
canonical empty 0.1 ``np.float64([])``
=============== ======= =================================
-.. [11] Depends on whether there is a complex part or not.
+.. [12] Depends on whether there is a complex part or not.
Versions
========
-0.1.5. Bugfix release fixing a bug where an ``int`` could be stored that is too big to fit into an ``int`` when read back in Python 2.x. When it is too big, it is converted to a ``long``.
+0.1.5. Bugfix release fixing the following bugs.
+ * Fixed bug where an ``int`` could be stored that is too big to
+ fit into an ``int`` when read back in Python 2.x. When it is
+ too big, it is converted to a ``long``.
+ * Fixed a bug where an ``int`` or ``long`` that is too big to
+ fit into an ``np.int64`` raised the wrong exception.
+ * Fixed bug where field names for structured ``np.ndarray`` with
+ non-ASCII characters (assumed to be UTF-8 encoded in
+ Python 2.x) can't be read or written properly.
+ * Fixed bug where ``np.bytes_`` with non-ASCII characters
+ were converted incorrectly to UTF-16 when that option is set
+ (set implicitly when doing MATLAB compatibility). Now, it throws
+ a ``NotImplementedError``.
0.1.4. Bugfix release fixing the following bugs. Thanks goes to `mrdomino <https://github.com/mrdomino>`_ for writing the bug fixes.
* Fixed bug where ``dtype`` is used as a keyword parameter of
diff --git a/doc/source/storage_format.rst b/doc/source/storage_format.rst
index 08a873a..932a744 100644
--- a/doc/source/storage_format.rst
+++ b/doc/source/storage_format.rst
@@ -59,19 +59,19 @@ Type Version Converted to Group or Dataset
=============== ======= ==================================== =====================
bool 0.1 np.bool\_ or np.uint8 [1]_ Dataset
None 0.1 ``np.float64([])`` Dataset
-int [2]_ 0.1 np.int64 [2]_ Dataset
-long [3]_ 0.1 np.int64 Dataset
+int [2]_ [3]_ 0.1 np.int64 [2]_ Dataset
+long [3]_ [4]_ 0.1 np.int64 Dataset
float 0.1 np.float64 Dataset
complex 0.1 np.complex128 Dataset
-str 0.1 np.uint32/16 [4]_ Dataset
-bytes 0.1 np.bytes\_ or np.uint16 [5]_ Dataset
-bytearray 0.1 np.bytes\_ or np.uint16 [5]_ Dataset
+str 0.1 np.uint32/16 [5]_ Dataset
+bytes 0.1 np.bytes\_ or np.uint16 [6]_ Dataset
+bytearray 0.1 np.bytes\_ or np.uint16 [6]_ Dataset
list 0.1 np.object\_ Dataset
tuple 0.1 np.object\_ Dataset
set 0.1 np.object\_ Dataset
frozenset 0.1 np.object\_ Dataset
cl.deque 0.1 np.object\_ Dataset
-dict [6]_ 0.1 Group
+dict [7]_ 0.1 Group
np.bool\_ 0.1 not or np.uint8 [1]_ Dataset
np.void 0.1 Dataset
np.uint8 0.1 Dataset
@@ -82,18 +82,18 @@ np.uint8 0.1 Dataset
np.int16 0.1 Dataset
np.int32 0.1 Dataset
np.int64 0.1 Dataset
-np.float16 [7]_ 0.1 Dataset
+np.float16 [8]_ 0.1 Dataset
np.float32 0.1 Dataset
np.float64 0.1 Dataset
np.complex64 0.1 Dataset
np.complex128 0.1 Dataset
-np.str\_ 0.1 np.uint32/16 [4]_ Dataset
-np.bytes\_ 0.1 np.bytes\_ or np.uint16 [5]_ Dataset
+np.str\_ 0.1 np.uint32/16 [5]_ Dataset
+np.bytes\_ 0.1 np.bytes\_ or np.uint16 [6]_ Dataset
np.object\_ 0.1 Dataset
-np.ndarray 0.1 not or Group of contents [8]_ Dataset or Group [8]_
+np.ndarray 0.1 not or Group of contents [9]_ Dataset or Group [9]_
np.matrix 0.1 np.ndarray Dataset
-np.chararray 0.1 np.bytes\_ or np.uint16/32 [4]_ [5]_ Dataset
-np.recarray 0.1 structured np.ndarray [8]_ Dataset or Group [8]_
+np.chararray 0.1 np.bytes\_ or np.uint16/32 [5]_ [6]_ Dataset
+np.recarray 0.1 structured np.ndarray [9]_ Dataset or Group [9]_
=============== ======= ==================================== =====================
.. [1] Depends on the selected options. Always ``np.uint8`` when
@@ -101,10 +101,11 @@ np.recarray 0.1 structured np.ndarray [8]_ Dataset or Group
``matlab_compatible == True``).
.. [2] In Python 2.x, it may be read back as a ``long`` if it can't fit
in the size of an ``int``.
-.. [3] Type only found in Python 2.x. Python 2.x's ``long`` and ``int``
+.. [3] Must be small enough to fit into an ``np.int64``.
+.. [4] Type only found in Python 2.x. Python 2.x's ``long`` and ``int``
are unified into a single ``int`` type in Python 3.x. Read as an
``int`` in Python 3.x.
-.. [4] Depends on the selected options and whether it can be converted
+.. [5] Depends on the selected options and whether it can be converted
to UTF-16 without using doublets. If
``convert_numpy_str_to_utf16 == True`` (set implicitly when
``matlab_compatible == True``) and it can be converted to UTF-16
@@ -112,15 +113,16 @@ np.recarray 0.1 structured np.ndarray [8]_ Dataset or Group
or using UTF-16 doublets (MATLAB doesn't support them), then it
is written as ``np.uint16`` in UTF-16 encoding. Otherwise, it is
stored at ``np.uint32`` in UTF-32 encoding.
-.. [5] Depends on the selected options. If
+.. [6] Depends on the selected options. If
``convert_numpy_bytes_to_utf16 == True`` (set implicitly when
``matlab_compatible == True``), it will be stored as
- ``np.uint16`` in UTF-16 encoding. Otherwise, it is just written
- as ``np.bytes_``.
-.. [6] All keys must be ``str`` in Python 3 or ``unicode`` in Python 2.
-.. [7] ``np.float16`` are not supported for h5py versions before
+ ``np.uint16`` in UTF-16 encoding unless it contains non-ASCII
+ characters in which case a ``NotImplementedError`` is raised.
+ Otherwise, it is just written as ``np.bytes_``.
+.. [7] All keys must be ``str`` in Python 3 or ``unicode`` in Python 2.
+.. [8] ``np.float16`` are not supported for h5py versions before
``2.2``.
-.. [8] If it doesn't have any fields in its dtype or if
+.. [9] If it doesn't have any fields in its dtype or if
:py:attr:`Options.structured_numpy_ndarray_as_struct` is not set
and none of its fields are of dtype ``'object'``, it is not
converted and is written as is as a Dataset. Otherwise, it
@@ -158,9 +160,9 @@ int 'int' 'int64' 'int64'
long 'long' 'int64' 'int64'
float 'float' 'float64' 'double'
complex 'complex' 'complex128' 'double'
-str 'str' 'str#' [9]_ 'char' 2
-bytes 'bytes' 'bytes#' [9]_ 'char' 2
-bytearray 'bytearray' 'bytes#' [9]_ 'char' 2
+str 'str' 'str#' [10]_ 'char' 2
+bytes 'bytes' 'bytes#' [10]_ 'char' 2
+bytearray 'bytearray' 'bytes#' [10]_ 'char' 2
list 'list' 'object' 'cell'
tuple 'tuple' 'object' 'cell'
set 'set' 'object' 'cell'
@@ -168,7 +170,7 @@ frozenset 'frozenset' 'object' 'cell'
cl.deque 'collections.deque' 'object' 'cell'
dict 'dict' 'struct'
np.bool\_ 'numpy.bool' 'bool' 'logical' 1
-np.void 'numpy.void' 'void#' [9]_
+np.void 'numpy.void' 'void#' [10]_
np.uint8 'numpy.uint8' 'uint8' 'uint8'
np.uint16 'numpy.uint16' 'uint16' 'uint16'
np.uint32 'numpy.uint32' 'uint32' 'uint32'
@@ -182,23 +184,23 @@ np.float32 'numpy.float32' 'float32' 'single'
np.float64 'numpy.float64' 'float64' 'double'
np.complex64 'numpy.complex64' 'complex64' 'single'
np.complex128 'numpy.complex128' 'complex128' 'double'
-np.str\_ 'numpy.str\_' 'str#' [9]_ 'char' or 'uint32' 2 or 4 [10]_
-np.bytes\_ 'numpy.bytes\_' 'bytes#' [9]_ 'char' 2
+np.str\_ 'numpy.str\_' 'str#' [10]_ 'char' or 'uint32' 2 or 4 [11]_
+np.bytes\_ 'numpy.bytes\_' 'bytes#' [10]_ 'char' 2
np.object\_ 'numpy.object\_' 'object' 'cell'
-np.ndarray 'numpy.ndarray' [11]_ [11]_ [12]_
-np.matrix 'numpy.matrix' [11]_ [11]_
-np.chararray 'numpy.chararray' [11]_ 'char' [11]_
-np.recarray 'numpy.recarray' [11]_ [11]_ [12]_
+np.ndarray 'numpy.ndarray' [12]_ [12]_ [13]_
+np.matrix 'numpy.matrix' [12]_ [12]_
+np.chararray 'numpy.chararray' [12]_ 'char' [12]_
+np.recarray 'numpy.recarray' [12]_ [12]_ [13]_
============= =================== =========================== ================== =================
-.. [9] '#' is replaced by the number of bits taken up by the string, or
- each string in the case that it is an array of strings. This is 8
- and 32 bits per character for ``np.bytes_`` and ``np.str_``
- respectively.
-.. [10] ``2`` if it is stored as ``np.uint16`` or ``4`` if ``np.uint32``.
-.. [11] The value that would be put in for a scalar of the same dtype is
+.. [10] '#' is replaced by the number of bits taken up by the string, or
+ each string in the case that it is an array of strings. This is 8
+ and 32 bits per character for ``np.bytes_`` and ``np.str_``
+ respectively.
+.. [11] ``2`` if it is stored as ``np.uint16`` or ``4`` if ``np.uint32``.
+.. [12] The value that would be put in for a scalar of the same dtype is
used.
-.. [12] If it is structured (its dtype has fields),
+.. [13] If it is structured (its dtype has fields),
:py:attr:`Options.structured_numpy_ndarray_as_struct` is set,
and none of its fields are of dtype ``'object'``; it is set to
``'struct'`` overriding anything else.
@@ -322,6 +324,12 @@ and the fact that the interpreter in Python 2.x could be using 32-bits
``int``, it is possible that a value could be read that is too large
to fit into ``int``. When that happens, it is read as a ``long`` instead.
+.. warning::
+
+ Writing Python 2.x ``long`` and Python 3.x ``int`` too big to fit
+ into an ``np.int64`` is not supported. A ``NotImplementedError`` is
+ raised if attempted.
+
Complex Numbers
---------------
@@ -385,11 +393,6 @@ HDF5 COMPOUND type.
can't be read back from the file accurately. The dtype for all the
fields will become 'object' instead of what they originally were.
-.. warning::
-
- In Python 2, importing structured ``np.ndarray`` s will produce an
- error if any of their fields have characters outside of ASCII.
-
Optional Data Transformations
=============================
@@ -461,6 +464,12 @@ Whether all ``np.bytes_`` strings (or things converted to it) should be
converted to UTF-16 and written as an array of ``np.uint16`` or not. This
option is set to ``True`` implicitly by ``matlab_compatible``.
+.. warning::
+
+ Only ASCII characters are supported in ``np.bytes_`` when this
+ option is set. A ``NotImplementedError`` is raised if any non-ASCII
+ characters are present.
+
convert_numpy_str_to_utf16
--------------------------
@@ -518,8 +527,8 @@ type they are read as if there is no Python metadata attached to them.
MATLAB Class Version Python Type
=============== ======= =================================
logical 0.1 np.bool\_
-single 0.1 np.float32 or np.complex64 [13]_
-double 0.1 np.float64 or np.complex128 [13]_
+single 0.1 np.float32 or np.complex64 [14]_
+double 0.1 np.float64 or np.complex128 [14]_
uint8 0.1 np.uint8
uint16 0.1 np.uint16
uint32 0.1 np.uint32
@@ -534,4 +543,4 @@ cell 0.1 np.object\_
canonical empty 0.1 ``np.float64([])``
=============== ======= =================================
-.. [13] Depends on whether there is a complex part or not.
+.. [14] Depends on whether there is a complex part or not.
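
The two new failure modes documented in this commit (an integer too big for
``np.int64``, and non-ASCII ``np.bytes_`` when UTF-16 conversion is enabled)
can be illustrated with a minimal standalone sketch. This is not the
package's implementation: the helper names ``check_fits_int64`` and
``ascii_bytes_to_utf16_units`` are hypothetical and only mirror the
documented behaviour of raising ``NotImplementedError``.

```python
# Hypothetical helpers illustrating the documented checks; they are
# NOT part of hdf5storage's API.

INT64_MIN = -2**63
INT64_MAX = 2**63 - 1


def check_fits_int64(value):
    """Return value unchanged if it fits in a signed 64-bit integer.

    Mirrors the documented rule that ints too big for np.int64 are
    rejected with NotImplementedError.
    """
    if not INT64_MIN <= value <= INT64_MAX:
        raise NotImplementedError("int too big to fit into np.int64")
    return value


def ascii_bytes_to_utf16_units(data):
    """Return the UTF-16 code units for an ASCII-only byte string.

    For ASCII, each UTF-16 code unit equals the byte value, so the
    conversion is a widening of each byte. Non-ASCII input is rejected,
    as the documentation above describes.
    """
    values = list(bytearray(data))
    if any(v > 127 for v in values):
        raise NotImplementedError("non-ASCII characters in bytes")
    return values
```

Under these assumptions, ``check_fits_int64(2**63)`` and
``ascii_bytes_to_utf16_units(b"\xe9")`` would both raise
``NotImplementedError``, while in-range integers and ASCII bytes pass through.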