[python-hdf5storage] 52/84: Updated documentation for the 0.1.5 release.

Mon Feb 29 08:25:03 UTC 2016

This is an automated email from the git hooks/post-receive script.

ghisvail-guest pushed a commit to annotated tag 0.1.10
in repository python-hdf5storage.

commit 97364e7b44a5bd8baa213df3e8f9884a67fed8ef
Author: Freja Nordsiek <fnordsie at gmail.com>
Date:   Sun May 17 20:05:43 2015 -0400

    Updated documentation for the 0.1.5 release.
---
 README.rst                    | 98 +++++++++++++++++++++++-------------------
 doc/source/storage_format.rst | 99 +++++++++++++++++++++++--------------------
 2 files changed, 109 insertions(+), 88 deletions(-)
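
Note on the ``np.int64`` limit documented in this commit: the new
footnote [3] and the warning added to storage_format.rst say that a
Python 2.x ``long`` or Python 3.x ``int`` too big to fit into an
``np.int64`` is rejected with a ``NotImplementedError``. The range check
that implies can be sketched in plain Python as below. This is only an
illustration under that reading of the docs, not hdf5storage's actual
code; ``check_fits_int64`` and the two constants are hypothetical names.

```python
# Illustrative sketch only. check_fits_int64, INT64_MIN and INT64_MAX
# are hypothetical names and are not part of hdf5storage.
INT64_MIN = -(2 ** 63)
INT64_MAX = 2 ** 63 - 1


def check_fits_int64(value):
    """Return value unchanged if it fits in a signed 64-bit integer.

    Raises NotImplementedError otherwise, which is the exception type
    the updated documentation names for this case.
    """
    if not INT64_MIN <= value <= INT64_MAX:
        raise NotImplementedError(
            "integers too big to fit into an np.int64 are not supported")
    return value
```

For example, ``check_fits_int64(2 ** 63 - 1)`` returns its argument,
while ``check_fits_int64(2 ** 63)`` raises ``NotImplementedError``.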

diff --git a/README.rst b/README.rst
index 1fe98c7..dbb3491 100644
--- a/README.rst
+++ b/README.rst
@@ -43,15 +43,13 @@ Python 2
 
 This package was designed and written for Python 3, with Python 2.7 and
 2.6 support added later. This does mean that a few things are a little
-clunky in Python 2. Examples include supporting ``unicode`` keys for
-dictionaries, not being able to import a structured ``numpy.ndarray`` if
-any of its fields contain characters outside of ASCII, the ``int`` and
-``long`` types both being mapped to the Python 3 ``int`` type, etc. The
-storage format's metadata looks more familiar from a Python 3 standpoint
-as well.
-
-All documentation and examples are written in terms of Python 3 syntax
-and types. Important Python 2 information beyond direct translations of
+clunky in Python 2. Examples include requiring ``unicode`` keys for
+dictionaries, the ``int`` and ``long`` types both being mapped to the
+Python 3 ``int`` type, etc. The storage format's metadata looks more
+familiar from a Python 3 standpoint as well.
+
+The documentation is written in terms of Python 3 syntax and types
+primarily. Important Python 2 information beyond direct translations of
 syntax and types will be pointed out.
 
 Hierarchal Data Format 5 (HDF5)
@@ -108,19 +106,19 @@ Type             Version  Converted to                Class        Version
 ===============  =======  ==========================  ===========  ==============
 bool             0.1      np.bool\_ or np.uint8       logical      0.1 [1]_
 None             0.1      ``np.float64([])``          ``[]``       0.1
-int [2]_         0.1      np.int64 [2]_               int64        0.1
-long [3]_        0.1      np.int64                    int64        0.1
+int [2]_ [3]_    0.1      np.int64 [2]_               int64        0.1
+long [3]_ [4]_   0.1      np.int64                    int64        0.1
 float            0.1      np.float64                  double       0.1
 complex          0.1      np.complex128               double       0.1
-str              0.1      np.uint32/16                char         0.1 [4]_
-bytes            0.1      np.bytes\_ or np.uint16     char         0.1 [5]_
-bytearray        0.1      np.bytes\_ or np.uint16     char         0.1 [5]_
+str              0.1      np.uint32/16                char         0.1 [5]_
+bytes            0.1      np.bytes\_ or np.uint16     char         0.1 [6]_
+bytearray        0.1      np.bytes\_ or np.uint16     char         0.1 [6]_
 list             0.1      np.object\_                 cell         0.1
 tuple            0.1      np.object\_                 cell         0.1
 set              0.1      np.object\_                 cell         0.1
 frozenset        0.1      np.object\_                 cell         0.1
 cl.deque         0.1      np.object\_                 cell         0.1
-dict             0.1                                  struct       0.1 [6]_
+dict             0.1                                  struct       0.1 [7]_
 np.bool\_        0.1                                  logical      0.1
 np.void          0.1
 np.uint8         0.1                                  uint8        0.1
@@ -131,51 +129,53 @@ np.uint8         0.1                                  int8         0.1
 np.int16         0.1                                  int16        0.1
 np.int32         0.1                                  int32        0.1
 np.int64         0.1                                  int64        0.1
-np.float16 [7]_  0.1
+np.float16 [8]_  0.1
 np.float32       0.1                                  single       0.1
 np.float64       0.1                                  double       0.1
 np.complex64     0.1                                  single       0.1
 np.complex128    0.1                                  double       0.1
-np.str\_         0.1      np.uint32/16                char/uint32  0.1 [4]_
-np.bytes\_       0.1      np.bytes\_ or np.uint16     char         0.1 [5]_
+np.str\_         0.1      np.uint32/16                char/uint32  0.1 [5]_
+np.bytes\_       0.1      np.bytes\_ or np.uint16     char         0.1 [6]_
 np.object\_      0.1                                  cell         0.1
-np.ndarray       0.1      [8]_ [9]_                   [8]_ [9]_    0.1 [8]_ [10]_
-np.matrix        0.1      [8]_                        [8]_         0.1 [8]_
-np.chararray     0.1      [8]_                        [8]_         0.1 [8]_
-np.recarray      0.1      structured np.ndarray       [8]_ [9]_    0.1 [8]_
+np.ndarray       0.1      [9]_ [10]_                  [9]_ [10]_   0.1 [9]_ [11]_
+np.matrix        0.1      [9]_                        [9]_         0.1 [9]_
+np.chararray     0.1      [9]_                        [9]_         0.1 [9]_
+np.recarray      0.1      structured np.ndarray       [9]_ [10]_   0.1 [9]_
 ===============  =======  ==========================  ===========  ==============
 
 .. [1] Depends on the selected options. Always ``np.uint8`` when doing
        MATLAB compatiblity, or if the option is explicitly set.
 .. [2] In Python 2.x, it may be read back as a ``long`` if it can't fit
        in the size of an ``int``.
-.. [3] Type only found in Python 2.x. Python 2.x's ``long`` and ``int``
+.. [3] Must be small enough to fit into an ``np.int64``.
+.. [4] Type found only in Python 2.x. Python 2.x's ``long`` and ``int``
        are unified into a single ``int`` type in Python 3.x. Read as an
        ``int`` in Python 3.x.
-.. [4] Depends on the selected options and whether it can be converted
+.. [5] Depends on the selected options and whether it can be converted
        to UTF-16 without using doublets. If the option is explicity set
-       (or implicitly through doing MATLAB compatibility) and it can be
+       (or implicitly when doing MATLAB compatibility) and it can be
        converted to UTF-16 without losing any characters that can't be
        represented in UTF-16 or using UTF-16 doublets (MATLAB doesn't
        support them), then it is written as ``np.uint16`` in UTF-16
        encoding. Otherwise, it is stored at ``np.uint32`` in UTF-32
        encoding.
-.. [5] Depends on the selected options. If the option is explicitly set
-       (or implicitly through doing MATLAB compatibility), it will be
-       stored as ``np.uint16`` in UTF-16 encoding. Otherwise, it is just
-       written as ``np.bytes_``.
-.. [6] All keys must be ``str`` in Python 3 or ``unicode`` in Python 2.
-.. [7] ``np.float16`` are not supported for h5py versions before
+.. [6] Depends on the selected options. If the option is explicitly set
+       (or implicitly when doing MATLAB compatibility), it will be
+       stored as ``np.uint16`` in UTF-16 encoding (unless it has
+       non-ASCII characters, in which case a ``NotImplementedError`` is
+       raised). Otherwise, it is just written as ``np.bytes_``.
+.. [7] All keys must be ``str`` in Python 3 or ``unicode`` in Python 2.
+.. [8] ``np.float16`` are not supported for h5py versions before
        ``2.2``.
-.. [8] Container types are only supported if their underlying dtype is
+.. [9] Container types are only supported if their underlying dtype is
        supported. Data conversions are done based on its dtype.
-.. [9] Structured ``np.ndarray`` s (have fields in their dtypes) can be
-       written as an HDF5 COMPOUND type or as an HDF5 Group with Datasets
-       holding its fields (either the values directly, or as an HDF5
-       Reference array to the values for the different elements of the
-       data). Can only be written as an HDF5 COMPOUND type if none of
-       its field are of dtype ``'object'``.
-.. [10] Structured ``np.ndarray`` s with no elements, when written like a
+.. [10] Structured ``np.ndarray`` s (have fields in their dtypes) can be
+        written as an HDF5 COMPOUND type or as an HDF5 Group with
+        Datasets holding its fields (either the values directly, or as
+        an HDF5 Reference array to the values for the different elements
+        of the data). Can only be written as an HDF5 COMPOUND type if
+        none of its fields are of dtype ``'object'``.
+.. [11] Structured ``np.ndarray`` s with no elements, when written like a
         structure, will not be read back with the right dtypes for their
         fields (will all become 'object').
 
@@ -187,8 +187,8 @@ type they are read as.
 MATLAB Class     Version  Python Type
 ===============  =======  =================================
 logical          0.1      np.bool\_
-single           0.1      np.float32 or np.complex64 [11]_
-double           0.1      np.float64 or np.complex128 [11]_
+single           0.1      np.float32 or np.complex64 [12]_
+double           0.1      np.float64 or np.complex128 [12]_
 uint8            0.1      np.uint8
 uint16           0.1      np.uint16
 uint32           0.1      np.uint32
@@ -203,13 +203,25 @@ cell             0.1      np.object\_
 canonical empty  0.1      ``np.float64([])``
 ===============  =======  =================================
 
-.. [11] Depends on whether there is a complex part or not.
+.. [12] Depends on whether there is a complex part or not.
 
 
 Versions
 ========
 
-0.1.5. Bugfix release fixing a bug where an ``int`` could be stored that is too big to fit into an ``int`` when read back in Python 2.x. When it is too big, it is converted to a ``long``.
+0.1.5. Bugfix release fixing the following bugs.
+       * Fixed bug where an ``int`` could be stored that is too big to
+         fit into an ``int`` when read back in Python 2.x. When it is
+         too big, it is converted to a ``long``.
+       * Fixed a bug where an ``int`` or ``long`` that is too big to
+         fit into an ``np.int64`` raised the wrong exception.
+       * Fixed bug where field names for structured ``np.ndarray`` with
+         non-ASCII characters (assumed to be UTF-8 encoded in
+         Python 2.x) can't be read or written properly.
+       * Fixed bug where ``np.bytes_`` with non-ASCII characters
+         were converted incorrectly to UTF-16 when that option is set
+         (set implicitly when doing MATLAB compatibility). Now, it throws
+         a ``NotImplementedError``.
 
 0.1.4. Bugfix release fixing the following bugs. Thanks goes to `mrdomino <https://github.com/mrdomino>`_ for writing the bug fixes.
        * Fixed bug where ``dtype`` is used as a keyword parameter of
diff --git a/doc/source/storage_format.rst b/doc/source/storage_format.rst
index 08a873a..932a744 100644
--- a/doc/source/storage_format.rst
+++ b/doc/source/storage_format.rst
@@ -59,19 +59,19 @@ Type             Version  Converted to                          Group or Dataset
 ===============  =======  ====================================  =====================
 bool             0.1      np.bool\_ or np.uint8 [1]_            Dataset
 None             0.1      ``np.float64([])``                    Dataset
-int [2]_         0.1      np.int64 [2]_                         Dataset
-long [3]_        0.1      np.int64                              Dataset
+int [2]_ [3]_    0.1      np.int64 [2]_                         Dataset
+long [3]_ [4]_   0.1      np.int64                              Dataset
 float            0.1      np.float64                            Dataset
 complex          0.1      np.complex128                         Dataset
-str              0.1      np.uint32/16 [4]_                     Dataset
-bytes            0.1      np.bytes\_ or np.uint16 [5]_          Dataset
-bytearray        0.1      np.bytes\_ or np.uint16 [5]_          Dataset
+str              0.1      np.uint32/16 [5]_                     Dataset
+bytes            0.1      np.bytes\_ or np.uint16 [6]_          Dataset
+bytearray        0.1      np.bytes\_ or np.uint16 [6]_          Dataset
 list             0.1      np.object\_                           Dataset
 tuple            0.1      np.object\_                           Dataset
 set              0.1      np.object\_                           Dataset
 frozenset        0.1      np.object\_                           Dataset
 cl.deque         0.1      np.object\_                           Dataset
-dict [6]_        0.1                                            Group
+dict [7]_        0.1                                            Group
 np.bool\_        0.1      not or np.uint8 [1]_                  Dataset
 np.void          0.1                                            Dataset
 np.uint8         0.1                                            Dataset
@@ -82,18 +82,18 @@ np.uint8         0.1                                            Dataset
 np.int16         0.1                                            Dataset
 np.int32         0.1                                            Dataset
 np.int64         0.1                                            Dataset
-np.float16 [7]_  0.1                                            Dataset
+np.float16 [8]_  0.1                                            Dataset
 np.float32       0.1                                            Dataset
 np.float64       0.1                                            Dataset
 np.complex64     0.1                                            Dataset
 np.complex128    0.1                                            Dataset
-np.str\_         0.1      np.uint32/16 [4]_                     Dataset
-np.bytes\_       0.1      np.bytes\_ or np.uint16 [5]_          Dataset
+np.str\_         0.1      np.uint32/16 [5]_                     Dataset
+np.bytes\_       0.1      np.bytes\_ or np.uint16 [6]_          Dataset
 np.object\_      0.1                                            Dataset
-np.ndarray       0.1      not or Group of contents [8]_         Dataset or Group [8]_
+np.ndarray       0.1      not or Group of contents [9]_         Dataset or Group [9]_
 np.matrix        0.1      np.ndarray                            Dataset
-np.chararray     0.1      np.bytes\_ or np.uint16/32 [4]_ [5]_  Dataset
-np.recarray      0.1      structured np.ndarray [8]_            Dataset or Group [8]_
+np.chararray     0.1      np.bytes\_ or np.uint16/32 [5]_ [6]_  Dataset
+np.recarray      0.1      structured np.ndarray [9]_            Dataset or Group [9]_
 ===============  =======  ====================================  =====================
 
 .. [1] Depends on the selected options. Always ``np.uint8`` when
@@ -101,10 +101,11 @@ np.recarray      0.1      structured np.ndarray [8]_            Dataset or Group
        ``matlab_compatible == True``).
 .. [2] In Python 2.x, it may be read back as a ``long`` if it can't fit
        in the size of an ``int``.
-.. [3] Type only found in Python 2.x. Python 2.x's ``long`` and ``int``
+.. [3] Must be small enough to fit into an ``np.int64``.
+.. [4] Type only found in Python 2.x. Python 2.x's ``long`` and ``int``
        are unified into a single ``int`` type in Python 3.x. Read as an
        ``int`` in Python 3.x.
-.. [4] Depends on the selected options and whether it can be converted
+.. [5] Depends on the selected options and whether it can be converted
        to UTF-16 without using doublets. If
        ``convert_numpy_str_to_utf16 == True`` (set implicitly when
        ``matlab_compatible == True``) and it can be converted to UTF-16
@@ -112,15 +113,16 @@ np.recarray      0.1      structured np.ndarray [8]_            Dataset or Group
        or using UTF-16 doublets (MATLAB doesn't support them), then it
        is written as ``np.uint16`` in UTF-16 encoding. Otherwise, it is
        stored at ``np.uint32`` in UTF-32 encoding.
-.. [5] Depends on the selected options. If
+.. [6] Depends on the selected options. If
        ``convert_numpy_bytes_to_utf16 == True`` (set implicitly when
        ``matlab_compatible == True``), it will be stored as
-       ``np.uint16`` in UTF-16 encoding. Otherwise, it is just written
-       as ``np.bytes_``.
-.. [6] All keys must be ``str`` in Python 3 or ``unicode`` in Python 2.
-.. [7] ``np.float16`` are not supported for h5py versions before
+       ``np.uint16`` in UTF-16 encoding unless it contains non-ASCII
+       characters in which case a ``NotImplementedError`` is raised.
+       Otherwise, it is just written as ``np.bytes_``.
+.. [7] All keys must be ``str`` in Python 3 or ``unicode`` in Python 2.
+.. [8] ``np.float16`` are not supported for h5py versions before
        ``2.2``.
-.. [8] If it doesn't have any fields in its dtype or if
+.. [9] If it doesn't have any fields in its dtype or if
        :py:attr:`Options.structured_numpy_ndarray_as_struct` is not set
        and none of its fields are of dtype ``'object'``, it is not
        converted and is written as is as a Dataset. Otherwise, it
@@ -158,9 +160,9 @@ int            'int'                'int64'                      'int64'
 long           'long'               'int64'                      'int64'
 float          'float'              'float64'                    'double'
 complex        'complex'            'complex128'                 'double'
-str            'str'                'str#' [9]_                  'char'              2
-bytes          'bytes'              'bytes#' [9]_                'char'              2
-bytearray      'bytearray'          'bytes#' [9]_                'char'              2
+str            'str'                'str#' [10]_                 'char'              2
+bytes          'bytes'              'bytes#' [10]_               'char'              2
+bytearray      'bytearray'          'bytes#' [10]_               'char'              2
 list           'list'               'object'                     'cell'
 tuple          'tuple'              'object'                     'cell'
 set            'set'                'object'                     'cell'
@@ -168,7 +170,7 @@ frozenset      'frozenset'          'object'                     'cell'
 cl.deque       'collections.deque'  'object'                     'cell'
 dict           'dict'                                            'struct'
 np.bool\_      'numpy.bool'         'bool'                       'logical'           1
-np.void        'numpy.void'         'void#' [9]_
+np.void        'numpy.void'         'void#' [10]_
 np.uint8       'numpy.uint8'        'uint8'                      'uint8'
 np.uint16      'numpy.uint16'       'uint16'                     'uint16'
 np.uint32      'numpy.uint32'       'uint32'                     'uint32'
@@ -182,23 +184,23 @@ np.float32     'numpy.float32'      'float32'                    'single'
 np.float64     'numpy.float64'      'float64'                    'double'
 np.complex64   'numpy.complex64'    'complex64'                  'single'
 np.complex128  'numpy.complex128'   'complex128'                 'double'
-np.str\_       'numpy.str\_'        'str#' [9]_                  'char' or 'uint32'  2 or 4 [10]_
-np.bytes\_     'numpy.bytes\_'      'bytes#' [9]_                'char'              2
+np.str\_       'numpy.str\_'        'str#' [10]_                 'char' or 'uint32'  2 or 4 [11]_
+np.bytes\_     'numpy.bytes\_'      'bytes#' [10]_               'char'              2
 np.object\_    'numpy.object\_'     'object'                     'cell'
-np.ndarray     'numpy.ndarray'      [11]_                        [11]_ [12]_
-np.matrix      'numpy.matrix'       [11]_                        [11]_
-np.chararray   'numpy.chararray'    [11]_                        'char' [11]_
-np.recarray    'numpy.recarray'     [11]_                        [11]_ [12]_
+np.ndarray     'numpy.ndarray'      [12]_                        [12]_ [13]_
+np.matrix      'numpy.matrix'       [12]_                        [12]_
+np.chararray   'numpy.chararray'    [12]_                        'char' [12]_
+np.recarray    'numpy.recarray'     [12]_                        [12]_ [13]_
 =============  ===================  ===========================  ==================  =================
 
-.. [9] '#' is replaced by the number of bits taken up by the string, or
-       each string in the case that it is an array of strings. This is 8
-       and 32 bits per character for ``np.bytes_`` and ``np.str_``
-       respectively.
-.. [10] ``2`` if it is stored as ``np.uint16`` or ``4`` if ``np.uint32``.
-.. [11] The value that would be put in for a scalar of the same dtype is
+.. [10] '#' is replaced by the number of bits taken up by the string, or
+        each string in the case that it is an array of strings. This is 8
+        and 32 bits per character for ``np.bytes_`` and ``np.str_``
+        respectively.
+.. [11] ``2`` if it is stored as ``np.uint16`` or ``4`` if ``np.uint32``.
+.. [12] The value that would be put in for a scalar of the same dtype is
        used.
-.. [12] If it is structured (its dtype has fields),
+.. [13] If it is structured (its dtype has fields),
         :py:attr:`Options.structured_numpy_ndarray_as_struct` is set,
         and none of its fields are of dtype ``'object'``; it is set to
         ``'struct'`` overriding anything else.
@@ -322,6 +324,12 @@ and the fact that the interpreter in Python 2.x could be using 32-bits
 ``int``, it is possible that a value could be read that is too large
 to fit into ``int``. When that happens, it read as a ``long`` instead.
 
+.. warning::
+
+   Writing Python 2.x ``long`` and Python 3.x ``int`` too big to fit
+   into an ``np.int64`` is not supported. A ``NotImplementedError`` is
+   raised if attempted.
+
 
 Complex Numbers
 ---------------
@@ -385,11 +393,6 @@ HDF5 COMPOUND type.
    can't be read back from the file accurately. The dtype for all the
    fields will become 'object' instead of what they originally were.
 
-.. warning::
-
-   In Python 2, importing structured ``np.ndarray`` s will produce an
-   error if any of their fields have characters outside of ASCII.
-
 
 Optional Data Transformations
 =============================
@@ -461,6 +464,12 @@ Whether all ``np.bytes_`` strings (or things converted to it) should be
 converted to UTF-16 and written as an array of ``np.uint16`` or not. This
 option is set to ``True`` implicitly by ``matlab_compatible``.
 
+.. warning::
+
+   Only ASCII characters are supported in ``np.bytes_`` when this
+   option is set. A ``NotImplementedError`` is raised if any non-ASCII
+   characters are present.
+
 convert_numpy_str_to_utf16
 --------------------------
 
@@ -518,8 +527,8 @@ type they are read as if there is no Python metadata attached to them.
 MATLAB Class     Version  Python Type
 ===============  =======  =================================
 logical          0.1      np.bool\_
-single           0.1      np.float32 or np.complex64 [13]_
-double           0.1      np.float64 or np.complex128 [13]_
+single           0.1      np.float32 or np.complex64 [14]_
+double           0.1      np.float64 or np.complex128 [14]_
 uint8            0.1      np.uint8
 uint16           0.1      np.uint16
 uint32           0.1      np.uint32
@@ -534,4 +543,4 @@ cell             0.1      np.object\_
 canonical empty  0.1      ``np.float64([])``
 ===============  =======  =================================
 
-.. [13] Depends on whether there is a complex part or not.
+.. [14] Depends on whether there is a complex part or not.
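
Note on the string footnotes ([5] and [6] in README.rst): they describe
two decisions. A ``str`` is written as ``np.uint16`` (UTF-16) only if
the conversion option is set and no character needs a UTF-16 doublet,
otherwise as ``np.uint32`` (UTF-32). ``np.bytes_`` is converted to
UTF-16 only when it is pure ASCII, otherwise a ``NotImplementedError``
is raised. The sketch below restates those two rules in plain Python. It
is an assumption-laden illustration rather than the library's
implementation: both function names are hypothetical, and lone
surrogate code points in the input are not treated specially.

```python
# Illustrative sketch only. Both helpers are hypothetical and are not
# part of hdf5storage.

def choose_str_storage(text, convert_to_utf16):
    """Pick the integer width a str would be stored with.

    Returns "uint16" when conversion is requested and every character
    lies in the Basic Multilingual Plane (so no UTF-16 doublet is
    needed). Returns "uint32" in every other case.
    """
    if convert_to_utf16 and all(ord(ch) <= 0xFFFF for ch in text):
        return "uint16"
    return "uint32"


def bytes_to_utf16_units(data):
    """Convert ASCII-only bytes to a list of UTF-16 code units.

    Each ASCII byte maps to the UTF-16 code unit of the same value.
    Any byte above 0x7F raises NotImplementedError, the exception type
    named in the documentation.
    """
    units = list(bytearray(data))  # ints on both Python 2 and 3
    if any(unit > 0x7F for unit in units):
        raise NotImplementedError(
            "non-ASCII bytes cannot be converted to UTF-16")
    return units
```

With these helpers, ``choose_str_storage("abc", True)`` selects
``"uint16"``, a string containing a character above U+FFFF selects
``"uint32"``, and ``bytes_to_utf16_units(b"\xe9")`` raises
``NotImplementedError``.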

-- 
Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/debian-science/packages/python-hdf5storage.git