[python-hdf5storage] 16/84: Updated documentation to reflect code changes and new version.

Ghislain Vaillant ghisvail-guest at moszumanska.debian.org
Mon Feb 29 08:24:58 UTC 2016


This is an automated email from the git hooks/post-receive script.

ghisvail-guest pushed a commit to annotated tag 0.1.10
in repository python-hdf5storage.

commit a15966cc0880ad7c58958e564547b32d246e9054
Author: Freja Nordsiek <fnordsie at gmail.com>
Date:   Thu Aug 14 18:24:45 2014 -0400

    Updated documentation to reflect code changes and new version.
---
 README.rst                             | 124 ++++++++++++--------
 doc/source/development.rst             |  33 ++++--
 doc/source/hdf5storage.Marshallers.rst |   4 +-
 doc/source/introduction.rst            |  38 ++++++-
 doc/source/storage_format.rst          | 201 +++++++++++++++++++--------------
 5 files changed, 256 insertions(+), 144 deletions(-)

diff --git a/README.rst b/README.rst
index 5095f47..df86265 100644
--- a/README.rst
+++ b/README.rst
@@ -96,48 +96,48 @@ will be what it is read back as) the MATLAB class it becomes if
 targetting a MAT file, and the first version of this package to
 support writing it so MATlAB can read it.
 
-=============  =======  ==========================  ===========  =============
-Python                                              MATLAB
---------------------------------------------------  --------------------------
-Type           Version  Converted to                Class        Version
-=============  =======  ==========================  ===========  =============
-bool           0.1      np.bool\_ or np.uint8       logical      0.1 [1]_
-None           0.1      ``np.float64([])``          ``[]``       0.1
-int            0.1      np.int64                    int64        0.1
-float          0.1      np.float64                  double       0.1
-complex        0.1      np.complex128               double       0.1
-str            0.1      np.uint32/16                char         0.1 [2]_
-bytes          0.1      np.bytes\_ or np.uint16     char         0.1 [3]_
-bytearray      0.1      np.bytes\_ or np.uint16     char         0.1 [3]_
-list           0.1      np.object\_                 cell         0.1
-tuple          0.1      np.object\_                 cell         0.1
-set            0.1      np.object\_                 cell         0.1
-frozenset      0.1      np.object\_                 cell         0.1
-cl.deque       0.1      np.object\_                 cell         0.1
-dict           0.1                                  struct       0.1 [4]_
-np.bool\_      0.1                                  logical      0.1
-np.void        0.1
-np.uint8       0.1                                  uint8        0.1
-np.uint16      0.1                                  uint16       0.1
-np.uint32      0.1                                  uint32       0.1
-np.uint64      0.1                                  uint64       0.1
-np.uint8       0.1                                  int8         0.1
-np.int16       0.1                                  int16        0.1
-np.int32       0.1                                  int32        0.1
-np.int64       0.1                                  int64        0.1
-np.float16     0.1
-np.float32     0.1                                  single       0.1
-np.float64     0.1                                  double       0.1
-np.complex64   0.1                                  single       0.1
-np.complex128  0.1                                  double       0.1
-np.str\_       0.1      np.uint32/16                char/uint32  0.1 [2]_
-np.bytes\_     0.1      np.bytes\_ or np.uint16     char         0.1 [3]_
-np.object\_    0.1                                  cell         0.1
-np.ndarray     0.1      [5]_ [6]_                   [5]_ [6]_    0.1 [5]_ [7]_
-np.matrix      0.1      [5]_                        [5]_         0.1 [5]_
-np.chararray   0.1      [5]_                        [5]_         0.1 [5]_
-np.recarray    0.1      structured np.ndarray       [5]_ [6]_    0.1 [5]_
-=============  =======  ==========================  ===========  =============
+===============  =======  ==========================  ===========  =============
+Python                                                MATLAB
+----------------------------------------------------  --------------------------
+Type             Version  Converted to                Class        Version
+===============  =======  ==========================  ===========  =============
+bool             0.1      np.bool\_ or np.uint8       logical      0.1 [1]_
+None             0.1      ``np.float64([])``          ``[]``       0.1
+int              0.1      np.int64                    int64        0.1
+float            0.1      np.float64                  double       0.1
+complex          0.1      np.complex128               double       0.1
+str              0.1      np.uint32/16                char         0.1 [2]_
+bytes            0.1      np.bytes\_ or np.uint16     char         0.1 [3]_
+bytearray        0.1      np.bytes\_ or np.uint16     char         0.1 [3]_
+list             0.1      np.object\_                 cell         0.1
+tuple            0.1      np.object\_                 cell         0.1
+set              0.1      np.object\_                 cell         0.1
+frozenset        0.1      np.object\_                 cell         0.1
+cl.deque         0.1      np.object\_                 cell         0.1
+dict             0.1                                  struct       0.1 [4]_
+np.bool\_        0.1                                  logical      0.1
+np.void          0.1
+np.uint8         0.1                                  uint8        0.1
+np.uint16        0.1                                  uint16       0.1
+np.uint32        0.1                                  uint32       0.1
+np.uint64        0.1                                  uint64       0.1
+np.int8          0.1                                  int8         0.1
+np.int16         0.1                                  int16        0.1
+np.int32         0.1                                  int32        0.1
+np.int64         0.1                                  int64        0.1
+np.float16 [5]_  0.1
+np.float32       0.1                                  single       0.1
+np.float64       0.1                                  double       0.1
+np.complex64     0.1                                  single       0.1
+np.complex128    0.1                                  double       0.1
+np.str\_         0.1      np.uint32/16                char/uint32  0.1 [2]_
+np.bytes\_       0.1      np.bytes\_ or np.uint16     char         0.1 [3]_
+np.object\_      0.1                                  cell         0.1
+np.ndarray       0.1      [6]_ [7]_                   [6]_ [7]_    0.1 [6]_ [8]_
+np.matrix        0.1      [6]_                        [6]_         0.1 [6]_
+np.chararray     0.1      [6]_                        [6]_         0.1 [6]_
+np.recarray      0.1      structured np.ndarray       [6]_ [7]_    0.1 [6]_
+===============  =======  ==========================  ===========  =============
 
 .. [1] Depends on the selected options. Always ``np.uint8`` when doing
        MATLAB compatiblity, or if the option is explicitly set.
@@ -154,14 +154,17 @@ np.recarray    0.1      structured np.ndarray       [5]_ [6]_    0.1 [5]_
        stored as ``np.uint16`` in UTF-16 encoding. Otherwise, it is just
        written as ``np.bytes_``.
 .. [4] All keys must be ``str`` in Python 3 or ``unicode`` in Python 2.
-.. [5] Container types are only supported if their underlying dtype is
+.. [5] ``np.float16`` is not supported for h5py versions before
+       ``2.2``.
+.. [6] Container types are only supported if their underlying dtype is
        supported. Data conversions are done based on its dtype.
-.. [6] Structured ``np.ndarray`` s (have fields in their dtypes) can be
+.. [7] Structured ``np.ndarray`` s (have fields in their dtypes) can be
        written as an HDF5 COMPOUND type or as an HDF5 Group with Datasets
        holding its fields (either the values directly, or as an HDF5
        Reference array to the values for the different elements of the
-       data).
-.. [7] Structured ``np.ndarray`` s with no elements, when written like a
+       data). Can only be written as an HDF5 COMPOUND type if none of
+       its fields are of dtype ``'object'``.
+.. [8] Structured ``np.ndarray`` s with no elements, when written like a
        structure, will not be read back with the right dtypes for their
        fields (will all become 'object').
 
@@ -173,8 +176,8 @@ type they are read as.
 MATLAB Class     Version  Python Type
 ===============  =======  =================================
 logical          0.1      np.bool\_
-single           0.1      np.float32 or np.complex64 [8]_
-double           0.1      np.float64 or np.complex128 [8]_
+single           0.1      np.float32 or np.complex64 [9]_
+double           0.1      np.float64 or np.complex128 [9]_
 uint8            0.1      np.uint8
 uint16           0.1      np.uint16
 uint32           0.1      np.uint32
@@ -189,12 +192,35 @@ cell             0.1      np.object\_
 canonical empty  0.1      ``np.float64([])``
 ===============  =======  =================================
 
-.. [8] Depends on whether there is a complex part or not.
+.. [9] Depends on whether there is a complex part or not.
 
 
 Versions
 ========
 
+0.1.2. Bugfix release fixing the following bugs.
+       * Removed mistaken support for ``np.float16`` for h5py versions
+         before ``2.2`` since that was when support for it was
+         introduced.
+       * Structured ``np.ndarray`` where one or more fields is of the
+         ``'object'`` dtype can now be written without an error when
+         the ``structured_numpy_ndarray_as_struct`` option is not set.
+         They are written as an HDF5 Group, as if the option was set.
+       * Support for the ``'MATLAB_fields'`` Attribute for data types
+         that are structures in MATLAB has been added for when the
+         version of the h5py package being used is ``2.3`` or greater.
+         Support is still missing for earlier versions (this package
+         requires a minimum version of ``2.1``).
+       * The check for non-unicode string keys (``str`` in Python 3 and
+         ``unicode`` in Python 2) in the type ``dict`` is done right
+         before any changes are made to the HDF5 file instead of in the
+         middle so that no changes are applied if an invalid key is
+         present.
+       * HDF5 userblock set with the proper metadata for MATLAB support
+         right at the beginning of when data is being written to an HDF5
+         file instead of at the end, meaning the writing can crash and
+         the file will still be a valid MATLAB file.
+
 0.1.1. Bugfix release fixing the following bugs.
        * ``str`` is now written like ``numpy.str_`` instead of
          ``numpy.bytes_``.
diff --git a/doc/source/development.rst b/doc/source/development.rst
index 0673445..50d7bc2 100644
--- a/doc/source/development.rst
+++ b/doc/source/development.rst
@@ -77,12 +77,13 @@ Standing Bugs
   :py:attr:`Options.structured_numpy_ndarray_as_struct` is set, are not
   written in a way that the dtypes for the fields can be restored when
   it is read back from file.
-* The Attribute 'MATLAB_fields' is not currently set when writing
+* The Attribute 'MATLAB_fields' is supported for h5py version ``2.3``
+  and newer. But for older versions, it is not currently set when writing
   data that should be imported into MATLAB as structures, and is ignored
   when reading data from file. This is because the h5py package cannot
-  work with its format. If a structure with fields 'a' and 'cd' are
-  saved, the Attribute looks like the following when using the
-  ``h5dump`` utility::
+  work with its format in older versions. If a structure with fields 'a'
+  and 'cd' is saved, the Attribute looks like the following when using
+  the ``h5dump`` utility::
 
     ATTRIBUTE "MATLAB_fields" {
        DATATYPE  H5T_VLEN { H5T_STRING {
@@ -96,10 +97,28 @@ Standing Bugs
        (0): ("a"), ("c", "d")
        }
     }
-
+  
+  In h5py version ``2.3``, the Attribute is an array of variable length
+  arrays of single character ASCII numpy strings (vlen of ``'S1'``). It
+  is created like so::
+  
+    fields = ['a', 'cd']
+    dt = h5py.special_dtype(vlen=np.dtype('S1'))
+    fs = np.empty(shape=(len(fields),), dtype=dt)
+    for i, s in enumerate(fields):
+        fs[i] = np.array([c.encode('ascii') for c in s],
+                         dtype='S1')
+  
+  Then ``fs`` looks like::
+  
+    array([array([b'a'], dtype='|S1'),
+           array([b'c', b'd'], dtype='|S1')], dtype=object)
+  
   MATLAB doesn't strictly require this field, but supporting it will
-  help with reading/writing empty MATLAB structs. Would probably require
-  writiing a custom Cython or C function to fix this.
+  help with reading/writing empty MATLAB structs and not losing the
+  fields. Adding support for older versions of h5py would probably
+  require writing a custom Cython or C function, or porting some h5py
+  code.
 
 Features to Add
 ---------------
diff --git a/doc/source/hdf5storage.Marshallers.rst b/doc/source/hdf5storage.Marshallers.rst
index f51cb63..0b8248a 100644
--- a/doc/source/hdf5storage.Marshallers.rst
+++ b/doc/source/hdf5storage.Marshallers.rst
@@ -68,7 +68,7 @@ NumpyScalarArrayMarshaller
 
    .. autoinstanceattribute:: NumpyScalarArrayMarshaller.matlab_attributes
       :annotation: = {'H5PATH', 'MATLAB_class', 'MATLAB_empty',
-		      'MATLAB_int_decode'}
+		      'MATLAB_int_decode', 'MATLAB_fields'}
 
    .. autoinstanceattribute:: NumpyScalarArrayMarshaller.types
       :annotation: = [np.ndarray, np.matrix,
@@ -186,7 +186,7 @@ PythonDictMarshaller
       :annotation: = {'Python.Type', 'Python.Fields'}
 
    .. autoinstanceattribute:: PythonDictMarshaller.matlab_attributes
-      :annotation: = {'H5PATH', 'MATLAB_class'}
+      :annotation: = {'H5PATH', 'MATLAB_class', 'MATLAB_fields'}
 
    .. autoinstanceattribute:: PythonDictMarshaller.types
       :annotation: = [dict]
diff --git a/doc/source/introduction.rst b/doc/source/introduction.rst
index e4dacbe..1b6c4c9 100644
--- a/doc/source/introduction.rst
+++ b/doc/source/introduction.rst
@@ -175,7 +175,13 @@ Making The Data
 ---------------
 
 Make a ``dict`` containing many different types in it that we want to
-store to disk in an HDF5 file.
+store to disk in an HDF5 file. The initialization method depends on
+the Python version.
+
+Python 3
+^^^^^^^^
+
+The ``dict`` keys must be ``str`` (the unicode string type).
 
     >>> import numpy as np
     >>> import hdf5storage
@@ -198,6 +204,36 @@ store to disk in an HDF5 file.
     ...           'hh': np.bytes_(b'how many?'),
     ...           'ii': np.object_(['text', np.int8([1, -3, 0])])}}
 
+Python 2
+^^^^^^^^
+
+The same thing but in Python 2 where the ``dict`` keys must be
+``unicode``. The other datatypes are translated from the Python 3
+example appropriately. The rest of the examples on this page are run
+identically in Python 2 and 3, but the outputs are listed as
+returned in Python 3.
+
+    >>> import numpy as np
+    >>> import hdf5storage
+    >>> a = {u'a': True,
+    ...      u'b': None,
+    ...      u'c': 2,
+    ...      u'd': -3.2,
+    ...      u'e': (1-2.3j),
+    ...      u'f': u'hello',
+    ...      u'g': 'goodbye',
+    ...      u'h': [u'list', u'of', u'stuff', [30, 2.3]],
+    ...      u'i': np.zeros(shape=(2,), dtype=[('bi', 'uint8')]),
+    ...      u'j':{u'aa': np.bool_(False),
+    ...            u'bb': np.uint8(4),
+    ...            u'cc': np.uint32([70, 8]),
+    ...            u'dd': np.int32([]),
+    ...            u'ee': np.float32([[3.3], [5.3e3]]),
+    ...            u'ff': np.complex128([[3.4, 3], [9+2j, 0]]),
+    ...            u'gg': np.array([u'one', u'two', u'three'], dtype='unicode'),
+    ...            u'hh': np.str_('how many?'),
+    ...            u'ii': np.object_([u'text', np.int8([1, -3, 0])])}}
+
 Using No Metadata
 -----------------
 
diff --git a/doc/source/storage_format.rst b/doc/source/storage_format.rst
index 451362e..a32110c 100644
--- a/doc/source/storage_format.rst
+++ b/doc/source/storage_format.rst
@@ -47,46 +47,46 @@ stored (Group or Dataset), what type/s it is converted to (no conversion
 if none are listed), as well as the first version of this package to
 support the datatype.
 
-=============  =======  ====================================  =====================
-Type           Version  Converted to                          Group or Dataset
-=============  =======  ====================================  =====================
-bool           0.1      np.bool\_ or np.uint8 [1]_            Dataset
-None           0.1      ``np.float64([])``                    Dataset
-int            0.1      np.int64                              Dataset
-float          0.1      np.float64                            Dataset
-complex        0.1      np.complex128                         Dataset
-str            0.1      np.uint32/16 [2]_                     Dataset
-bytes          0.1      np.bytes\_ or np.uint16 [3]_          Dataset
-bytearray      0.1      np.bytes\_ or np.uint16 [3]_          Dataset
-list           0.1      np.object\_                           Dataset
-tuple          0.1      np.object\_                           Dataset
-set            0.1      np.object\_                           Dataset
-frozenset      0.1      np.object\_                           Dataset
-cl.deque       0.1      np.object\_                           Dataset
-dict [4]_      0.1                                            Group
-np.bool\_      0.1      not or np.uint8 [1]_                  Dataset
-np.void        0.1                                            Dataset
-np.uint8       0.1                                            Dataset
-np.uint16      0.1                                            Dataset
-np.uint32      0.1                                            Dataset
-np.uint64      0.1                                            Dataset
-np.uint8       0.1                                            Dataset
-np.int16       0.1                                            Dataset
-np.int32       0.1                                            Dataset
-np.int64       0.1                                            Dataset
-np.float16     0.1                                            Dataset
-np.float32     0.1                                            Dataset
-np.float64     0.1                                            Dataset
-np.complex64   0.1                                            Dataset
-np.complex128  0.1                                            Dataset
-np.str\_       0.1      np.uint32/16 [2]_                     Dataset
-np.bytes\_     0.1      np.bytes\_ or np.uint16 [3]_          Dataset
-np.object\_    0.1                                            Dataset
-np.ndarray     0.1      not or Group of contents [5]_         Dataset or Group [5]_
-np.matrix      0.1      np.ndarray                            Dataset
-np.chararray   0.1      np.bytes\_ or np.uint16/32 [2]_ [3]_  Dataset
-np.recarray    0.1      structued np.ndarray [5]_             Dataset or Group [5]_
-=============  =======  ====================================  =====================
+===============  =======  ====================================  =====================
+Type             Version  Converted to                          Group or Dataset
+===============  =======  ====================================  =====================
+bool             0.1      np.bool\_ or np.uint8 [1]_            Dataset
+None             0.1      ``np.float64([])``                    Dataset
+int              0.1      np.int64                              Dataset
+float            0.1      np.float64                            Dataset
+complex          0.1      np.complex128                         Dataset
+str              0.1      np.uint32/16 [2]_                     Dataset
+bytes            0.1      np.bytes\_ or np.uint16 [3]_          Dataset
+bytearray        0.1      np.bytes\_ or np.uint16 [3]_          Dataset
+list             0.1      np.object\_                           Dataset
+tuple            0.1      np.object\_                           Dataset
+set              0.1      np.object\_                           Dataset
+frozenset        0.1      np.object\_                           Dataset
+cl.deque         0.1      np.object\_                           Dataset
+dict [4]_        0.1                                            Group
+np.bool\_        0.1      not or np.uint8 [1]_                  Dataset
+np.void          0.1                                            Dataset
+np.uint8         0.1                                            Dataset
+np.uint16        0.1                                            Dataset
+np.uint32        0.1                                            Dataset
+np.uint64        0.1                                            Dataset
+np.int8          0.1                                            Dataset
+np.int16         0.1                                            Dataset
+np.int32         0.1                                            Dataset
+np.int64         0.1                                            Dataset
+np.float16 [5]_  0.1                                            Dataset
+np.float32       0.1                                            Dataset
+np.float64       0.1                                            Dataset
+np.complex64     0.1                                            Dataset
+np.complex128    0.1                                            Dataset
+np.str\_         0.1      np.uint32/16 [2]_                     Dataset
+np.bytes\_       0.1      np.bytes\_ or np.uint16 [3]_          Dataset
+np.object\_      0.1                                            Dataset
+np.ndarray       0.1      not or Group of contents [6]_         Dataset or Group [6]_
+np.matrix        0.1      np.ndarray                            Dataset
+np.chararray     0.1      np.bytes\_ or np.uint16/32 [2]_ [3]_  Dataset
+np.recarray      0.1      structured np.ndarray [6]_            Dataset or Group [6]_
+===============  =======  ====================================  =====================
 
 .. [1] Depends on the selected options. Always ``np.uint8`` when
        ``convert_bools_to_uint8 == True`` (set implicitly when
@@ -105,9 +105,12 @@ np.recarray    0.1      structued np.ndarray [5]_             Dataset or Group [
        ``np.uint16`` in UTF-16 encoding. Otherwise, it is just written
        as ``np.bytes_``.
 .. [4] All keys must be ``str`` in Python 3 or ``unicode`` in Python 2.
-.. [5] If it doesn't have any fields in its dtype or if
-       :py:attr:`Options.structured_numpy_ndarray_as_struct` is not set, it
-       is not converted and is written as is as a Dataset. Otherwise, it
+.. [5] ``np.float16`` is not supported for h5py versions before
+       ``2.2``.
+.. [6] If it doesn't have any fields in its dtype or if
+       :py:attr:`Options.structured_numpy_ndarray_as_struct` is not set
+       and none of its fields are of dtype ``'object'``, it is not
+       converted and is written as is as a Dataset. Otherwise, it
        is written as a Group with its the contents of its individual
        fields written as Datasets within the Group having the fields as
        names.
@@ -125,7 +128,7 @@ Attributes are used. The table below lists the Attributes that have
 definite values depending only on the particular Python datatype being
 stored. Then, the other attributes are detailed individually.
 
-.. note
+.. note::
 
    'Python.Type', 'Python.numpy.UnderlyingType', and 'MATLAB_class' are
    all ``np.bytes_``. 'MATLAB_int_decode' is a ``np.int64``.
@@ -141,9 +144,9 @@ None           'builtins.NoneType'  'float64'                    'double'
 int            'int'                'int64'                      'int64'
 float          'float'              'float64'                    'double'
 complex        'complex'            'complex128'                 'double'
-str            'str'                'str#' [6]_                  'char'              2
-bytes          'bytes'              'bytes#' [6]_                'char'              2
-bytearray      'bytearray'          'bytes#' [6]_                'char'              2
+str            'str'                'str#' [7]_                  'char'              2
+bytes          'bytes'              'bytes#' [7]_                'char'              2
+bytearray      'bytearray'          'bytes#' [7]_                'char'              2
 list           'list'               'object'                     'cell'
 tuple          'tuple'              'object'                     'cell'
 set            'set'                'object'                     'cell'
@@ -151,7 +154,7 @@ frozenset      'frozenset'          'object'                     'cell'
 cl.deque       'collections.deque'  'object'                     'cell'
 dict           'dict'                                            'struct'
 np.bool\_      'numpy.bool'         'bool'                       'logical'           1
-np.void        'numpy.void'         'void#' [6]_
+np.void        'numpy.void'         'void#' [7]_
 np.uint8       'numpy.uint8'        'uint8'                      'uint8'
 np.uint16      'numpy.uint16'       'uint16'                     'uint16'
 np.uint32      'numpy.uint32'       'uint32'                     'uint32'
@@ -165,25 +168,26 @@ np.float32     'numpy.float32'      'float32'                    'single'
 np.float64     'numpy.float64'      'float64'                    'double'
 np.complex64   'numpy.complex64'    'complex64'                  'single'
 np.complex128  'numpy.complex128'   'complex128'                 'double'
-np.str\_       'numpy.str\_'        'str#' [6]_                  'char' or 'uint32'  2 or 4 [6]_
-np.bytes\_     'numpy.bytes\_'      'bytes#' [6]_                'char'              2
+np.str\_       'numpy.str\_'        'str#' [7]_                  'char' or 'uint32'  2 or 4 [8]_
+np.bytes\_     'numpy.bytes\_'      'bytes#' [7]_                'char'              2
 np.object\_    'numpy.object\_'     'object'                     'cell'
-np.ndarray     'numpy.ndarray'      [8]_                         [8]_ [9]_
-np.matrix      'numpy.matrix'       [8]_                         [8]_
-np.chararray   'numpy.chararray'    [8]_                         'char' [8]_
-np.recarray    'numpy.recarray'     [8]_                         [8]_ [9]_
+np.ndarray     'numpy.ndarray'      [9]_                         [9]_ [10]_
+np.matrix      'numpy.matrix'       [9]_                         [9]_
+np.chararray   'numpy.chararray'    [9]_                         'char' [9]_
+np.recarray    'numpy.recarray'     [9]_                         [9]_ [10]_
 =============  ===================  ===========================  ==================  =================
 
-.. [6] '#' is replaced by the number of bits taken up by the string, or
+.. [7] '#' is replaced by the number of bits taken up by the string, or
        each string in the case that it is an array of strings. This is 8
        and 32 bits per character for ``np.bytes_`` and ``np.str_``
        respectively.
-.. [7] ``2`` if it is stored as ``np.uint16`` or ``4`` if ``np.uint32``.
-.. [8] The value that would be put in for a scalar of the same dtype is
+.. [8] ``2`` if it is stored as ``np.uint16`` or ``4`` if ``np.uint32``.
+.. [9] The value that would be put in for a scalar of the same dtype is
        used.
-.. [9] If it is structured (its dtype has fields) and
-       :py:attr:`Options.structured_numpy_ndarray_as_struct` is set, it is
-       set to 'struct' overriding anything else.
+.. [10] If it is structured (its dtype has fields),
+        :py:attr:`Options.structured_numpy_ndarray_as_struct` is set,
+        and none of its fields are of dtype ``'object'``; it is set to
+        ``'struct'`` overriding anything else.
 
 
 Python.Shape
@@ -255,13 +259,35 @@ MATLAB_fields
 
 MATLAB Attribute
 
-complicated array of string arrays not supported by h5py
+numpy array of vlen numpy arrays of ``'S1'``
+
+.. versionchanged:: 0.1.2
+   Support for this Attribute added. Was deleted upon writing and
+   ignored when reading before.
 
 For MATLAB structures, MATLAB sets this field to all of the field names
 of the structure. If this Attribute is missing, MATLAB does not seem to
-care. Trying to set it to a differently formatted array of strings that
-the h5py package can handle causes an error in MATLAB when the file is
-imported, so this package does not set this Attribute at all.
+care. Can only be set or read properly for h5py version ``2.3`` and
+newer. Trying to set it to a differently formatted array of strings that
+older versions of h5py can handle causes an error in MATLAB when the file
+is imported, so this package does not set this Attribute at all for h5py
+versions before ``2.3``.
+  
+The Attribute is an array of variable length arrays of single character
+ASCII numpy strings (vlen of ``'S1'``). If there are two fields named
+``'a'`` and ``'cd'``, it is created like so::
+  
+  fields = ['a', 'cd']
+  dt = h5py.special_dtype(vlen=np.dtype('S1'))
+  fs = np.empty(shape=(len(fields),), dtype=dt)
+  for i, s in enumerate(fields):
+      fs[i] = np.array([c.encode('ascii') for c in s],
+                       dtype='S1')
+
+Then ``fs`` looks like::
+  
+  array([array([b'a'], dtype='|S1'),
+         array([b'c', b'd'], dtype='|S1')], dtype=object)
 
 
 Storage of Special Types
@@ -310,26 +336,29 @@ Structure Like Data
 
 When storing data that is MATLAB struct like (``dict`` or structured
 ``np.ndarray`` when
-:py:attr:`Options.structured_numpy_ndarray_as_struct` is set), it is
-stored as an HDF5 Group with its contents of its fields written inside
-of the Group. For single element data (``dict`` or structured
-``np.ndarray`` with only a single element), the fields are written to
-Datasets inside the Group. For multi-element data, the elements for
-each field are written in :py:attr:`Options.group_for_references` and
-an HDF5 Reference array to all of those elements is written as a Dataset
-under the field name in the Groups.
+:py:attr:`Options.structured_numpy_ndarray_as_struct` is set or any of
+its fields are of dtype ``'object'``), it is stored as an HDF5 Group
+with the contents of its fields written inside of the Group. For single
+element data (``dict`` or structured ``np.ndarray`` with only a single
+element), the fields are written to Datasets inside the Group. For
+multi-element data, the elements for each field are written in
+:py:attr:`Options.group_for_references` and an HDF5 Reference array to
+all of those elements is written as a Dataset under the field name in
+the Group. Otherwise, it is written as is as a Dataset that is an
+HDF5 COMPOUND type.
 
-.. note::
+
+.. warning::
 
    If it has no elements and
    :py:attr:`Options.structured_numpy_ndarray_as_struct` is set, it
    can't be read back from the file accurately. The dtype for all the
    fields will become 'object' instead of what they originally were.
 
-.. note::
+.. warning::
 
-   In Python 2, importing structured ``np.ndarray`` s if any of their
-   fields have characters outside of ASCII.
+   In Python 2, importing structured ``np.ndarray`` s will produce an
+   error if any of their fields have characters outside of ASCII.
 
 
 Optional Data Transformations
@@ -375,12 +404,14 @@ structured_numpy_ndarray_as_struct
 ``bool``
 
 Whether ``np.ndarray`` types (or things converted to them) should be
-written as structures/Groups if their dtype has fields. A dtype with
-fields looks like ``np.dtype([('a', np.uint16), ('b': np.float32)])``.
-If an array satisfies this criterion and the option is set, rather than
-writing the data as a single Dataset, it is written as a Group with the
-contents of the individual fields written as Datasets within it. This
-option is set to ``True`` implicitly by ``matlab_compatible``.
+written as structures/Groups if their dtype has fields. If any of the
+fields' dtypes is ``'object'``, this option is treated as if it were
+``True``. A dtype with fields looks like
+``np.dtype([('a', np.uint16), ('b', np.float32)])``. If an array
+satisfies this criterion and the option is set, rather than writing the
+data as a single Dataset, it is written as a Group with the contents of
+the individual fields written as Datasets within it. This option is set
+to ``True`` implicitly by ``matlab_compatible``.
 
 make_at_least_2d
 ----------------
@@ -457,8 +488,8 @@ type they are read as if there is no Python metadata attached to them.
 MATLAB Class     Version  Python Type
 ===============  =======  =================================
 logical          0.1      np.bool\_
-single           0.1      np.float32 or np.complex64 [10]_
-double           0.1      np.float64 or np.complex128 [10]_
+single           0.1      np.float32 or np.complex64 [11]_
+double           0.1      np.float64 or np.complex128 [11]_
 uint8            0.1      np.uint8
 uint16           0.1      np.uint16
 uint32           0.1      np.uint32
@@ -473,4 +504,4 @@ cell             0.1      np.object\_
 canonical empty  0.1      ``np.float64([])``
 ===============  =======  =================================
 
-.. [10] Depends on whether there is a complex part or not.
+.. [11] Depends on whether there is a complex part or not.

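Editorial note: the h5py version thresholds and the ``MATLAB_fields`` layout described in this commit can be sketched in plain Python 3. This is only a minimal illustration under stated assumptions, not code from hdf5storage. The helper names (``parse_version``, ``supports_float16``, ``supports_matlab_fields``, ``field_names_to_chars``, ``check_dict_keys``) are hypothetical, and the version strings are passed in by hand rather than read from h5py. The thresholds ``2.2`` and ``2.3`` are the ones stated in the documentation above.

```python
from itertools import takewhile


def parse_version(version):
    """Return the leading (major, minor) numbers of a version string."""
    numbers = []
    for piece in version.split('.')[:2]:
        # Keep only the leading digits, so '3rc1' is read as 3.
        digits = ''.join(takewhile(str.isdigit, piece))
        numbers.append(int(digits) if digits else 0)
    return tuple(numbers)


def supports_float16(h5py_version):
    # Per the docs above: np.float16 needs h5py 2.2 or newer.
    return parse_version(h5py_version) >= (2, 2)


def supports_matlab_fields(h5py_version):
    # Per the docs above: 'MATLAB_fields' needs h5py 2.3 or newer.
    return parse_version(h5py_version) >= (2, 3)


def field_names_to_chars(fields):
    """Split each field name into single-character ASCII byte strings.

    This mirrors the shape of the vlen 'S1' arrays shown in the docs,
    but uses plain lists instead of numpy arrays.
    """
    return [[c.encode('ascii') for c in name] for name in fields]


def check_dict_keys(data):
    """Raise before anything is written if a key is not a text string."""
    bad = [key for key in data if not isinstance(key, str)]
    if bad:
        raise TypeError('dict keys must be str, got: %r' % (bad,))


print(field_names_to_chars(['a', 'cd']))
print(supports_matlab_fields('2.2.1'), supports_matlab_fields('2.3.0'))
```

Running the key check before any write is the ordering the 0.1.2 release notes describe: an invalid key is rejected while the file is still untouched.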
-- 
Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/debian-science/packages/python-hdf5storage.git


