[h5py] 154/455: Finished user guide

Ghislain Vaillant ghisvail-guest at moszumanska.debian.org
Thu Jul 2 18:19:27 UTC 2015


This is an automated email from the git hooks/post-receive script.

ghisvail-guest pushed a commit to annotated tag 1.3.0
in repository h5py.

commit 30cdef04158dad7ea3830cab9745a5abeb5ed637
Author: andrewcollette <andrew.collette at gmail.com>
Date:   Sat Nov 15 04:56:37 2008 +0000

    Finished user guide
---
 README.txt                     |   2 +-
 docs/source/_static/h5py.css   |  15 +-
 docs/source/guide/datasets.rst | 320 ------------------------
 docs/source/guide/hl.rst       | 551 +++++++++++++++++++++++++++++++++++++++--
 docs/source/guide/index.rst    |   2 -
 docs/source/guide/quick.rst    |   2 +
 docs/source/guide/threads.rst  |  89 -------
 h5py/highlevel.py              |   2 +-
 setup.py                       |  13 +-
 9 files changed, 549 insertions(+), 447 deletions(-)

diff --git a/README.txt b/README.txt
index 467a18b..2c5298a 100644
--- a/README.txt
+++ b/README.txt
@@ -2,7 +2,7 @@ README for the "h5py" Python/HDF5 interface
 ===========================================
 Copyright (c) 2008 Andrew Collette
 
-Version 0.3.1
+Version 1.0.0
 
 * http://h5py.alfven.org        Main site, docs, quick-start guide
 * http://h5py.googlecode.com    Downloads, FAQ and bug tracker
diff --git a/docs/source/_static/h5py.css b/docs/source/_static/h5py.css
index 58ee017..cf5da6d 100644
--- a/docs/source/_static/h5py.css
+++ b/docs/source/_static/h5py.css
@@ -97,7 +97,7 @@ table.highlighttable td {
 cite, code, tt {
     font-family: 'Consolas', 'Deja Vu Sans Mono', 'Bitstream Vera Sans Mono', monospace;
     font-size: 0.95em;
-    letter-spacing: 0.01em;
+    /*letter-spacing: 0.01em;*/
 }
 
 hr {
@@ -106,9 +106,10 @@ hr {
 }
 
 tt {
-    background-color: #f2f2f2;
-    border-bottom: 1px solid #ddd;
-    color: #333;
+    /*background-color: #f2f2f2;
+    border-bottom: 1px solid #ddd;*/
+    font-weight: bold;
+    /*color: #333;*/
 }
 
 tt.descname {
@@ -116,11 +117,13 @@ tt.descname {
     font-weight: bold;
     font-size: 1.2em;
     border: 0;
+    letter-spacing: 0.01em;
 }
 
 tt.descclassname {
     background-color: transparent;
     border: 0;
+    letter-spacing: 0.01em;
 }
 
 tt.xref {
@@ -173,7 +176,7 @@ dd {
 
 dt:target,
 .highlight {
-    background-color: #fbe54e;
+    background-color: #eee; /*#fbe54e;*/
 }
 
 dl.glossary dt {
@@ -259,7 +262,7 @@ div.bodywrapper {
 }
 
 div.body a {
-    text-decoration: underline;
+    text-decoration: none;
 }
 
 div.sphinxsidebar {
diff --git a/docs/source/guide/datasets.rst b/docs/source/guide/datasets.rst
deleted file mode 100644
index 526a5f5..0000000
--- a/docs/source/guide/datasets.rst
+++ /dev/null
@@ -1,320 +0,0 @@
-.. _Datasets:
-
-**************
-Using Datasets
-**************
-
-Datasets are where most of the information in an HDF5 file resides.  Like
-NumPy arrays, they are homogenous collections of data elements, with an
-immutable datatype and (hyper)rectangular shape.  Unlike NumPy arrays, they
-support a variety of transparent storage features such as compression,
-error-detection, and chunked I/O.
-
-Metadata can be associated with an HDF5 dataset in the form of an "attribute".
-It's recommended that you use this scheme for any small bits of information
-you want to associate with the dataset.  For example, a descriptive title,
-digitizer settings, or data collection time are appropriate things to store
-as HDF5 attributes.
-
-
-Opening an existing dataset
-===========================
-
-Since datasets reside in groups, the best way to retrive a dataset is by
-indexing the group directly:
-
-    >>> dset = grp["Dataset Name"]
-
-You can also open a dataset by passing the group and name directly to the
-constructor:
-
-    >>> dset = Dataset(grp, "Dataset Name")
-
-No options can be specified when opening a dataset, as almost all properties
-of datasets are immutable.
-
-
-Creating a dataset
-==================
-
-There are two ways to explicitly create a dataset, with nearly identical
-syntax.  The recommended procedure is to use a method on the Group object in
-which the dataset will be stored:
-
-    >>> dset = grp.create_dataset("Dataset Name", ...options...)
-
-Or you can call the Dataset constructor.  When providing more than just the
-group and name, the constructor will try to create a new dataset:
-
-    >>> dset = Dataset(grp, "Dataset name", ...options...)
-
-Bear in mind that if an object of the same name already exists in the group,
-you will have to manually unlink it first:
-
-    >>> "Dataset Name" in grp
-    True
-    >>> del grp["Dataset name"]
-    >>> dset = grp.create_dataset("Dataset Name", ...options...)
-
-Logically, there are two ways to specify a dataset; you can tell HDF5 its
-shape and datatype explicitly, or you can provide an existing ndarray from
-which the shape, dtype and contents will be determined.  The following options
-are used to communicate this information.
-
-
-Arguments and options
----------------------
-
-All options below can be given to either the Dataset constructor or the
-Group method create_dataset.  They are listed in the order the arguments are
-taken for both methods.  Default values are in *italics*.
-
-*   **shape** = *None* or tuple(<dimensions>)
-
-    A Numpy-style shape tuple giving the dataset dimensions.  Required if
-    option **data** isn't provided.
-
-*   **dtype** = *None* or NumPy dtype
-
-    A NumPy dtype, or anything from which a dtype can be determined.
-    This sets the datatype.  If this is omitted, the dataset will
-    consist of single-precision floats, in native byte order ("=f4").
-
-*   **data** = *None* or ndarray
-
-    A NumPy array.  The dataset shape and dtype will be determined from
-    this array, and the dataset will be initialized to its contents.
-    Required if option **shape** isn't provided.
-
-*   **chunks** = *None* or tuple(<chunk dimensions>)
-
-    Manually set the HDF5 chunk size.
-
-    When using any of the following options like compression or error-
-    detection, the dataset is stored in chunked format, as small atomic
-    pieces of data on which the filters operate.  These chunks are then
-    indexed by B-trees.  Ordinarily h5py will guess a chunk value.  If you
-    know what you're doing, you can override that value here.
-
-*   **compression** = *None* or int(0-9)
-
-    Enable the use of GZIP compression, at the given integer level.  The
-    dataset will be stored in chunked format.
-
-*   **shuffle** = True / *False*
-
-    Enable the shuffle filter, possibly increasing the GZIP compression
-    ratio.  The dataset will be stored in chunked format.
-
-*   **fletcher32** = True / *False*
-
-    Enable Fletcher32 error-detection.  The dataset will be stored in
-    chunked format.
-
-*   **maxshape** = *None* or tuple(<dimensions>)
-
-    If provided, the dataset will be stored in a chunked and extendible fashion.
-    The value provided should be a tuple of integers indicating the maximum
-    size of each axis.  You can provide a value of "None" for any axis to
-    indicate that the maximum size of that dimension is unlimited.
-
-Automatic creation
-------------------
-
-If you've already got a NumPy array you want to store, you can let h5py guess
-these options for you.  Simply assign the array to a Group entry:
-
-    >>> arr = numpy.ones((100,100), dtype='=f8')
-    >>> my_group["MyDataset"] = arr
-
-The object you provide doesn't even have to be an ndarray; if it isn't, h5py
-will create an intermediate NumPy representation before storing it.
-The resulting dataset is stored contiguously, with no compression or chunking.
-
-.. note::
-    Arrays are auto-created using the NumPy ``asarray`` function.  This means
-    that if you try to create a dataset from a string, you'll get a *scalar*
-    dataset containing the string itself!  To get a char array, pass in
-    something like ``numpy.fromstring(mystring, '|S1')`` instead.
-
-
-Data Access and Slicing
-=======================
-
-A subset of the NumPy indexing techniques is supported, including the
-traditional extended-slice syntax, named-field access, and boolean arrays.
-Discrete coordinate selection are also supported via an special indexer class.
-
-Properties
-----------
-
-Like Numpy arrays, Dataset objects have attributes named "shape" and "dtype":
-
-    >>> dset.dtype
-    dtype('complex64')
-    >>> dset.shape
-    (4L, 5L)
-
-Slicing access
---------------
-
-The best way to get at data is to use the traditional NumPy extended-slicing
-syntax.   Slice specifications are translated directly to HDF5 *hyperslab*
-selections, and are are a fast and efficient way to access data in the file.
-The following slicing arguments are recognized:
-
-    * Numbers: anything that can be converted to a Python long
-    * Slice objects: please note negative values are not allowed
-    * Field names, in the case of compound data
-    * At most one ``Ellipsis`` (``...``) object
-
-Here are a few examples (output omitted)
-
-    >>> dset = f.create_dataset("MyDataset", data=numpy.ones((10,10,10),'=f8'))
-    >>> dset[0,0,0]
-    >>> dset[0,2:10,1:9:3]
-    >>> dset[0,...]
-    >>> dset[:,::2,5]
-
-Simple array broadcasting is also supported:
-
-    >>> dset[0]   # Equivalent to dset[0,...]
-
-For compound data, you can specify multiple field names alongside the
-numeric slices:
-
-    >>> dset["FieldA"]
-    >>> dset[0,:,4:5, "FieldA", "FieldB"]
-    >>> dset[0, ..., "FieldC"]
-
-Advanced indexing
------------------
-
-Boolean "mask" arrays can also be used to specify a selection.  The result of
-this operation is a 1-D array with elements arranged in the standard NumPy
-(C-style) order:
-
-    >>> arr = numpy.random.random((10,10))
-    >>> dset = f.create_dataset("MyDataset", data=arr)
-    >>> result = dset[arr > 0.5]
-
-If you have a set of discrete points you want to access, you may not want to go
-through the overhead of creating a boolean mask.  This is especially the case
-for large datasets, where even a byte-valued mask may not fit in memory.  You
-can pass a list of points to the dataset selector via a custom "CoordsList"
-instance:
-
-    >>> mycoords = [ (0,0), (3,4), (7,8), (3,5), (4,5) ]
-    >>> coords_list = CoordsList(mycoords)
-    >>> result = dset[coords_list]
-
-Like boolean-array indexing, the result is a 1-D array.  The order in which
-points are selected is preserved.
-
-.. note::
-    These two techniques rely on an HDF5 construct which explicitly enumerates the
-    points to be selected.  It's very flexible but most appropriate for 
-    reasonably-sized (or sparse) selections.  The coordinate list takes at
-    least 8*<rank> bytes per point, and may need to be internally copied.  For
-    example, it takes 40MB to express a 1-million point selection on a rank-3
-    array.  Be careful, especially with boolean masks.
-
-Value attribute and scalar datasets
------------------------------------
-
-HDF5 allows you to store "scalar" datasets.  These have the shape "()".  You
-can use the syntax ``dset[...]`` to recover the value as an 0-dimensional
-array.  Also, the special attribute ``value`` will return a scalar for an 0-dim
-array, and a full n-dimensional array for all other cases:
-
-    >>> f["ArrayDS"] = numpy.ones((2,2))
-    >>> f["ScalarDS"] = 1.0
-    >>> f["ArrayDS"].value
-    array([[ 1.,  1.],
-           [ 1.,  1.]])
-    >>> f["ScalarDS"].value
-    1.0
-
-Extending Datasets
-------------------
-
-If the dataset is created with the *maxshape* option set, you can later expand
-its size.  Simply call the *extend* method:
-
-    >>> dset = f.create_dataset("MyDataset", (5,5), maxshape=(None,None))
-    >>> dset.shape
-    (5, 5)
-    >>> dset.extend((15,20))
-    >>> dset.shape
-    (15, 20)
-
-More on Datatypes
-=================
-
-Storing compound data
----------------------
-
-You can store "compound" data (struct-like, using named fields) using the Numpy
-facility for compound data types.  For example, suppose we have data that takes
-the form of (temperature, voltage) pairs::
-
-    >>> import numpy
-    >>> mydtype = numpy.dtype([('temp','=f4'),('voltage','=f8')])
-    >>> dset = f.create_dataset("MyDataset", (20,30), mydtype)
-    >>> dset
-    Dataset "MyDataset": (20L, 30L) dtype([('temp', '<f4'), ('voltage', '<f8')])
-    
-These types may contain any supported type, and be arbitrarily nested.
-
-.. _supported:
-
-Supported types
------------------
-
-The HDF5 type system is mostly a superset of its NumPy equivalent.  The
-following are the NumPy types currently supported by the interface:
-
-    ========    ==========  ==========  ===============================
-    Datatype    NumPy kind  HDF5 class  Notes
-    ========    ==========  ==========  ===============================
-    Integer     i, u        INTEGER
-    Float       f           FLOAT
-    Complex     c           COMPOUND    Stored as an HDF5 struct
-    Array       V           ARRAY       NumPy array with "subdtype"
-    Opaque      V           OPAQUE      Stored as HDF5 fixed-length opaque
-    Compound    V           COMPOUND    May be arbitarily nested
-    String      S           STRING      Stored as HDF5 fixed-length C-style strings
-    ========    ==========  ==========  ===============================
-
-Byte order is always preserved.  The following additional features are known
-not to be supported:
-
-    * Read/write HDF5 variable-length (VLEN) data
-
-      No obvious way exists to handle variable-length data in NumPy.
-
-    * NumPy object types (dtype "O")
-
-      This could potentially be solved by pickling, but requires low-level
-      VLEN infrastructure.
-
-    * HDF5 enums
-
-      There's no NumPy dtype support for enums.  Enum data is read as plain
-      integer data.  However, the low-level conversion routine
-      ``h5t.py_create`` can create an HDF5 enum from a integer dtype and a
-      dictionary of names.
-    
-    * HDF5 "time" datatype
-
-      This datatype is deprecated, and has no close NumPy equivalent.
-
-    
-     
-
-
-
-
-
-
-
diff --git a/docs/source/guide/hl.rst b/docs/source/guide/hl.rst
index f60c832..3feb871 100644
--- a/docs/source/guide/hl.rst
+++ b/docs/source/guide/hl.rst
@@ -1,16 +1,180 @@
 
-*********
-Reference
-*********
+*************
+Documentation
+*************
 
 .. module:: h5py.highlevel
 
+The high-level interface is the most convenient way to talk to HDF5.  There
+are three main abstractions: files, groups, and datasets. Each is documented
+separately below.
+
+You may want to read the :ref:`quick start guide <quick>` to get a general
+overview.
+
+Everything useful in this module is automatically exported to the ``h5py``
+package namespace; you can do::
+
+    >>> from h5py import *  # or from h5py import File, etc.
+
+General info
+============
+
+Paths in HDF5
+-------------
+
+HDF5 files are organized like a filesystem.  :class:`Group` objects work like
+directories; they can contain other groups and :class:`Dataset` objects.  Like
+a POSIX filesystem, objects are specified by ``/``-separated names, with the
+root group ``/`` (represented by the :class:`File` class) at the base.
+
+Wherever a name or path is called for, it may be relative or absolute.
+Constructs like ``..`` (parent group) are allowed.
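+
+For instance, either of the following would open the same dataset (a small
+sketch; the group and dataset names are hypothetical)::
+
+    >>> dset = file_obj['/subgroup/MyDataset']      # absolute path
+    >>> dset = file_obj['subgroup']['MyDataset']    # relative lookups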
+
+Metadata
+--------
+
+Every object in HDF5 supports metadata in the form of "attributes", which are
+small, named bits of data.  :class:`Group`, :class:`Dataset` and even
+:class:`File` objects each carry a dictionary-like object which exposes this
+behavior, named ``<obj>.attrs``.  This is the correct way to store metadata
+in HDF5 files.
+
+Library configuration
+---------------------
+
+A few options are available to change the library's behavior.  You can get a
+reference to the global configuration object via the
+function ``h5py.get_config()``.  This object supports the following attributes:
+
+    **complex_names**
+        Set to a 2-tuple of strings (real, imag) to control how complex numbers
+        are saved.  The default is ('r','i').
+
+Threading
+---------
+
+H5py is now always thread-safe.
+
+
+Files
+=====
+
+To open an HDF5 file, just instantiate the File object directly::
+
+    >>> from h5py import File  # or import *
+    >>> file_obj = File('myfile.hdf5', 'r')
+
+Valid modes (like Python's file() modes) are:
+
+    ===  ================================================
+     r   Readonly, file must exist
+     r+  Read/write, file must exist
+     w   Create file, truncate if exists
+     w-  Create file, fail if exists
+     a   Read/write if exists, create otherwise (default)
+    ===  ================================================
+
+Like Python files, you must close the file when done::
+
+    >>> file_obj.close()
+
+File objects can also be used as "context managers" along with the new Python
+``with`` statement.  When used in a ``with`` block, they will be closed at
+the end of the block regardless of what exceptions have been raised::
+
+    >>> with File('myfile.hdf5', 'r') as file_obj:
+    ...    # do stuff with file_obj
+    ...
+    >>> # file_obj is closed at end of block
+
+.. note::
+
+    In addition to the methods and properties listed below, File objects also
+    have all the methods and properties of :class:`Group` objects.  In this
+    case the group in question is the HDF5 *root group* (``/``).
+
+Reference
+---------
+
+.. class:: File
+
+    Represents an HDF5 file on disk.
+
+    .. attribute:: name
+
+        HDF5 filename
+
+    .. attribute:: mode
+
+        Mode used to open file
+
+    .. method:: __init__(name, mode='a')
+        
+        Open or create an HDF5 file.
+
+    .. method:: close()
+
+        Close the file.  You MUST do this before quitting Python or data may
+        be lost.
+
+    .. method:: flush()
+
+        Ask the HDF5 library to flush its buffers for this file.
+
+
 Groups
 ======
 
+Groups are the container mechanism by which HDF5 files are organized.  From
+a Python perspective, they operate somewhat like dictionaries.  In this case
+the "keys" are the names of group entries, and the "values" are the entries
+themselves (:class:`Group` and :class:`Dataset` objects).  Objects are
+retrieved from the file using the standard indexing notation::
+
+    >>> file_obj = File('myfile.hdf5')
+    >>> subgroup = file_obj['/subgroup']
+    >>> dset = subgroup['MyDataset']  # full name /subgroup/MyDataset
+
+Objects can be deleted from the file using the standard syntax::
+
+    >>> del subgroup["MyDataset"]
+
+However, new groups and datasets should generally be created using method calls
+like :meth:`create_group <Group.create_group>` or
+:meth:`create_dataset <Group.create_dataset>`.
+Assigning a name to an existing Group or Dataset
+(e.g. ``group['name'] = another_group``) will create a new link in the file
+pointing to that object.  Assigning dtypes and NumPy arrays results in
+different behavior; see :meth:`Group.__setitem__` for details.
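+
+For example, creating a second hard link to an existing group (a sketch; the
+names are hypothetical)::
+
+    >>> grp = file_obj.create_group("original")
+    >>> file_obj["alias"] = grp      # both names now refer to the same group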
+
+In addition, the following behavior approximates the Python dictionary API
+(a short sketch follows the list):
+
+    - Container syntax (``if name in group``)
+    - Iteration yields member names (``for name in group``)
+    - Length (``len(group)``)
+    - :meth:`listnames <Group.listnames>`
+    - :meth:`iternames <Group.iternames>`
+    - :meth:`listobjects <Group.listobjects>`
+    - :meth:`iterobjects <Group.iterobjects>`
+    - :meth:`listitems <Group.listitems>`
+    - :meth:`iteritems <Group.iteritems>`
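+
+A short sketch of the dictionary-style behavior (the file contents here are
+hypothetical)::
+
+    >>> len(file_obj)
+    2
+    >>> list(file_obj)               # iteration yields member names
+    ['MyDataset', 'subgroup']
+    >>> 'subgroup' in file_obj
+    True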
+
+Reference
+---------
+
 .. class:: Group
 
-    .. method:: __getitem__(name)
+    .. attribute:: name
+
+        Full name of this group in the file (e.g. ``/grp/thisgroup``)
+
+    .. attribute:: attrs
+
+        Dictionary-like object which provides access to this group's
+        HDF5 attributes.  See :ref:`attributes` for details.
+
+    .. method:: __getitem__(name) -> Group or Dataset
 
         Open an object in this group.
 
@@ -20,19 +184,20 @@ Groups
 
         The action taken depends on the type of object assigned:
 
-        Named HDF5 object (Dataset, Group, Datatype)
+        **Named HDF5 object** (Dataset, Group, Datatype)
             A hard link is created in this group which points to the
             given object.
 
-        Numpy ndarray
+        **Numpy ndarray**
             The array is converted to a dataset object, with default
             settings (contiguous storage, etc.). See :meth:`create_dataset`
             for a more flexible way to do this.
 
-        Numpy dtype
-            Commit a copy of the datatype as a named datatype in the file.
+        **Numpy dtype**
+            Commit a copy of the datatype as a
+            :ref:`named datatype <named_types>` in the file.
 
-        Anything else
+        **Anything else**
             Attempt to convert it to an ndarray and store it.  Scalar
             values are stored as scalar datasets. Raise ValueError if we
             can't understand the resulting array dtype.
@@ -44,18 +209,6 @@ Groups
 
         Remove (unlink) this member.
 
-    .. method:: __len__
-
-        Number of group members
-
-    .. method:: __iter__
-
-        Yields the names of group members
-
-    .. method:: __contains__(name)
-
-        See if the given name is in this group.
-
     .. method:: create_group(name) -> Group
 
         Create a new HDF5 group.
@@ -137,7 +290,9 @@ Groups
             Destination.  Must be either Group or path.  If a Group object, it may
             be in a different file.
 
-    .. method:: visit(func)
+        **Only available with HDF5 1.8.X.**
+
+    .. method:: visit(func) -> None or return value from func
 
         Recursively iterate a callable over objects in this group.
 
@@ -145,7 +300,7 @@ Groups
         will be called exactly once for each link in this group and every
         group below it. Your callable must conform to the signature::
 
-            func(<member name>) => <None or return value>
+            func(<member name>) -> <None or return value>
 
         Returning None continues iteration, returning anything else stops
         and immediately returns that value from the visit method.  No
@@ -160,7 +315,7 @@ Groups
 
         **Only available with HDF5 1.8.X.**
 
-    .. method:: visititems(func)
+    .. method:: visititems(func) -> None or return value from func
 
         Recursively visit names and objects in this group and subgroups.
 
@@ -168,7 +323,7 @@ Groups
         will be called exactly once for each link in this group and every
         group below it. Your callable must conform to the signature::
 
-            func(<member name>, <object>) => <None or return value>
+            func(<member name>, <object>) -> <None or return value>
 
         Returning None continues iteration, returning anything else stops
         and immediately returns that value from the visit method.  No
@@ -187,6 +342,18 @@ Groups
 
         **Only available with HDF5 1.8.X.**
 
+    .. method:: __len__
+
+        Number of group members
+
+    .. method:: __iter__
+
+        Yields the names of group members
+
+    .. method:: __contains__(name)
+
+        See if the given name is in this group.
+
     .. method:: listnames
 
         Get a list of member names
@@ -212,9 +379,343 @@ Groups
         Get an iterator over (name, object) pairs for the members of this group.
 
 
+Datasets
+========
+
+Datasets are where most of the information in an HDF5 file resides.  Like
+NumPy arrays, they are homogeneous collections of data elements, with an
+immutable datatype and (hyper)rectangular shape.  Unlike NumPy arrays, they
+support a variety of transparent storage features such as compression,
+error-detection, and chunked I/O.
+
+Metadata can be associated with an HDF5 dataset in the form of an "attribute".
+It's recommended that you use this scheme for any small bits of information
+you want to associate with the dataset.  For example, a descriptive title,
+digitizer settings, or data collection time are appropriate things to store
+as HDF5 attributes.
+
+Datasets are created using either :meth:`Group.create_dataset` or
+:meth:`Group.require_dataset`.  Existing datasets should be retrieved using
+the group indexing syntax (``dset = group["name"]``). Calling the constructor
+directly is not recommended.
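+
+For example (a sketch; the keyword arguments shown are the documented
+:meth:`create_dataset <Group.create_dataset>` options)::
+
+    >>> dset = grp.create_dataset("MyDataset", shape=(100, 100), dtype='=f8',
+    ...                           compression=6, shuffle=True)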
+
+A subset of the NumPy indexing techniques is supported, including the
+traditional extended-slice syntax, named-field access, and boolean arrays.
+Discrete coordinate selections are also supported via a special indexer class.
+
+Properties
+----------
+
+Like Numpy arrays, Dataset objects have attributes named "shape" and "dtype":
+
+    >>> dset.dtype
+    dtype('complex64')
+    >>> dset.shape
+    (4L, 5L)
+
+.. _slicing_access:
+
+Slicing access
+--------------
+
+The best way to get at data is to use the traditional NumPy extended-slicing
+syntax.  Slice specifications are translated directly to HDF5 *hyperslab*
+selections, and are a fast and efficient way to access data in the file.
+The following slicing arguments are recognized:
+
+    * Numbers: anything that can be converted to a Python long
+    * Slice objects: please note negative values are not allowed
+    * Field names, in the case of compound data
+    * At most one ``Ellipsis`` (``...``) object
+
+Here are a few examples (output omitted):
+
+    >>> dset = f.create_dataset("MyDataset", data=numpy.ones((10,10,10),'=f8'))
+    >>> dset[0,0,0]
+    >>> dset[0,2:10,1:9:3]
+    >>> dset[0,...]
+    >>> dset[:,::2,5]
+
+Simple array broadcasting is also supported:
+
+    >>> dset[0]   # Equivalent to dset[0,...]
+
+For compound data, you can specify multiple field names alongside the
+numeric slices:
+
+    >>> dset["FieldA"]
+    >>> dset[0,:,4:5, "FieldA", "FieldB"]
+    >>> dset[0, ..., "FieldC"]
+
+Advanced indexing
+-----------------
+
+Boolean "mask" arrays can also be used to specify a selection.  The result of
+this operation is a 1-D array with elements arranged in the standard NumPy
+(C-style) order:
+
+    >>> arr = numpy.random.random((10,10))
+    >>> dset = f.create_dataset("MyDataset", data=arr)
+    >>> result = dset[arr > 0.5]
+
+If you have a set of discrete points you want to access, you may not want to go
+through the overhead of creating a boolean mask.  This is especially the case
+for large datasets, where even a byte-valued mask may not fit in memory.  You
+can pass a list of points to the dataset selector via a custom "CoordsList"
+instance:
+
+    >>> mycoords = [ (0,0), (3,4), (7,8), (3,5), (4,5) ]
+    >>> coords_list = CoordsList(mycoords)
+    >>> result = dset[coords_list]
+
+Like boolean-array indexing, the result is a 1-D array.  The order in which
+points are selected is preserved.
+
+.. note::
+    Boolean-mask and CoordsList indexing rely on an HDF5 construct which
+    explicitly enumerates the points to be selected.  It's very flexible but
+    most appropriate for reasonably-sized (or sparse) selections.  The
+    coordinate list takes at
+    least 8*<rank> bytes per point, and may need to be internally copied.  For
+    example, it takes 40MB to express a 1-million point selection on a rank-3
+    array.  Be careful, especially with boolean masks.
+
+Value attribute and scalar datasets
+-----------------------------------
+
+HDF5 allows you to store "scalar" datasets.  These have the shape "()".  You
+can use the syntax ``dset[...]`` to recover the value as a 0-dimensional
+array.  Also, the special attribute ``value`` will return a scalar for a 0-dim
+array, and a full n-dimensional array for all other cases:
+
+    >>> f["ArrayDS"] = numpy.ones((2,2))
+    >>> f["ScalarDS"] = 1.0
+    >>> f["ArrayDS"].value
+    array([[ 1.,  1.],
+           [ 1.,  1.]])
+    >>> f["ScalarDS"].value
+    1.0
+
+Extending Datasets
+------------------
+
+If the dataset is created with the *maxshape* option set, you can later expand
+its size.  Simply call the *extend* method:
+
+    >>> dset = f.create_dataset("MyDataset", (5,5), maxshape=(None,None))
+    >>> dset.shape
+    (5, 5)
+    >>> dset.extend((15,20))
+    >>> dset.shape
+    (15, 20)
+
+Length and iteration
+--------------------
+
+As with NumPy arrays, the ``len()`` of a dataset is the length of the first
+axis.  Since Python's ``len`` is limited by the size of a C long, it's
+recommended you use the syntax ``dataset.len()`` instead of ``len(dataset)``
+on 32-bit platforms, if you expect the length of the first axis to exceed 2**32.
+
+Iterating over a dataset iterates over the first axis.  As with NumPy arrays,
+mutating the yielded data has no effect.
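+
+A quick sketch (the shape is hypothetical)::
+
+    >>> dset.shape
+    (1000, 3)
+    >>> dset.len()            # like len(dset), but safe on 32-bit platforms
+    1000
+    >>> for row in dset:      # iterates over the first axis
+    ...     pass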
+
+Reference
+---------
+
+.. class:: Dataset
+
+    Represents an HDF5 dataset.  All properties are read-only.
+
+    .. attribute:: name
+
+        Full name of this dataset in the file (e.g. ``/grp/MyDataset``)
+
+    .. attribute:: attrs
+
+        Provides access to HDF5 attributes; see :ref:`attributes`.
+
+    .. attribute:: shape
+
+        Numpy-style shape tuple with dataset dimensions
+
+    .. attribute:: dtype
+
+        Numpy dtype object representing the dataset type
+
+    .. attribute:: value
+
+        Special read-only property; for a regular dataset, it's equivalent to
+        dset[:] (an ndarray with all points), but for a scalar dataset, it's
+        a NumPy scalar instead of a 0-dimensional ndarray.
+
+    .. attribute:: chunks
+
+        Dataset chunk size, or None if chunked layout isn't used.
+
+    .. attribute:: compression
+
+        GZIP compression level, or None if compression isn't used.
+
+    .. attribute:: shuffle
+
+        Is the shuffle filter being used? (T/F)
+
+    .. attribute:: fletcher32
+
+        Is the fletcher32 filter (error detection) being used? (T/F)
+
+    .. attribute:: maxshape
+
+        Maximum allowed size of the dataset, as specified when it was created.
+
+    .. method:: __getitem__(*args) -> NumPy ndarray
+
+        Read a slice from the dataset.  See :ref:`slicing_access`.
+
+    .. method:: __setitem__(*args, val)
+
+        Write to the dataset.  See :ref:`slicing_access`.
+
+    .. method:: extend(shape)
+
+        Expand the size of the dataset to this new shape.  Must be compatible
+        with the *maxshape* as specified when the dataset was created.
+
+    .. method:: __len__
+
+        The length of the first axis in the dataset (TypeError if scalar).
+        This **does not work** on 32-bit platforms, if the axis in question
+        is larger than 2**32.  Use :meth:`len` instead.
+
+    .. method:: len()
+
+        The length of the first axis in the dataset (TypeError if scalar).
+        Works on all platforms.
+
+    .. method:: __iter__
+
+        Iterate over rows (first axis) in the dataset.  TypeError if scalar.
+
+
+.. _attributes:
+
+Attributes
+==========
+
+Groups and datasets can have small bits of named information attached to them.
+This is the official way to store metadata in HDF5.  Each of these objects
+has a small proxy object (:class:`AttributeManager`) attached to it as
+``<obj>.attrs``.  This dictionary-like object works like a :class:`Group`
+object, with the following differences:
+
+    - Entries may only be scalars and NumPy arrays
+    - Each attribute must be small (recommended < 64k for HDF5 1.6)
+    - No partial I/O (i.e. slicing) is allowed for arrays
+
+They support the same dictionary API as groups (a short sketch follows the
+list), including the following:
+
+    - Container syntax (``if name in obj.attrs``)
+    - Iteration yields member names (``for name in obj.attrs``)
+    - Number of attributes (``len(obj.attrs)``)
+    - :meth:`listnames <AttributeManager.listnames>`
+    - :meth:`iternames <AttributeManager.iternames>`
+    - :meth:`listobjects <AttributeManager.listobjects>`
+    - :meth:`iterobjects <AttributeManager.iterobjects>`
+    - :meth:`listitems <AttributeManager.listitems>`
+    - :meth:`iteritems <AttributeManager.iteritems>`
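+
+A brief sketch of attribute access (the attribute names are hypothetical)::
+
+    >>> dset.attrs['title'] = "Raw digitizer data"
+    >>> dset.attrs['sample_rate'] = 100e3
+    >>> dset.attrs['title']
+    'Raw digitizer data'
+    >>> 'title' in dset.attrs
+    True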
+
+Reference
+---------
+
+.. class:: AttributeManager
+
+    .. method:: __getitem__(name) -> NumPy scalar or ndarray
+
+        Retrieve an attribute given a string name.
+
+    .. method:: __setitem__(name, value)
+
+        Set an attribute.  Value must be convertible to a NumPy scalar
+        or array.
+
+    .. method:: __delitem__(name)
+
+        Delete an attribute.
+
+    .. method:: __len__
+
+        Number of attributes
+
+    .. method:: __iter__
+
+        Yields the names of attributes
+
+    .. method:: __contains__(name)
+
+        See if the given attribute is present
+
+    .. method:: listnames
+
+        Get a list of attribute names
+
+    .. method:: iternames
+
+        Get an iterator over attribute names
+
+    .. method:: listobjects
+
+        Get a list with all attribute values
+
+    .. method:: iterobjects
+
+        Get an iterator over attribute values
+
+    .. method:: listitems
+
+        Get a list of (name, value) pairs for all attributes.
+
+    .. method:: iteritems
+
+        Get an iterator over (name, value) pairs
+
+.. _named_types:
+
+Named types
+===========
+
+There is one last kind of object stored in an HDF5 file.  You can store
+datatypes (not associated with any dataset) in a group, simply by assigning
+a NumPy dtype to a name::
+
+    >>> group["name"] = numpy.dtype("<f8")
+
+and to get it back::
+
+    >>> named_type = group["name"]
+    >>> mytype = named_type.dtype
+
+Objects of this class are immutable and have no methods, just read-only
+properties.
+
+Reference
+---------
+
+.. class:: Datatype
+
+    .. attribute:: name
+
+        Full name of this object in the HDF5 file (e.g. ``/grp/MyType``)
+
+    .. attribute:: attrs
+
+        Attributes of this object (see :ref:`attributes section <attributes>`)
 
+    .. attribute:: dtype
 
+        NumPy dtype representation of this type
 
+    
 
 
 
diff --git a/docs/source/guide/index.rst b/docs/source/guide/index.rst
index fbd9413..6ed0f3a 100644
--- a/docs/source/guide/index.rst
+++ b/docs/source/guide/index.rst
@@ -10,9 +10,7 @@ User Guide
 
     build
     quick
-    datasets
     hl
-    threads
     licenses
 
 
diff --git a/docs/source/guide/quick.rst b/docs/source/guide/quick.rst
index 0ce2108..17b9e1c 100644
--- a/docs/source/guide/quick.rst
+++ b/docs/source/guide/quick.rst
@@ -1,3 +1,5 @@
+.. _quick:
+
 *****************
 Quick Start Guide
 *****************
diff --git a/docs/source/guide/threads.rst b/docs/source/guide/threads.rst
deleted file mode 100644
index 9cb8352..0000000
--- a/docs/source/guide/threads.rst
+++ /dev/null
@@ -1,89 +0,0 @@
-*********
-Threading
-*********
-
-Threading is an issue in h5py because HDF5 doesn't support thread-level
-concurrency.  Some versions of HDF5 are not even thread-safe.  The package
-tries to hide as much of these problems as possible using a combination of
-the GIL and Python-side reentrant locks.
-
-High-level
-----------
-
-The objects in h5py.highlevel (File, Dataset, etc) are always thread-safe.  You
-don't need to do any explicit locking, regardless of how the library is
-configured.
-
-Low-level
----------
-
-The low-level API (h5py.h5*) is also thread-safe, unless you use the
-experimental non-blocking option to compile h5py.  Then, and only then, you
-must acquire a global lock before calling into the low-level API.  This lock
-is available on the global configuration object at "h5py.config.lock".  The
-decorator "h5sync" in h5py.extras can wrap functions to do this automatically.
-
-
-Non-Blocking Routines
----------------------
-
-By default, all low-level HDF5 routines will lock the entire interpreter
-until they complete, even in the case of lengthy I/O operations.  This is
-unnecessarily restrictive, as it means even non-HDF5 threads cannot execute
-while a lengthy HDF5 read or write is in progress.
-
-When the package is compiled with the option ``--io-nonblock``, a few C methods
-involving I/O will release the global interpreter lock.  These methods always
-acquire the global HDF5 lock before yielding control to other threads.  While
-another thread seeking to acquire the HDF5 lock will block until the write
-completes, other Python threads (GUIs, pure computation threads, etc) will
-execute in a normal fashion.
-
-However, this defeats the thread safety provided by the GIL.  If another thread
-skips acquiring the HDF5 lock and blindly calls a low-level HDF5 routine while
-such I/O is in progress, the results are undefined.  In the worst case,
-irreversible data corruption and/or a crash of the interpreter is possible.
-
-**You must acquire the global lock (h5py.config.lock) when all the following
-are true:**
-
-    1. You are using the low-level (h5py.h5*) API
-    2. More than one thread is performing HDF5 operations
-    3. Non-blocking I/O (``--io-nonblock``) is enabled
-
-This is not an issue for the h5py.highlevel components (Dataset, Group,
-File objects, etc.) as they acquire the lock automatically.
-
-The following operations will release the GIL during I/O:
-    
-    * DatasetID.read
-    * DatasetID.write
-
-
-Customizing the lock type
--------------------------
-
-Applications that use h5py may have their own threading systems.  Since the
-h5py locks are acquired and released alongside application code, you can
-set the type of lock used internally by h5py.  The lock is stored as settable
-property "h5py.config.lock" and should be a lock instance (not a constructor)
-which provides the following methods:
-
-    * __enter__
-    * __exit__
-    * acquire
-    * release
-
-The default lock type is the native Python threading.RLock, but h5py makes no
-assumptions about the behavior or implementation of locks beyond reentrance and
-the existence of the four required methods above.
-
-It remains to be seen whether this is even necessary.  In future versions of
-h5py, this attribute may disappear or become non-writable.
-
-
-
-
-
-
-
diff --git a/h5py/highlevel.py b/h5py/highlevel.py
index 947731c..fb9cbc2 100644
--- a/h5py/highlevel.py
+++ b/h5py/highlevel.py
@@ -73,7 +73,7 @@ class LockableObject(object):
         Base class which provides rudimentary locking support.
     """
 
-    _lock = h5.get_phil()
+    _lock = threading.RLock()
 
 
 class HLObject(LockableObject):
diff --git a/setup.py b/setup.py
index 0a4c560..7c5d8c7 100644
--- a/setup.py
+++ b/setup.py
@@ -48,7 +48,7 @@ from distutils.cmd import Command
 
 # Basic package options
 NAME = 'h5py'
-VERSION = '0.4.0'
+VERSION = '1.0.0'
 MIN_NUMPY = '1.0.3'
 MIN_CYTHON = '0.9.8.1.1'
 SRC_PATH = 'h5py'      # Name of directory with .pyx files
@@ -60,6 +60,13 @@ MODULES = {16:  ['h5', 'h5f', 'h5g', 'h5s', 'h5t', 'h5d', 'h5a', 'h5p', 'h5z',
            18:  ['h5', 'h5f', 'h5g', 'h5s', 'h5t', 'h5d', 'h5a', 'h5p', 'h5z',
                  'h5i', 'h5r', 'h5fd', 'utils', 'h5o', 'h5l']}
 
+def version_check(vers, required):
+    """Compare dotted version strings as integer tuples,
+    e.g. version_check('0.9.10', '0.9.8.1.1') -> True."""
+    def _tpl(istr):
+        return tuple(int(x) for x in istr.split('.'))
+
+    return _tpl(vers) >= _tpl(required)
+
 def fatal(instring, code=1):
     print >> sys.stderr, "Fatal: "+instring
     exit(code)
@@ -274,8 +281,8 @@ DEF H5PY_THREADS = %(THREADS)d  # Enable thread-safety and non-blocking reads
         except ImportError:
             fatal("Cython recompilation required, but Cython >=%s not installed." % MIN_CYTHON)
 
-        if Version.version < MIN_CYTHON:
-            fatal("Old Cython version detected; at least %s required" % MIN_CYTHON)
+        if not version_check(Version.version, MIN_CYTHON):
+            fatal("Old Cython %s version detected; at least %s required" % (Version.version, MIN_CYTHON))
 
         print "Running Cython (%s)..." % Version.version
         print "  API level: %d" % self.api

-- 
Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/debian-science/packages/h5py.git



More information about the debian-science-commits mailing list