[numexpr] 01/01: Imported Upstream version 2.5
Antonio Valentino
a_valentino-guest at moszumanska.debian.org
Sat Feb 6 11:40:08 UTC 2016
This is an automated email from the git hooks/post-receive script.
a_valentino-guest pushed a commit to annotated tag upstream/2.5
in repository numexpr.
commit 20c1062dd5f894c9c6c737edb93559d6ae2a8be9
Author: Antonio Valentino <antonio.valentino at tiscali.it>
Date: Sat Feb 6 11:04:54 2016 +0000
Imported Upstream version 2.5
---
ANNOUNCE.rst | 9 ++--
AUTHORS.txt | 2 +
README.rst | 110 +++++++++++++++++++-----------------------
RELEASE_NOTES.rst | 12 ++++-
numexpr/cpuinfo.py | 26 +++++-----
numexpr/expressions.py | 31 +++++-------
numexpr/interp_body.cpp | 10 ++++
numexpr/interpreter.cpp | 52 +++++++++++++++-----
numexpr/module.cpp | 1 +
numexpr/module.hpp | 5 +-
numexpr/necompiler.py | 74 ++++++++++++++--------------
numexpr/opcodes.hpp | 39 +++++++++------
numexpr/tests/test_numexpr.py | 40 ++++++++++++++-
numexpr/version.py | 2 +-
14 files changed, 251 insertions(+), 162 deletions(-)
diff --git a/ANNOUNCE.rst b/ANNOUNCE.rst
index 664ee33..807878f 100644
--- a/ANNOUNCE.rst
+++ b/ANNOUNCE.rst
@@ -1,5 +1,5 @@
=========================
- Announcing Numexpr 2.4.6
+ Announcing Numexpr 2.5
=========================
Numexpr is a fast numerical expression evaluator for NumPy. With it,
@@ -21,9 +21,10 @@ don't want to adopt other solutions requiring more heavy dependencies.
What's new
==========
-This is a quick maintenance version that offers better handling of
-MSVC symbols (#168, Francesc Alted), as well as fising some
-UserWarnings in Solaris (#189, Graham Jones).
+In this version, a lock has been added so that numexpr can be called
+not from multithreaded apps. Mind that this does not prevent numexpr
+to use multiple cores internally. Also, a new min() and max()
+functions have been added. Thanks to contributors!
In case you want to know more in detail what has changed in this
version, see:
diff --git a/AUTHORS.txt b/AUTHORS.txt
index d727193..f43b249 100644
--- a/AUTHORS.txt
+++ b/AUTHORS.txt
@@ -20,3 +20,5 @@ enhancements.
Antonio Valentino contributed the port to Python 3.
Google Inc. contributed bug fixes.
+
+David Cox improved readability of the Readme.
diff --git a/README.rst b/README.rst
index 509bcfc..2c0a37c 100644
--- a/README.rst
+++ b/README.rst
@@ -25,7 +25,7 @@ expressions that operate on arrays (like "3*a+4*b") are accelerated
and use less memory than doing the same calculation in Python.
In addition, its multi-threaded capabilities can make use of all your
-cores, which may accelerate computations, most specially if they are
+cores -- which may accelerate computations, most specially if they are
not memory-bounded (e.g. those using transcendental functions).
Last but not least, numexpr can make use of Intel's VML (Vector Math
@@ -33,6 +33,34 @@ Library, normally integrated in its Math Kernel Library, or MKL).
This allows further acceleration of transcendent expressions.
+How Numexpr achieves high performance
+================================================
+
+The main reason why Numexpr achieves better performance than NumPy
+is that it avoids allocating memory for intermediate results. This
+results in better cache utilization and reduces memory access in
+general. Due to this, Numexpr works best with large arrays.
+
+Numexpr parses expressions into its own op-codes that are then used by
+an integrated computing virtual machine. The array operands are split
+into small chunks that easily fit in the cache of the CPU and passed to
+the virtual machine. The virtual machine then applies the operations
+on each chunk. It's worth noting that all temporaries and constants
+in the expression are also chunked.
+
+The result is that Numexpr can get the most of your machine computing
+capabilities for array-wise computations. Common speed-ups with regard
+to NumPy are usually between 0.95x (for very simple expressions
+like ’a + 1’) and 4x (for relatively complex ones like 'a*b-4.1*a > 2.5*b'),
+although much higher speed-ups can be achieved (up to 15x in some cases).
+
+Numexpr performs best on matrices that do not fit in CPU cache.
+In order to get a better idea on the different speed-ups
+that can be achieved on your platform, run the provided benchmarks.
+
+See more info about how Numexpr works in the `wiki <https://github.com/pydata/numexpr/wiki>`_.
+
+
Examples of use
===============
@@ -79,9 +107,9 @@ type inference rules, see below). Have this in mind when doing
estimations about the memory consumption during the computation of
your expressions.
-Also, the types in Numexpr conditions are somewhat stricter than those
-of Python. For instance, the only valid constants for booleans are
-`True` and `False`, and they are never automatically cast to integers.
+Also, the types in Numexpr conditions are somewhat more restrictive
+than those of Python. For instance, the only valid constants for booleans
+are `True` and `False`, and they are never automatically cast to integers.
Casting rules
@@ -128,7 +156,7 @@ Numexpr supports the set of operators listed below::
Supported functions
===================
-The next are the current supported set::
+Supported functions are listed below::
* where(bool, number1, number2): number
Number1 if the bool condition is true, number2 otherwise.
@@ -171,13 +199,13 @@ The next are the current supported set::
+ `contains()` only works with bytes strings, not unicode strings.
-More functions can be added if you need them.
+You may add additional functions as needed.
Supported reduction operations
==============================
-The next are the current supported set:
+The following reduction operations are currently supported::
* sum(number, axis=None): Sum of array elements over a given axis.
Negative axis are not supported.
@@ -185,6 +213,12 @@ The next are the current supported set:
* prod(number, axis=None): Product of array elements over a given
axis. Negative axis are not supported.
+ * min(number, axis=None): Minimum of array elements over a given
+ axis. Negative axis are not supported.
+
+ * max(number, axis=None): Maximum of array elements over a given
+ axis. Negative axis are not supported.
+
General routines
================
@@ -211,7 +245,7 @@ General routines
`set_vml_num_threads(nthreads)` to perform the parallel job with
VML instead. However, you should get very similar performance
with VML-optimized functions, and VML's parallelizer cannot deal
- with common expresions like `(x+1)*(x-2)`, while Numexpr's one
+ with common expressions like `(x+1)*(x-2)`, while Numexpr's one
can.
* detect_number_of_cores(): Detects the number of cores in the
@@ -222,9 +256,9 @@ Intel's VML specific support routines
=====================================
When compiled with Intel's VML (Vector Math Library), you will be able
-to use some additional functions for controlling its use. These are:
+to use some additional functions for controlling its use. These are outlined below::
-* set_vml_accuracy_mode(mode): Set the accuracy for VML operations.
+ * set_vml_accuracy_mode(mode): Set the accuracy for VML operations.
The `mode` parameter can take the values:
- 'low': Equivalent to VML_LA - low accuracy VML functions are called
@@ -234,66 +268,20 @@ The `mode` parameter can take the values:
It returns the previous mode.
This call is equivalent to the `vmlSetMode()` in the VML library.
-See:
-
-http://www.intel.com/software/products/mkl/docs/webhelp/vml/vml_DataTypesAccuracyModes.html
-for more info on the accuracy modes.
+::
-* set_vml_num_threads(nthreads): Suggests a maximum number of
- threads to be used in VML operations.
+ * set_vml_num_threads(nthreads): Suggests a maximum number of
+ threads to be used in VML operations.
This function is equivalent to the call
`mkl_domain_set_num_threads(nthreads, MKL_DOMAIN_VML)` in the MKL library.
-See:
-
-http://www.intel.com/software/products/mkl/docs/webhelp/support/functn_mkl_domain_set_num_threads.html
-for more info about it.
+See the Intel documentation on `VM Service Functions <https://software.intel.com/en-us/node/521831>`_ for more information.
* get_vml_version(): Get the VML/MKL library version.
-How Numexpr can achieve such a high performance?
-================================================
-
-The main reason why Numexpr achieves better performance than NumPy (or
-even than plain C code) is that it avoids the creation of whole
-temporaries for keeping intermediate results, so saving memory
-bandwidth (the main bottleneck in many computations in nowadays
-computers). Due to this, it works best with arrays that are large
-enough (typically larger than processor caches).
-
-Briefly, it works as follows. Numexpr parses the expression into its
-own op-codes, that will be used by the integrated computing virtual
-machine. Then, the array operands are split in small chunks (that
-easily fit in the cache of the CPU) and passed to the virtual
-machine. Then, the computational phase starts, and the virtual machine
-applies the op-code operations for each chunk, saving the outcome in
-the resulting array. It is worth noting that all the temporaries and
-constants in the expression are kept in the same small chunk sizes
-than the operand ones, avoiding additional memory (and most specially,
-memory bandwidth) waste.
-
-The result is that Numexpr can get the most of your machine computing
-capabilities for array-wise computations. Just to give you an idea of
-its performance, common speed-ups with regard to NumPy are usually
-between 0.95x (for very simple expressions, like ’a + 1’) and 4x (for
-relatively complex ones, like 'a*b-4.1*a > 2.5*b'), although much
-higher speed-ups can be achieved (up to 15x can be seen in not too
-esoteric expressions) because this depends on the kind of the
-operations and how many operands participates in the expression. Of
-course, Numexpr will perform better (in comparison with NumPy) with
-larger matrices, i.e. typically those that does not fit in the cache
-of your CPU. In order to get a better idea on the different speed-ups
-that can be achieved for your own platform, you may want to run the
-benchmarks in the directory bench/.
-
-See more info about how Numexpr works in:
-
-https://github.com/pydata/numexpr/wiki
-
-
Authors
=======
@@ -303,7 +291,7 @@ See AUTHORS.txt
License
=======
-Numexpr is distributed under the MIT license (see LICENSE.txt file).
+Numexpr is distributed under the MIT license.
diff --git a/RELEASE_NOTES.rst b/RELEASE_NOTES.rst
index c376846..398d010 100644
--- a/RELEASE_NOTES.rst
+++ b/RELEASE_NOTES.rst
@@ -1,7 +1,17 @@
======================================
- Release notes for Numexpr 2.4 series
+ Release notes for Numexpr 2.5 series
======================================
+Changes from 2.4.6 to 2.5
+=========================
+
+- Added locking for allowing the use of numexpr in multi-threaded
+ callers (this does not prevent numexpr to use multiple cores
+ simultaneously). (PR #199, Antoine Pitrou, PR #200, Jenn Olsen).
+
+- Added new min() and max() functions (PR #195, CJ Carey).
+
+
Changes from 2.4.5 to 2.4.6
===========================
diff --git a/numexpr/cpuinfo.py b/numexpr/cpuinfo.py
index 962ae9b..f11cf5f 100755
--- a/numexpr/cpuinfo.py
+++ b/numexpr/cpuinfo.py
@@ -38,7 +38,7 @@ def getoutput(cmd, successful_status=(0,), stacklevel=1):
p = subprocess.Popen(cmd, stdout=subprocess.PIPE)
output, _ = p.communicate()
status = p.returncode
- except EnvironmentError, e:
+ except EnvironmentError as e:
warnings.warn(str(e), UserWarning, stacklevel=stacklevel)
return False, ''
if os.WIFEXITED(status) and os.WEXITSTATUS(status) in successful_status:
@@ -99,7 +99,7 @@ class CPUInfoBase(object):
return lambda func=self._try_call, attr=attr: func(attr)
else:
return lambda: None
- raise AttributeError, name
+ raise AttributeError(name)
def _getNCPUs(self):
return 1
@@ -128,7 +128,7 @@ class LinuxCPUInfo(CPUInfoBase):
info[0]['uname_m'] = output.strip()
try:
fo = open('/proc/cpuinfo')
- except EnvironmentError, e:
+ except EnvironmentError as e:
warnings.warn(str(e), UserWarning)
else:
for line in fo:
@@ -600,12 +600,16 @@ class Win32CPUInfo(CPUInfoBase):
# mean?
def __init__(self):
+ try:
+ import _winreg
+ except ImportError: # Python 3
+ import winreg as _winreg
+
if self.info is not None:
return
info = []
try:
#XXX: Bad style to use so long `try:...except:...`. Fix it!
- import _winreg
prgx = re.compile(r"family\s+(?P<FML>\d+)\s+model\s+(?P<MDL>\d+)" \
"\s+stepping\s+(?P<STP>\d+)", re.IGNORECASE)
@@ -636,8 +640,7 @@ class Win32CPUInfo(CPUInfoBase):
info[-1]["Model"] = int(srch.group("MDL"))
info[-1]["Stepping"] = int(srch.group("STP"))
except:
- print
- sys.exc_value, '(ignoring)'
+ print(sys.exc_value, '(ignoring)')
self.__class__.info = info
def _not_impl(self):
@@ -796,16 +799,13 @@ if __name__ == "__main__":
cpu.is_Intel()
cpu.is_Alpha()
- print
- 'CPU information:',
+ info = []
for name in dir(cpuinfo):
if name[0] == '_' and name[1] != '_':
r = getattr(cpu, name[1:])()
if r:
if r != 1:
- print
- '%s=%s' % (name[1:], r),
+ info.append('%s=%s' % (name[1:], r))
else:
- print
- name[1:],
- print
+ info.append(name[1:])
+ print('CPU information: ' + ' '.join(info))
diff --git a/numexpr/expressions.py b/numexpr/expressions.py
index 803a98e..635bfdf 100644
--- a/numexpr/expressions.py
+++ b/numexpr/expressions.py
@@ -231,22 +231,15 @@ def encode_axis(axis):
return RawNode(axis)
-def sum_func(a, axis=None):
- axis = encode_axis(axis)
- if isinstance(a, ConstantNode):
- return a
- if isinstance(a, (bool, int_, long_, float, double, complex)):
- a = ConstantNode(a)
- return FuncNode('sum', [a, axis], kind=a.astKind)
-
-
-def prod_func(a, axis=None):
- axis = encode_axis(axis)
- if isinstance(a, (bool, int_, long_, float, double, complex)):
- a = ConstantNode(a)
- if isinstance(a, ConstantNode):
- return a
- return FuncNode('prod', [a, axis], kind=a.astKind)
+def gen_reduce_axis_func(name):
+ def _func(a, axis=None):
+ axis = encode_axis(axis)
+ if isinstance(a, ConstantNode):
+ return a
+ if isinstance(a, (bool, int_, long_, float, double, complex)):
+ a = ConstantNode(a)
+ return FuncNode(name, [a, axis], kind=a.astKind)
+ return _func
@ophelper
@@ -373,8 +366,10 @@ functions = {
'complex': func(complex, 'complex'),
'conj': func(numpy.conj, 'complex'),
- 'sum': sum_func,
- 'prod': prod_func,
+ 'sum': gen_reduce_axis_func('sum'),
+ 'prod': gen_reduce_axis_func('prod'),
+ 'min': gen_reduce_axis_func('min'),
+ 'max': gen_reduce_axis_func('max'),
'contains': contains_func,
}
diff --git a/numexpr/interp_body.cpp b/numexpr/interp_body.cpp
index ec7e529..475a89f 100644
--- a/numexpr/interp_body.cpp
+++ b/numexpr/interp_body.cpp
@@ -456,6 +456,16 @@
ci_reduce = cr_reduce*c1i + ci_reduce*c1r;
cr_reduce = da);
+ case OP_MIN_IIN: VEC_ARG1(i_reduce = fmin(i_reduce, i1));
+ case OP_MIN_LLN: VEC_ARG1(l_reduce = fmin(l_reduce, l1));
+ case OP_MIN_FFN: VEC_ARG1(f_reduce = fmin(f_reduce, f1));
+ case OP_MIN_DDN: VEC_ARG1(d_reduce = fmin(d_reduce, d1));
+
+ case OP_MAX_IIN: VEC_ARG1(i_reduce = fmax(i_reduce, i1));
+ case OP_MAX_LLN: VEC_ARG1(l_reduce = fmax(l_reduce, l1));
+ case OP_MAX_FFN: VEC_ARG1(f_reduce = fmax(f_reduce, f1));
+ case OP_MAX_DDN: VEC_ARG1(d_reduce = fmax(d_reduce, d1));
+
default:
*pc_error = pc;
return -3;
diff --git a/numexpr/interpreter.cpp b/numexpr/interpreter.cpp
index 4d1576e..b622e7e 100644
--- a/numexpr/interpreter.cpp
+++ b/numexpr/interpreter.cpp
@@ -19,6 +19,13 @@
#include "interpreter.hpp"
#include "numexpr_object.hpp"
+#ifdef _MSC_VER
+/* Some missing symbols and functions for Win */
+#define fmax max
+#define fmin min
+#define INFINITY (DBL_MAX+DBL_MAX)
+#define NAN (INFINITY-INFINITY)
+#endif
#ifndef SIZE_MAX
#define SIZE_MAX ((size_t)-1)
@@ -46,7 +53,6 @@
#endif
-
using namespace std;
// Global state
@@ -691,13 +697,19 @@ vm_engine_iter_parallel(NpyIter *iter, const vm_params& params,
bool need_output_buffering, int *pc_error,
char **errmsg)
{
- int i;
+ int i, ret = -1;
npy_intp numblocks, taskfactor;
if (errmsg == NULL) {
return -1;
}
+ /* Ensure only one parallel job is running at a time (otherwise
+ the global th_params get corrupted). */
+ Py_BEGIN_ALLOW_THREADS;
+ pthread_mutex_lock(&gs.parallel_mutex);
+ Py_END_ALLOW_THREADS;
+
/* Populate parameters for worker threads */
NpyIter_GetIterIndexRange(iter, &th_params.start, &th_params.vlen);
/*
@@ -723,7 +735,7 @@ vm_engine_iter_parallel(NpyIter *iter, const vm_params& params,
for (; i > 0; --i) {
NpyIter_Deallocate(th_params.iter[i]);
}
- return -1;
+ goto end;
}
}
th_params.memsteps[0] = params.memsteps;
@@ -739,7 +751,7 @@ vm_engine_iter_parallel(NpyIter *iter, const vm_params& params,
for (i = 0; i < gs.nthreads; ++i) {
NpyIter_Deallocate(th_params.iter[i]);
}
- return -1;
+ goto end;
}
memcpy(th_params.memsteps[i], th_params.memsteps[0],
sizeof(npy_intp) *
@@ -778,7 +790,11 @@ vm_engine_iter_parallel(NpyIter *iter, const vm_params& params,
PyMem_Del(th_params.memsteps[i]);
}
- return th_params.ret_code;
+ ret = th_params.ret_code;
+
+end:
+ pthread_mutex_unlock(&gs.parallel_mutex);
+ return ret;
}
static int
@@ -1362,16 +1378,26 @@ NumExpr_run(NumExprObject *self, PyObject *args, PyObject *kwds)
/* Initialize the output to the reduction unit */
if (is_reduction) {
PyArrayObject *a = NpyIter_GetOperandArray(iter)[0];
- if (last_opcode(self->program) >= OP_SUM &&
- last_opcode(self->program) < OP_PROD) {
- PyObject *zero = PyLong_FromLong(0);
- PyArray_FillWithScalar(a, zero);
- Py_DECREF(zero);
+ PyObject *fill;
+ int op = last_opcode(self->program);
+ if (op < OP_PROD) {
+ /* sum identity is 0 */
+ fill = PyLong_FromLong(0);
+ } else if (op >= OP_PROD && op < OP_MIN) {
+ /* product identity is 1 */
+ fill = PyLong_FromLong(1);
+ } else if (PyArray_DESCR(a)->kind == 'f') {
+ /* floating point min/max identity is NaN */
+ fill = PyFloat_FromDouble(NAN);
+ } else if (op >= OP_MIN && op < OP_MAX) {
+ /* integer min identity */
+ fill = PyLong_FromLong(LONG_MAX);
} else {
- PyObject *one = PyLong_FromLong(1);
- PyArray_FillWithScalar(a, one);
- Py_DECREF(one);
+ /* integer max identity */
+ fill = PyLong_FromLong(LONG_MIN);
}
+ PyArray_FillWithScalar(a, fill);
+ Py_DECREF(fill);
}
/* Get the sizes of all the operands */
diff --git a/numexpr/module.cpp b/numexpr/module.cpp
index af9ce34..25a371d 100644
--- a/numexpr/module.cpp
+++ b/numexpr/module.cpp
@@ -187,6 +187,7 @@ int init_threads(void)
/* Initialize mutex and condition variable objects */
pthread_mutex_init(&gs.count_mutex, NULL);
+ pthread_mutex_init(&gs.parallel_mutex, NULL);
/* Barrier initialization */
pthread_mutex_init(&gs.count_threads_mutex, NULL);
diff --git a/numexpr/module.hpp b/numexpr/module.hpp
index 0234e12..b5397ea 100644
--- a/numexpr/module.hpp
+++ b/numexpr/module.hpp
@@ -27,12 +27,15 @@ struct global_state {
int force_serial; /* force serial code instead of parallel? */
int pid; /* the PID for this process */
- /* Syncronization variables */
+ /* Synchronization variables for threadpool state */
pthread_mutex_t count_mutex;
int count_threads;
pthread_mutex_t count_threads_mutex;
pthread_cond_t count_threads_cv;
+ /* Mutual exclusion for access to global thread params (th_params) */
+ pthread_mutex_t parallel_mutex;
+
global_state() {
nthreads = 1;
init_threads_done = 0;
diff --git a/numexpr/necompiler.py b/numexpr/necompiler.py
index ee11aec..89716e8 100644
--- a/numexpr/necompiler.py
+++ b/numexpr/necompiler.py
@@ -11,6 +11,7 @@
import __future__
import sys
import numpy
+import threading
from numexpr import interpreter, expressions, use_vml, is_cpu_amd_intel
from numexpr.utils import CacheDict
@@ -261,7 +262,8 @@ def stringToExpression(s, types, context):
def isReduction(ast):
- return ast.value.startswith(b'sum_') or ast.value.startswith(b'prod_')
+ prefixes = (b'sum_', b'prod_', b'min_', b'max_')
+ return any(ast.value.startswith(p) for p in prefixes)
def getInputOrder(ast, input_order=None):
@@ -684,6 +686,7 @@ def getExprNames(text, context):
_names_cache = CacheDict(256)
_numexpr_cache = CacheDict(256)
+evaluate_lock = threading.Lock()
def evaluate(ex, local_dict=None, global_dict=None,
out=None, order='K', casting='safe', **kwargs):
@@ -729,39 +732,40 @@ def evaluate(ex, local_dict=None, global_dict=None,
like float64 to float32, are allowed.
* 'unsafe' means any data conversions may be done.
"""
- if not isinstance(ex, (str, unicode)):
- raise ValueError("must specify expression as a string")
- # Get the names for this expression
- context = getContext(kwargs, frame_depth=1)
- expr_key = (ex, tuple(sorted(context.items())))
- if expr_key not in _names_cache:
- _names_cache[expr_key] = getExprNames(ex, context)
- names, ex_uses_vml = _names_cache[expr_key]
- # Get the arguments based on the names.
- call_frame = sys._getframe(1)
- if local_dict is None:
- local_dict = call_frame.f_locals
- if global_dict is None:
- global_dict = call_frame.f_globals
-
- arguments = []
- for name in names:
+ with evaluate_lock:
+ if not isinstance(ex, (str, unicode)):
+ raise ValueError("must specify expression as a string")
+ # Get the names for this expression
+ context = getContext(kwargs, frame_depth=1)
+ expr_key = (ex, tuple(sorted(context.items())))
+ if expr_key not in _names_cache:
+ _names_cache[expr_key] = getExprNames(ex, context)
+ names, ex_uses_vml = _names_cache[expr_key]
+ # Get the arguments based on the names.
+ call_frame = sys._getframe(1)
+ if local_dict is None:
+ local_dict = call_frame.f_locals
+ if global_dict is None:
+ global_dict = call_frame.f_globals
+
+ arguments = []
+ for name in names:
+ try:
+ a = local_dict[name]
+ except KeyError:
+ a = global_dict[name]
+ arguments.append(numpy.asarray(a))
+
+ # Create a signature
+ signature = [(name, getType(arg)) for (name, arg) in zip(names, arguments)]
+
+ # Look up numexpr if possible.
+ numexpr_key = expr_key + (tuple(signature),)
try:
- a = local_dict[name]
+ compiled_ex = _numexpr_cache[numexpr_key]
except KeyError:
- a = global_dict[name]
- arguments.append(numpy.asarray(a))
-
- # Create a signature
- signature = [(name, getType(arg)) for (name, arg) in zip(names, arguments)]
-
- # Look up numexpr if possible.
- numexpr_key = expr_key + (tuple(signature),)
- try:
- compiled_ex = _numexpr_cache[numexpr_key]
- except KeyError:
- compiled_ex = _numexpr_cache[numexpr_key] = \
- NumExpr(ex, signature, **context)
- kwargs = {'out': out, 'order': order, 'casting': casting,
- 'ex_uses_vml': ex_uses_vml}
- return compiled_ex(*arguments, **kwargs)
+ compiled_ex = _numexpr_cache[numexpr_key] = \
+ NumExpr(ex, signature, **context)
+ kwargs = {'out': out, 'order': order, 'casting': casting,
+ 'ex_uses_vml': ex_uses_vml}
+ return compiled_ex(*arguments, **kwargs)
diff --git a/numexpr/opcodes.hpp b/numexpr/opcodes.hpp
index 6d02459..086c98e 100644
--- a/numexpr/opcodes.hpp
+++ b/numexpr/opcodes.hpp
@@ -150,19 +150,30 @@ OPCODE(106, OP_REDUCTION, NULL, T0, T0, T0, T0)
/* Last argument in a reduction is the axis of the array the
reduction should be applied along. */
-OPCODE(107, OP_SUM, NULL, T0, T0, T0, T0)
-OPCODE(108, OP_SUM_IIN, "sum_iin", Ti, Ti, Tn, T0)
-OPCODE(109, OP_SUM_LLN, "sum_lln", Tl, Tl, Tn, T0)
-OPCODE(110, OP_SUM_FFN, "sum_ffn", Tf, Tf, Tn, T0)
-OPCODE(111, OP_SUM_DDN, "sum_ddn", Td, Td, Tn, T0)
-OPCODE(112, OP_SUM_CCN, "sum_ccn", Tc, Tc, Tn, T0)
-
-OPCODE(113, OP_PROD, NULL, T0, T0, T0, T0)
-OPCODE(114, OP_PROD_IIN, "prod_iin", Ti, Ti, Tn, T0)
-OPCODE(115, OP_PROD_LLN, "prod_lln", Tl, Tl, Tn, T0)
-OPCODE(116, OP_PROD_FFN, "prod_ffn", Tf, Tf, Tn, T0)
-OPCODE(117, OP_PROD_DDN, "prod_ddn", Td, Td, Tn, T0)
-OPCODE(118, OP_PROD_CCN, "prod_ccn", Tc, Tc, Tn, T0)
+OPCODE(107, OP_SUM_IIN, "sum_iin", Ti, Ti, Tn, T0)
+OPCODE(108, OP_SUM_LLN, "sum_lln", Tl, Tl, Tn, T0)
+OPCODE(109, OP_SUM_FFN, "sum_ffn", Tf, Tf, Tn, T0)
+OPCODE(110, OP_SUM_DDN, "sum_ddn", Td, Td, Tn, T0)
+OPCODE(111, OP_SUM_CCN, "sum_ccn", Tc, Tc, Tn, T0)
+
+OPCODE(112, OP_PROD, NULL, T0, T0, T0, T0)
+OPCODE(113, OP_PROD_IIN, "prod_iin", Ti, Ti, Tn, T0)
+OPCODE(114, OP_PROD_LLN, "prod_lln", Tl, Tl, Tn, T0)
+OPCODE(115, OP_PROD_FFN, "prod_ffn", Tf, Tf, Tn, T0)
+OPCODE(116, OP_PROD_DDN, "prod_ddn", Td, Td, Tn, T0)
+OPCODE(117, OP_PROD_CCN, "prod_ccn", Tc, Tc, Tn, T0)
+
+OPCODE(118, OP_MIN, NULL, T0, T0, T0, T0)
+OPCODE(119, OP_MIN_IIN, "min_iin", Ti, Ti, Tn, T0)
+OPCODE(120, OP_MIN_LLN, "min_lln", Tl, Tl, Tn, T0)
+OPCODE(121, OP_MIN_FFN, "min_ffn", Tf, Tf, Tn, T0)
+OPCODE(122, OP_MIN_DDN, "min_ddn", Td, Td, Tn, T0)
+
+OPCODE(123, OP_MAX, NULL, T0, T0, T0, T0)
+OPCODE(124, OP_MAX_IIN, "max_iin", Ti, Ti, Tn, T0)
+OPCODE(125, OP_MAX_LLN, "max_lln", Tl, Tl, Tn, T0)
+OPCODE(126, OP_MAX_FFN, "max_ffn", Tf, Tf, Tn, T0)
+OPCODE(127, OP_MAX_DDN, "max_ddn", Td, Td, Tn, T0)
/* Should be the last opcode */
-OPCODE(119, OP_END, NULL, T0, T0, T0, T0)
+OPCODE(128, OP_END, NULL, T0, T0, T0, T0)
diff --git a/numexpr/tests/test_numexpr.py b/numexpr/tests/test_numexpr.py
index 59fd19c..a971a95 100644
--- a/numexpr/tests/test_numexpr.py
+++ b/numexpr/tests/test_numexpr.py
@@ -93,21 +93,29 @@ class test_numexpr(TestCase):
(b'add_ddd', b't3', b't3', b'c2[2.0]'),
(b'prod_ddn', b'r0', b't3', 2)])
# Check that full reductions work.
- x = zeros(1e5) + .01 # checks issue #41
+ x = zeros(100000) + .01 # checks issue #41
assert_allclose(evaluate("sum(x+2,axis=None)"), sum(x + 2, axis=None))
assert_allclose(evaluate("sum(x+2,axis=0)"), sum(x + 2, axis=0))
assert_allclose(evaluate("prod(x,axis=0)"), prod(x, axis=0))
+ assert_allclose(evaluate("min(x)"), np.min(x))
+ assert_allclose(evaluate("max(x,axis=0)"), np.max(x, axis=0))
x = arange(10.0)
assert_allclose(evaluate("sum(x**2+2,axis=0)"), sum(x ** 2 + 2, axis=0))
assert_allclose(evaluate("prod(x**2+2,axis=0)"), prod(x ** 2 + 2, axis=0))
+ assert_allclose(evaluate("min(x**2+2,axis=0)"), np.min(x ** 2 + 2, axis=0))
+ assert_allclose(evaluate("max(x**2+2,axis=0)"), np.max(x ** 2 + 2, axis=0))
x = arange(100.0)
assert_allclose(evaluate("sum(x**2+2,axis=0)"), sum(x ** 2 + 2, axis=0))
assert_allclose(evaluate("prod(x-1,axis=0)"), prod(x - 1, axis=0))
+ assert_allclose(evaluate("min(x-1,axis=0)"), np.min(x - 1, axis=0))
+ assert_allclose(evaluate("max(x-1,axis=0)"), np.max(x - 1, axis=0))
x = linspace(0.1, 1.0, 2000)
assert_allclose(evaluate("sum(x**2+2,axis=0)"), sum(x ** 2 + 2, axis=0))
assert_allclose(evaluate("prod(x-1,axis=0)"), prod(x - 1, axis=0))
+ assert_allclose(evaluate("min(x-1,axis=0)"), np.min(x - 1, axis=0))
+ assert_allclose(evaluate("max(x-1,axis=0)"), np.max(x - 1, axis=0))
# Check that reductions along an axis work
y = arange(9.0).reshape(3, 3)
@@ -117,15 +125,25 @@ class test_numexpr(TestCase):
assert_allclose(evaluate("prod(y**2, axis=1)"), prod(y ** 2, axis=1))
assert_allclose(evaluate("prod(y**2, axis=0)"), prod(y ** 2, axis=0))
assert_allclose(evaluate("prod(y**2, axis=None)"), prod(y ** 2, axis=None))
+ assert_allclose(evaluate("min(y**2, axis=1)"), np.min(y ** 2, axis=1))
+ assert_allclose(evaluate("min(y**2, axis=0)"), np.min(y ** 2, axis=0))
+ assert_allclose(evaluate("min(y**2, axis=None)"), np.min(y ** 2, axis=None))
+ assert_allclose(evaluate("max(y**2, axis=1)"), np.max(y ** 2, axis=1))
+ assert_allclose(evaluate("max(y**2, axis=0)"), np.max(y ** 2, axis=0))
+ assert_allclose(evaluate("max(y**2, axis=None)"), np.max(y ** 2, axis=None))
# Check integers
x = arange(10.)
x = x.astype(int)
assert_allclose(evaluate("sum(x**2+2,axis=0)"), sum(x ** 2 + 2, axis=0))
assert_allclose(evaluate("prod(x**2+2,axis=0)"), prod(x ** 2 + 2, axis=0))
+ assert_allclose(evaluate("min(x**2+2,axis=0)"), np.min(x ** 2 + 2, axis=0))
+ assert_allclose(evaluate("max(x**2+2,axis=0)"), np.max(x ** 2 + 2, axis=0))
# Check longs
x = x.astype(long)
assert_allclose(evaluate("sum(x**2+2,axis=0)"), sum(x ** 2 + 2, axis=0))
assert_allclose(evaluate("prod(x**2+2,axis=0)"), prod(x ** 2 + 2, axis=0))
+ assert_allclose(evaluate("min(x**2+2,axis=0)"), np.min(x ** 2 + 2, axis=0))
+ assert_allclose(evaluate("max(x**2+2,axis=0)"), np.max(x ** 2 + 2, axis=0))
# Check complex
x = x + .1j
assert_allclose(evaluate("sum(x**2+2,axis=0)"), sum(x ** 2 + 2, axis=0))
@@ -841,6 +859,7 @@ class test_threading_config(TestCase):
# Case test for threads
class test_threading(TestCase):
+
def test_thread(self):
import threading
@@ -851,6 +870,25 @@ class test_threading(TestCase):
test = ThreadTest()
test.start()
+ test.join()
+
+ def test_multithread(self):
+ import threading
+
+ # Running evaluate() from multiple threads shouldn't crash
+ def work(n):
+ a = arange(n)
+ evaluate('a+a')
+
+ work(10) # warm compilation cache
+
+ nthreads = 30
+ threads = [threading.Thread(target=work, args=(1e5,))
+ for i in range(nthreads)]
+ for t in threads:
+ t.start()
+ for t in threads:
+ t.join()
# The worker function for the subprocess (needs to be here because Windows
diff --git a/numexpr/version.py b/numexpr/version.py
index 400d234..4393ee8 100644
--- a/numexpr/version.py
+++ b/numexpr/version.py
@@ -8,4 +8,4 @@
# rights to use.
####################################################################
-version = '2.4.6'
+version = '2.5'
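Note on the min/max fill values: the interpreter.cpp hunk above seeds the reduction output with NaN for floating-point min/max and with LONG_MAX or LONG_MIN for integers, and the interp_body.cpp hunk folds each element in with fmin/fmax. NaN can serve as a starting value because C's fmin and fmax return the other argument when exactly one argument is NaN. The sketch below emulates that behaviour in plain Python. The helpers `fmin`, `fmax` and `reduce_chunks` are hypothetical, the chunk size is arbitrary, and `sys.maxsize` is only a stand-in for LONG_MAX.

```python
import math
import sys


def fmin(a, b):
    """Emulate C fmin: if exactly one argument is NaN, return the other."""
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return a if a < b else b


def fmax(a, b):
    """Emulate C fmax: if exactly one argument is NaN, return the other."""
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return a if a > b else b


def reduce_chunks(values, op, identity, chunk=4):
    """Fold `op` over `values` chunk by chunk, starting from `identity`."""
    acc = identity
    for start in range(0, len(values), chunk):
        for v in values[start:start + chunk]:
            acc = op(acc, v)
    return acc


floats = [3.0, 1.5, 2.0, 7.25, 0.5]
lowest = reduce_chunks(floats, fmin, float("nan"))
highest = reduce_chunks(floats, fmax, float("nan"))

ints = [9, 4, 6]
int_lowest = reduce_chunks(ints, min, sys.maxsize)       # stand-in for LONG_MAX
int_highest = reduce_chunks(ints, max, -sys.maxsize - 1)  # stand-in for LONG_MIN
```

One consequence of this emulated rule is that NaN values in the input are skipped rather than propagated, which differs from numpy.min and numpy.max. Whether the compiled code behaves the same way on every platform is not established by this diff, in particular where the MSVC branch maps fmin and fmax to min and max.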
--
Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/debian-science/packages/numexpr.git