[arrayfire] annotated tag v3.0beta created (now be64535)

Fri May 22 10:19:48 UTC 2015

This is an automated email from the git hooks/post-receive script.

ghisvail-guest pushed a change to annotated tag v3.0beta
in repository arrayfire.

        at  be64535   (tag)
   tagging  a4a26ce4eabe29a84e3c81a456d044854798c5d0 (commit)
 tagged by  Pavan Yalamanchili
        on  Mon Mar 16 02:30:07 2015 -0400

- Log -----------------------------------------------------------------
Beta release of 3.0

Brian Kloppenborg (48):
      CMakeLists.txt does not need to be executable.
      Add definition and directive to fix BOOST_INLINE not being defined on nvcc / CUDA < 6.5. Move FIND command for CUDA into backend/cuda/CMakeLists.txt
      Restore default compile state.
      Merge branch 'cuda_6_0_compile_fix'
      Use ArrayFire's CUDA_VERSION instead of CMake-specific detection.
      Add install steps.
      Move macro to the top of the file.
      Added description, renamed Requirements to Prerequisites, put clBLAS and clFFT subheadings as h4 under OpenCL backend, point clFFT/clBLAS to ArrayFire fork, formatting.
      Use standard package handling, search in both system and local paths for clBLAS, add CLBLAS_ROOT_DIR hint.
      Add install steps for CUDA and OpenCL. Add workaround for clFFT and clBLAS installers.
      Move OpenCL into backend CMakeLists.txt
      Prefer system libraries.
      Leave CUDA and OpenCL off by default.
      Adding install instructions for tests and examples.
      Only install .h and .hpp files. Exclude the .gitignore file from the installation step.
      Include .hpp files.
      Remove installation of tests and examples.
      Add CMake find script from arrayfire_benchmark.
      Fixes #107
      - Add blas as CPU backend dependency
      Fixed merge conflicts.
      Strip whitespace from OpenCL device information.
      Merge pull request #161 from easuter/devel
      Add FindSubversion.cmake to ensure Subversion_SVN_EXECUTABLE is set.
      Quote user-supplied paths. Search CMAKE_INSTALL_PREFIX last to prefer local installs.
      Fix signed to unsigned comparison warning.
      Fix signed vs. unsigned integer comparison.
      Bugfix for signed vs. unsigned comparison error.
      Fix no return from non-void function.
      Fix out-of-bounds memory access in array-based indexing.
      Fix pedantic compiler warning.
      Add function to test/find first non-zero dimension
      Return user-specified dimension.
      Install ArrayFire version file (version.h)
      Fix documentation source directory for install
      Merge with upstream, fix conflict.
      Merge pull request #470 from glehmann/cmake-config
      Remove execution bit.
      Create .tar.gz package for libraries and documentation using CPack
      Make example CMakeList standalone
      Merge branch 'devel' into cmake_packaging
      Fix missing asset definition.
      Package examples
      Fix incorrect reference to ArrayFire libraries from FIND script.
      Restore example naming convention and output directories.
      Add missing includes for stand-alone compilation of examples.
      Add copyright to header.
      Install example assets along with examples.

Casey Goodlett (1):
      Fix googletest build on other cmake build types

Gaëtan Lehmann (4):
      fix gtest build with ninja and simplify gtest external project
      fix gtest byproducts
      install arrayfire cmake configuration and version files
      display a warning when the assets can't be found in the source dir

Gallagher Pryor (2):
      fix for building gtest on systems w/ svn < v1.8
      Merge branch 'test' into 'master'

John Melonakos (1):
      Adding the license info for all source files in ArrayFire

Kumar Aatish (1):
      Changed OpenCL library search path order

Kyle Lutz (3):
      Add setUnique()/setUnion()/setIntersection() for OpenCL
      Use OpenCL error codes from Boost.Compute
      Reduce Boost.Compute header includes

Miguel Lloreda (1):
      fixed formatting

Nathan Jackson (4):
      Added mean calculation support for complex types.
      Added tests for computing the mean of complex values.
      Moved division function into math utilities.  Fixed mean function.
      Added variance interface and var_all implementation.

Pavan Yalamanchili (649):
      Renaming the helper files
      API changes for data transfer functions
      Create context using DEFAULT instead of CPU for OpenCL backend
      Data type changes to make the backends self contained
      Merge branch 'testHelper' into 'master'
      BUG fix: Fixing CalcBaseStride for greater than 2 dimensions
      Merge branch 'cpu_diff1' into 'master'
      Merge branch 'moddims'
      Fixing build errors from diff branch
      Fixing warnings from the new tests
      Cleanup of src/backend
      Merge branch 'transpose' into 'master'
      Fixing warnings in transpose tests
      Merge branch 'random'
      Fixing constructors for CUDA and OpenCL
      Cleaning up CMAKE files to automatically pick up source files
      CMAKE Fix: Explicitly state source extensions
      Fixing the formula for baseoffset.
      FEAT: Reductions for CPU added
      Updating the formula to work with negative and strided offsets
      Cleaning up diff kernels to have unified functions
      Merge branch 'diff_cuda' into 'master'
      Merge branch 'cuda_trs' into 'master'
      Merge branch 'consistant_headers'
      Changed the headers from earlier merges to be consistent as well
      Merge branch 'reduce'
      Merge branch 'rand'
      Adding explicit methods to modify ArrayInfo
      Removed all references to af_array inside src/backend/*/.
      Make all Array<T> constructors private
      FEAT: randu and randn for CUDA implemented
      Merge branch 'ocl_fixes' into 'master'
      Changing loop iterators to proper type
      A better way to handle the template specialization of size_t
      Merge remote-tracking branch 'origin/diff_opencl'
      BUG Fix: calcStride was accessing out of bounds.
      Fixing the bugfix to calcStrides
      Separate out reduce and transform functors
      Merge remote-tracking branch 'origin/ocl_transpose'
      Using dim_type in diff for opencl instead of size_t
      Removing trailing whitespaces
      Getting rid of .cu files in src/backend/cuda/kernel/
      Remove the unnecessary template instantiations
      Minor changes post merge to imageio.cpp
      Merge remote-tracking branch 'origin/cpu_histogram'
      Fixes for tests to compile properly by adding std:: prefix
      Style change to fix the compiler warnings on gcc 4.9
      Indexing support for CUDA backend
      Indexing support for OpenCL
      Removing unnecessary print from test/index.cpp
      Enabling diff tests and minor fix to work with indexing
      Enabling transpose tests and fixes to make transpose pass the tests
      Merge remote-tracking branch 'origin/resize'
      Merge remote-tracking branch 'origin/cpu_morph'
      Merge remote-tracking branch 'origin/cpu_bilateral'
      Changing variable name to be consistent with the rest of the file
      Moving the functions in cuda/complex.hpp to global namespace
      Changing ops.hpp and */backend.hpp to work nicely with NVCC
      Removing unnecessary include file
      Reductions for CUDA backend
      Accum implementation for CPU backend
      Merge remote-tracking branch 'origin/info_helpers'
      Merge remote-tracking branch 'origin/cuda_morph'
      Bug fix to random number generation in CUDA
      Adding random number generation support to OpenCL backend
      Merge remote-tracking branch 'origin/ocl_morph'
      Merge remote-tracking branch 'origin/transform'
      Merge remote-tracking branch 'origin/blas'
      Merge remote-tracking branch 'origin/cuda_bilateral'
      Merge remote-tracking branch 'origin/ocl_bilateral'
      Merge branch 'ocl_morph_opt' into 'master'
      Merge remote-tracking branch 'origin/cuda_histogram'
      Merge remote-tracking branch 'origin/approx'
      Merge remote-tracking branch 'origin/ocl_histogram'
      Merge remote-tracking branch 'origin/random'
      Changing the location of the data repository
      Added Param and CParam structs that can be passed to the GPU
      Renaming helper functions and functors
      Unified print function for all backends
      Cleaning up header files
      Merge remote-tracking branch 'origin/header'
      Merge remote-tracking branch 'origin/master' into unify
      Merge remote-tracking branch 'origin/intel_histfix'
      Adding Param<T> to the remaining functions in CUDA backend
      Merge remote-tracking branch 'origin/unify'
      Merge branch 'ocl_dselector' into 'master'
      Adding a missing std:: in opencl/platform.cpp
      Adding __CL_ENABLE_EXCEPTIONS to the build process
      Merge branch 'origin/ocl_kernel_caching'
      Merge remote-tracking branch 'origin/tile'
      Add caching support for tile in OpenCL
      Fixing copy paste error
      Introduced a common struct and build function for OpenCL kernels
      Changing beta == 0 instead of memsetting C to 0 in gemm
      Merge branch 'origin/blas_fix'
      Merge remote-tracking branch 'origin/unify'
      extended CATCHALL to include Type and Support errors
      AfError now supports line numbers and user specified af_errs
      Added *_NOT_SUPPORTED macros for each backend
      Added macro CUDA_CHECK that checks for cudaError and throws AfError
      change cldebug to debug_opencl
      Added POST_LAUNCH_CHECK to CUDA backend
      Added new error type --> ArgumentError
      Changed backend/reduce.cpp to include the new error mechanisms
      Changed backend/diff.cpp to use new error checks
      Changed backend/morph.cpp to use new error checks
      Changed SHOW_CL_ERROR() to CL_TO_AF_ERROR() in opencl backend
      Fixing a minor bug for ArgumentError
      Fixing the dimension checks for backend/morph.cpp
      Fixing the morph tests to check for correct errors
      Moving ARG_ASSERT to within try catch blocks
      Merge remote-tracking branch 'origin/error'
      Cleaning up a couple of lines
      buildProgram now accepts multiple source files
      added iscplx to backend/opencl/traits.hpp
      Reductions backend for OpenCL
      Merge branch 'clreduce' into 'master'
      Merge remote-tracking branch 'origin/cuda_device'
      Merge remote-tracking branch 'origin/matmul_fixes'
      Merge remote-tracking branch 'origin/reorder'
      Fixing issues with CUDA reductions
      Fixing typo in scan tests
      Cleaning up reductions code a bit more
      Scan algorithm for CUDA implemented
      Change to make sure autogenerated string headers are only included once
      Style clean up of OpenCL reduction code
      Scan algorithm for OpenCL backend
      Merge remote-tracking branch 'origin/shift'
      Merge remote-tracking branch 'origin/scan'
      Merge remote-tracking branch 'origin/gradient'
      Merge remote-tracking branch 'origin/cpu_medfilt'
      Updating gitignore to include unwanted emacs files
      Merge remote-tracking branch 'origin/cuda_medfilt'
      Merge remote-tracking branch 'origin/ocl_medfilt'
      Bug fix to gradient in CUDA and OpenCL backends
      Merge remote-tracking branch 'origin/intel_scan_fix'
      Cleaning up buggy strided dimensions in scan for CUDA and OpenCL
      Merge branch 'scan_bugfix'
      Launch configuration fix for AMD GPUs
      Merge remote-tracking branch 'origin/cpu_fft'
      Cleaning up backend/scan.cpp to include proper error checks
      Adding support for where for CPU backend
      Fixing corner cases in scan algorithm for CUDA and OpenCL backends
      Exceptions now display file names instead of function names
      Added a new function to create Array<T> from Param<T>
      Where implemented for CUDA backend
      Making the double buffering in OpenCL backend more explicit
      Change scan tests to run on OpenCL devices available on the system
      Adding support to create Array<T> from Param in OpenCL backend
      Changing where in CUDA backend to pass out by reference
      Style changes to OpenCL scan function
      Tentative support for where in OpenCL backend
      Merge remote-tracking branch 'origin/fft'
      Modifying FindclFFT.cmake to look in clFFT build directory
      Change required to suppress comparison warnings
      Removing unnecessary variadic templates
      Removing "static" from template specializations
      Changes to cuda and OpenCL backends to improve parallel compiles
      Merge remote-tracking branch 'origin/compile'
      Making sure ret in imageio is initialized before returning
      Adding dl libs explicitly to the OpenCL backend
      Merge branch 'cpu_where'
      Fix header locations to fix compilation in debug mode
      Merge remote-tracking branch 'origin/meanshift'
      Merge remote-tracking branch 'origin/pad'
      Merge remote-tracking branch 'origin/master' into where
      BUG: Fixed boundary checks for scan_first in CUDA and OpenCL
      Passing Params as references to where_* in OpenCL backend
      Merge remote-tracking branch 'origin/where'
      Added simple JIT kernel generation for OpenCL backend
      Updated the OpenCL backend to have simpler kernel name generation
      Kernel compilation and Launching added for OpenCL JIT backend
      Reorganizing files in src/backend
      Cleaning up jit.cpp
      Adding logical functions to the external API
      Adding the last few binary functions
      Adding cast function to OpenCL JIT backend
      Unary functions added to OpenCL JIT backend
      Adding new binary functions to OpenCL JIT backend
      Adding support for ScalarNodes in OpenCL JIT backend
      ndims() now returns at least 1 instead of 0 from before.
      Added proper error checking to af_print
      Fixing the API of af_cast
      Adding cache support for OpenCL JIT kernels
      Merge branch 'bug_fixes' into 'master'
      Changing the implicit cast behavior to mimic c/c++
      Adding CUDA/CPU_NOT_SUPPORTED macros for elementary operations
      JIT kernel generation support for OpenCL backend
      Removing unnecessary member variables from BufferNode
      BUG fix in JIT kernel generation in OpenCL backend
      Merge branch 'jit'
      Change required to make blas compile on centos 6
      Changing the cpu blas to depend on CBLAS instead of blas
      Merge remote-tracking branch 'origin/blas_fix'
      Fixes for OpenCL backend for gcc 4.7.2
      Updating the README.md
      Merge remote-tracking branch 'origin/ocl_fix'
      Adding FindMKL.cmake to ArrayFire repo
      Merge branch 'rotate_fix' into 'master'
      Merge branch 'fft_fix' into 'master'
      Bug fix in specializations for max<cfloat> and max<cdouble>
      Enabling double precision support for JIT kernels
      Fixing typo in opencl/jit.cpp
      Fixing the initial value for max on complex numbers
      Merge branch 'sort' into 'master'
      Merge branch 'warning-fixes' into 'master'
      Merge branch 'random_fix' into 'master'
      Merge branch 'platform_fixes' into 'master'
      Update README.md to have better formatting.
      Merge branch 'conv' into 'master'
      Stripping end of line characters from README.md
      Removing unnecessary line from cuda/CMakeLists.txt
      First draft of CUDA JIT
      Added FindNVVM.cmake
      Adding libcuda as a dependency
      Changes to make nvvm code to compile and execute
      Making the child nodes decide the types when calling functions
      Removing untracked folder from the repository
      Removing untracked folder from the repository
      Adding support for CAST and COMPLEX operations in CUDA backend
      Adding back tests in `basic.cpp` for CUDA backend
      Merge branch 'sort_split' into 'master'
      Merge branch 'bilateral_fixes' into 'master'
      Merge branch 'compute_cmake_fix' into 'master'
      Merge branch 'subref_assign' into 'master'
      Merge remote-tracking branch 'origin/jit'
      Element wise support for CPU backend
      Merge branch 'header-files' into 'master'
      Merge remote-tracking branch 'origin/TNJ'
      CPU backend now uses std::shared_ptr for holding data
      Using boost::shared_ptr for reference counting in CUDA backend
      Renaming files in CUDA backend
      Adding support for weak copy in src/backend/*.cpp
      Merge branch 'ocl_cmake_changes' into 'master'
      Merge branch 'ref'
      Bug fix for OpenCL backend when creating empty Arrays
      Adding basic functions to the C++ API
      Merge branch 'cuda_limit' into 'master'
      BUG_FIX: bin2cpp now adds NULL character towards the end of string
      bin2cpp now adds newline for CUDA but does not for OpenCL
      Merge branch 'regions' into 'master'
      Adding the license file to the repo
      Merge remote-tracking branch 'origin/master'
      Updating README.md to include clone command and fftw dependency
      Updating the arrayfire_data repo URL
      Updating README.md
      Merge pull request #25 from arrayfire/cpp_tests
      Merge pull request #28 from arrayfire/nan_inf_fix
      Merge pull request #33 from arrayfire/sort_cpp
      Merge pull request #31 from arrayfire/cmake_cuda_compute
      Merge pull request #35 from arrayfire/api_changes
      Merge data.h and reduce.h into algorithm.h
      Moving constant, randu and randn into af/data.h
      Moving approx1 and approx2 to af/signal.h
      Merge pull request #38 from arrayfire/header
      Moving important utility functions from data.cpp to handle.hpp
      Cleaning up the sort functions
      Adding set functions for the CPU backend
      Merge pull request #42 from kylelutz/opencl-set
      Fixing the iterators for union and intersect in OpenCL backend
      Adding set operations for CUDA backend
      Reducing the memory footprint for set_intersect
      Style changes
      Merge pull request #43 from arrayfire/set
      Merge pull request #48 from arrayfire/seq
      Making arrayfire_data a submodule
      Scoping out unimplemented code
      Adding af_eval() and array::eval()
      Adding af_get_device and af_sync to all backends
      Merge pull request #56 from arrayfire/data
      Merge pull request #58 from arrayfire/eval
      Merge pull request #59 from arrayfire/win_fixes
      Merge pull request #64 from arrayfire/timer
      Merge pull request #65 from 9prady9/additional_api
      Fixing a bug with +=, -=, *=, /=
      BUG fix: evaluate array before assignment operator
      Cleaning up moddims
      Changing enum so it does not clash with functions
      Merge pull request #66 from pavanky/misc
      Update CONTRIBUTING.md
      Merge pull request #69 from arrayfire/devel
      Merge pull request #72 from kylelutz/opencl-error-codes
      Merge pull request #74 from bkloppenborg/cmake_install
      Merge pull request #76 from arrayfire/minor-fixes
      Fixing compilation warnings in CPU and CUDA backends
      Merge pull request #79 from pavanky/warnings
      Merge pull request #75 from arrayfire/devel
      Reorganizing helloworld.cpp
      Updating README.md
      Update README.md
      Merge pull request #85 from 9prady9/format_array_print
      Merge pull request #87 from arrayfire/devel
      Adding new functions to src/backend
      Adding new methods to af::array class
      Merge pull request #91 from shehzan10/unary-fix
      Updating the commit hash of test/data submodule
      Destroy temporary variables from binary.cpp
      Make sure functions are not being declared more than once in NVVM IR
      Fixing a bug in CUDA backend to reset flags properly
      Merge pull request #94 from 9prady9/win_cblas_fixes
      Merge pull request #96 from arrayfire/devel
      Update README.md
      Merge pull request #98 from pavanky/readme
      Merge pull request #97 from kaatish/ocl_cmake_changes
      Merge pull request #99 from mlloreda/patch-1
      Fixed formatting
      Merge pull request #104 from firemanphil/master
      Merge pull request #102 from gcasey/buildfixes
      Merge pull request #105 from shehzan10/devel
      Merge pull request #112 from shehzan10/cuda_build_fix
      Merge branch 'issue_107' of https://github.com/bkloppenborg/arrayfire into devel
      Merge pull request #116 from shehzan10/transpose_perf
      Fixed bugs in ScalarNode for CUDA and OpenCL JIT backends
      Unary math functions convert arrays to floating point arrays
      Merge pull request #124 from 9prady9/index_tests
      Bug fix for reductions in CUDA backend
      Bug fix to random number generation in CUDA backend
      Removing deprecated files from the repo
      Suppress them warnings
      Fixing leaks in CUDA JIT backend
      Fixing Leaks in OpenCL JIT backend
      Fixing Memory leaks in CPU TNJ
      Merge branch 'pavanky/jit_fixes' into bugfixes
      Merge pull request #131 from pavanky/bugfixes
      Moving src/array to src/frontend/cpp
      Adding support for binary functions with scalar inputs
      Removed s8. Changed b8 to be of type char
      cast to b8 now results in arrays made up of 1s or 0s
      Add a debug version of CU_CHECK
      updated math operations for all backends
      Merge remote-tracking branch 'origin/arith' into devel
      Fixing complex function support in arrayfire
      Adding data check functions: isNaN, isInf, iszero
      Unifying af_constant_c32/c64 into af_constant_complex
      Support for global reductions in CPU backend
      Merge pull request #150 from bkloppenborg/findarrayfire_fixes
      Merge pull request #157 from shehzan10/devel
      Support for global reductions in CUDA backend
      Global reduction support for OpenCL backend
      changing reduce_global --> reduce_all
      Merge remote-tracking branch 'pavanky/algos' into devel
      Reorganizing the directory structure
      Unified the way complex numbers are printed
      Merge pull request #163 from shehzan10/transform_linear
      Merge pull request #165 from shehzan10/devel
      PERF: improvements to random number generation in CPU backend
      Wrapping af_get functions in AF_CHECK macro
      PERF: improvements to random number generation in CUDA backend
      Merge pull request #170 from umar456/devel
      PERF: improvements to random number generation in OpenCL backend
      Merge pull request #173 from bkloppenborg/FindclFFTImprovements
      Merge pull request #174 from shehzan10/devel
      Merge pull request #175 from pentschev/fast
      Merge pull request #179 from pentschev/fast_return_fix
      Merge branch 'devel' into perf
      PERF: improvements to CUDA JIT when memory is linear
      PERF: improvements to OpenCL JIT when memory is linear
      EXAMPLE: Monte Carlo estimation of PI
      BUGFIX: in JIT for CUDA backend
      Merge branch 'devel' into ocl_win_fixes
      correctly adding USE_DOUBLE to OpenCL JIT
      Merge pull request #188 from arrayfire/ocl_win_fixes
      Fixing the commit id of test/data submodule
      Merge pull request #190 from shehzan10/devel
      Merge pull request #191 from 9prady9/ocl_dev_sort
      PERF: Added memory manager for CUDA backend
      PERF: Added memory manager for CPU backend
      Merge pull request #192 from shehzan10/join
      Merge pull request #200 from shehzan10/imageio_fixes
      Merge pull request #202 from shehzan10/devel
      Merge pull request #205 from shehzan10/sort_fixes
      BUG: Fix in memory manager for CUDA backend with multiple devices
      Changing new/delete to malloc/free for CPU backend
      Adding variable names for MAX_BUFFERS and MAX_BYTES
      Changing Array.data from cl::Buffer to cl::Buffer *
      Fixing memory leak inside af_print_array
      PERF: Added memory manager for OpenCL backend
      Adding C api calls for malloc and free
      Changing the error message for pinned memory alloc / free
      Merge branch 'devel' into memory
      Fixing typo / bug in implicit.cpp
      Merge pull request #209 from shehzan10/devel
      Merge pull request #219 from shehzan10/devel
      PERF: using cuda::mem{Alloc,Free} instead of cuda{Malloc,Free}
      PERF: Improvements to reductions in CUDA and OpenCL
      BUG: Fixed issues with binary operations with scalar on LHS
      Updating math_ptx submodule
      Adding abs support for complex numbers
      Merge pull request #227 from 9prady9/conv2d_perf_fixes
      Merge pull request #229 from shehzan10/devel
      Updating CONTRIBUTING.md
      Merge pull request #238 from shehzan10/devel
      Merge pull request #239 from bkloppenborg/devel
      BUG: Fixed issues with atan2 in CUDA and OpenCL backends
      TEST: Adding global reduction tests
      Changing af::af_cfloat to af::cfloat for C++ API
      Properly catching and returning errors from af_sort*
      TESTS: Adding tests for math functions
      TEST: Adding tests for binary functions
      FEAT: Adding support for hypot
      TEST: Adding tests for complex binary functions
      FEAT: Adding identity function for all backends
      BUG: Fixed problem in cast for OpenCL backend
      BUG: Fixed a problem when casting complex numbers
      FEAT: Adding diag for all backends
      BUG: Fixed memory leak in C++ API when doing indexing
      SubArrays now contain reference to shared_ptr instead of parent
      Merge pull request #252 from shehzan10/devel
      Fixing problems with isOwner() in all backends
      Adding support for casting seq to array
      Default constructor now creates array of size (0,0,0,0)
      Minor changes to API
      Merge pull request #258 from mcclanahoochie/osx_fixes
      BUGFIX: Hotfix for cast in opencl backend
      Adding proper checks to tests
      Merge pull request #270 from umar456/clean_ocl_morph
      Merge pull request #271 from umar456/osx_build
      Fixing compilation errors
      Merge pull request #276 from shehzan10/devel
      adding math constants to ArrayFire
      Changing api of few functions to match v2.1
      BUG: Fixed issues with metadata while indexing
      cleaning up bugs created by previous commit
      Remove warnings when running fft in OpenCL backend
      Initial commit with gfor support
      Merge pull request #285 from shehzan10/devel
      Adding dimension checks for cplx2
      Binary functions in C API now have batchMode parameter
      binaryNode now accepts output dimension size
      Adding support for batch mode in all backends
      Merge remote-tracking branch 'upstream/devel' into gfor
      Adding proper error checking macros to src/api/c/index.cpp
      Adding batchFunc support for CPP backend
      FEAT: Adding GFOR support with for indexing
      EXAMPLE: Adding vectorize example to arrayfire
      Changing batchMode to batch
      Merge pull request #303 from 9prady9/match_template
      Cleaning up error handling in src/api/c/
      Adding error messages when necessary for CPP API functions
      Adding bounds checks for index and assign
      Merge pull request #307 from shehzan10/devel
      Merge pull request #311 from shehzan10/devel
      Merge pull request #312 from 9prady9/perf_fixes
      Merge pull request #317 from shehzan10/devel
      Exposing ArrayFire OpenCL internals for interoperability
      Fixing compile issues in OSX when using af/opencl.h
      BUGFIX: Fixing GFOR bug during assign
      Cleaning up the error checking in api/c/binary.cpp
      BUGFIX: seq --> array inside GFOR creates batched array
      BUGFIX: Fixed OpenCL JIT bug when variables were going out of scope
      Fixing typo in ToNum()
      Merge pull request #334 from pentschev/devel
      Merge pull request #337 from pentschev/fix_windows_cuda_math
      Merge pull request #335 from pentschev/devel
      Fixing warnings in ORB implementation and tests
      BUGFIX: Fixing data access patterns in OpenCL backend for diag
      BUGFIX: Fixing data access patterns in OpenCL backend for identity
      Fixing commit id for test/data
      BUGFIX: dims() now gets dimensions properly after indexing
      BUGFIX: Fixing issues with indexing after JIT operation
      BUGFIX/FEAT: Adding support for more 4d indexing operations
      FEAT: Adding support for negative offsets from end in CPP API
      BUGFIX: Fixed memory leak in af_copy_array
      Merge branch 'sobel' of https://github.com/9prady9/arrayfire into devel
      Merge pull request #344 from pentschev/fix_windows_orb
      BUGFIX: Fixing indexing to support reverse indexing
      Merge pull request #354 from pentschev/orb_fixes
      BUGFIX: Assignment operators now properly implement copy on write
      TEST: Adding additional tests for CPP indexing
      TEST: Adding new tests for CPP assign operators
      FEAT: Added support for bitand, bitor and bitxor for all backends
      FEAT: Adding preliminary support for 64 bit integers
      FEAT: reorder, transpose, moddims support for 64 bit ints
      FEAT: Adding binary function support for 64 bit ints
      BUGFIX: for numeric operations on integer types in OpenCL backend
      FEAT: CUDA backend support for numerical operations on 64 bit ints
      BUGFIX: Enabling mod / rem for integer types
      BUGFIX: Changing % to mean remainder instead of modulus
      Cleaning up mod and rem for integer types
      FEAT: Adding bitshiftl, bitshiftr
      TEST: Adding tests for 64 bit ints and bit shift functions
      Compile fix for windows
      Merge pull request #360 from pentschev/fix_missing_deleter
      BUGFIX: Adding target triple for when generating NVVM IR
      Merge pull request #366 from pentschev/fix_fast_zerofeat
      Merge pull request #368 from shehzan10/devel
      Merge pull request #367 from arrayfire/cuda7
      Removing math_ptx submodule as a dependency
      Fixing dependency issues during ptx generation
      Bugfix: fixed improper caching when casting in CUDA backend
      Bugfix: fixed improper caching when casting in OpenCL backend
      Merge remote-tracking branch 'upstream/devel' into ptxgen
      Merge pull request #369 from shehzan10/devel
      Changing std::string inputs to be references
      Changing DeviceManager in OpenCL backend to use one context per device
      Fixing copy paste error in sobel kernels in OpenCL backend
      Cleaning up af::info for OpenCL backend
      Sanitizing af::array class and constructor
      BUG: Fixed problem with JIT caching in CUDA backend
      BUG: Fixed problem with JIT caching in OpenCL backend
      TEST: Adding preliminary test for JIT
      STYLE: Removing unnecessary include files
      Renaming tests in test/jit.cpp
      Hashing the kernel names for CUDA and OpenCL
      BUILD: auto generated PTX files are copied instead of renaming them
      Use decimal notation instead of hex for OpenCL JIT names
      BUGFIX: Enable double precision support properly in OpenCL backend
      BUGFIX: Fixing randu for complex numbers in OpenCL backend
      Cleaning up opencl/kernel/random.cl
      FEAT: Adding support for randu(.., b8)
      Disabling OpenCL CPU and Accelerator support for OSX
      Adding skeleton code for indexed min and max
      FEAT: Indexed min and max for CPU backend
      FEAT: Indexed min and max for CUDA backend
      Removing unnecessary files from OpenCL backend
      FEAT: Indexed min and max for OpenCL backend
      Reorganizing features.cpp
      Adding proper checks in src/api/c/gradient.cpp
      Bit operations now supported for scalar integers and bools
      BUG: Fixed kernel compile issues with ireduce_dim.cl
      BUG: Fixed typo in ireduce_dim.cl
      STYLE: Fixed typos in test/reduce.cpp
      TEST: Adding tests for indexed min and max
      Fixing issues with min and max on boolean arrays
      Merge pull request #387 from 9prady9/colorspace
      Merge pull request #389 from 9prady9/statistics
      FEAT: Adding flat for all backends
      TEST: Adding tests for flat
      Enable scalar(real, imag) in all backends
      Changing overloaded createHandle appropriate function names
      Moving AF_THROW(af_init()) inside try/catch blocks
      af_constant_complex does not use temporary variables anymore
      FEAT: constant(val,...) now accepts val from all types
      TEST: Adding tests for constants of various types
      Merge pull request #396 from 9prady9/histeq
      Merge pull request #395 from shehzan10/devel
      FEAT: Adding binary operations for each type
      BUGFIX: memcopy kernel was creating indices incorrectly
      Adding isLinear() to ArrayInfo
      PERF: moddims no longer performs a copy if Input is Linear
      Code clean up in FAST and ORB for all backends
      Cleaning up the CPP features class
      Cleaning up memory.cpp in cuda backend
      Reverting a dumb commit I made to the code
      Destroy af_array at the end of tests
      Changing the internal API
      Making assign exception safe
      Destroying af_arrays properly in reduce and scan tests
      Organizing the examples directory
      Adding back examples from arrayfire_examples repo
      FEAT: Adding gaussian kernel to all backends
      Merge pull request #416 from shehzan10/devel
      Merge remote-tracking branch 'upstream/devel' into examples
      Merge pull request #408 from 9prady9/perf_conv
      Merge remote-tracking branch 'upstream/devel' into examples
      Changing the API of separable convolution to match 2.1
      Fixing the dimensions of separable convolution
      Fixing dim checks for separable convolve in CUDA and OpenCL backends
      Fixing convolve example
      Fixing the rainfall example
      FEAT: Adding "product" for all backends
      FEAT: Adding flip for all backends
      Enabling commented parts of integer.cpp and monte_carlo_options.cpp
      Merge pull request #419 from 9prady9/sep_conv_fixes
      Changing the order of dimensions for monte carlo example
      Merge branch 'examples' of github.com:arrayfire/arrayfire into examples
      BUGFIX: in moddims when input is a jit node
      Merge pull request #423 from umar456/docs
      Merge pull request #424 from umar456/gtest
      Merge pull request #425 from shehzan10/devel
      BUGFIX for cascaded indexing.
      TEST: Adding cascaded indexing tests
      TEST: Adding back commented out tests from flip
      Merge pull request #427 from 9prady9/hsv_rgb
      Merge branch 'devel' into docs
      Merge pull request #428 from 9prady9/colorspace
      Fixing path of arrayfire/assets
      Build docs when docs is enabled and "make all" is used
      Merge pull request #430 from umar456/devel
      FEAT: Adding lookup
      Adding new instantiations for reductions
      STYLE: Making the function "where" more explicit in C API
      Changing the dimension checks for index in C API
      EXAMPLES: All machine learning examples now compile
      BUGFIX: in ArrayIndex aka lookup for CUDA backend
      Merge pull request #432 from 9prady9/conv_changes
      BUGFIX, EXAMPLE, Fixing a mistake in mnist_common
      Merge pull request #433 from 9prady9/ocl_fix
      Adding deep belief net example to ArrayFire
      Changing neural network example to use batches and epochs
      Merge pull request #441 from 9prady9/lookup_fixes
      EXAMPLE: Cleaning up DBN and ANN examples
      Adding new functions matmulNT, matmulTN, matmulTT
      Cleaning up DBN example to use new matmul functions
      Adding RBM example for ArrayFire
      Merge pull request #449 from shehzan10/devel
      PERF: Break large JIT trees into smaller nodes
      Fixing test names in complex.cpp
      Merge pull request #456 from shehzan10/devel
      STYLE: Changing cast operations in all backends
      FEAT: filter in convolutions is cast to the accum type
      BUILD: Adding /usr/local/include and /usr/include to FindOpenCL
      Merge branch 'gtest-ninja' into devel
      Merge pull request #472 from umar456/clang
      Renaming logit to logistic_regression
      BUGFIX: corrected the dimensions passed to gemv for transpose(A)
      BUGFIX: var and stdev now use the getFNSD from common.hpp
      EXAMPLE: Cleaning up rbm example
      Example: Naive bayes example now uses prior probabilities
      Merge pull request #475 from bkloppenborg/cmake_install
      BUILD: Changes to suppress warnings in tests
      Example: clean up logistic regression
      Example: Adding comments to naive bayes
      Example: Adding new example to demo perceptron
      PERF: Making isLinear() only look up to ndims()
      PERF: Perform an async copy when data is linear
      Merge pull request #491 from pentschev/example_harris
      Removing OPENCL_LIBRARIES from CLBLAS_LIBRARIES in FindCLBLAS.cmake
      Merge pull request #494 from bkloppenborg/cmake_packaging
      Linear indexing now flattens the arrays before the operation
      Changing the layout of the documentation
      cleaning up the groups structure
      Merge pull request #503 from shehzan10/devel
      Removing empty file reduce.h
      Minor tweaks to blas documentation
      Added documentation for reductions
      Adding doxygen briefs for image processing functions
      Function groups organized
      Adding the remaining documentation for functions in algorithm.h
      Added documentation for part of arith.h
      Merge pull request #507 from glehmann/assets-submodule-msg
      DOCS: documentation for statistics.h
      DOCS: Adding brief descriptions for all documented functions
      DOCS: Adding documentation for remaining functions in image.h
      DOCS: Remove src/api/c from header path
      DOCS: Fixing code in getting_started
      DOCS: Fixing the formatting in image.h
      DOCS: Adding documentation for all functions in arith.h
      Merge pull request #510 from 9prady9/signal_docs
      DOCS: Fixing warnings
      DOCS: Adding examples tab to the generated documentation
      DOCS: Adding documentation for device.h and array.h
      DOCS: Adding documentation for manip_mat in index.h
      DOCS: Adding documentation for data.h
      DOCS: Fixing documentation errors for arith functions
      DOCS: Adding documentation for arith and logical operators in array.h
      DOCS: Adding documentation for indexing operations
      DOCS: Fixing links in the documentation landing page
      DOCS: Adding download links for arrayfire

Peter Andreas Entschev (83):
      Bug fix in OpenCL scan for Intel.
      Added regions API and CUDA backend.
      Added regions CPU backend as not supported.
      Added regions OpenCL backend as not supported.
      Added unit tests for regions.
      Improved regions for CUDA, faster on large regions.
      Merge branch 'master' into regions
      Added OpenCL implementation of regions.
      Added CPU implementation of regions.
      Fixed template on CUDA regions.
      Fixed limits of double type on CUDA backend.
      Minor improvements to CPU regions.
      Added enum for regions connectivity type.
      Fixed regions unit tests
      Added struct af_features to store image features (aka keypoints).
      Added features class to manage af_features structs.
      Added FAST feature detector frontend.
      Added FAST feature detector CPU backend.
      Added FAST feature detector CUDA backend.
      Added FAST feature detector OpenCL backend.
      Added handlers for array type in features class.
      Fixed failing abs() call for int/unsigned types on CUDA backend of FAST.
      Added test reader for image input with array output.
      Added FAST unit tests.
      Merge remote-tracking branch 'upstream/devel' into fast
      Fixed FAST files to comply with new directory structure.
      Fixed data filename on FAST unit test.
      Updating test/data submodule
      Fixed wrong memory type allocation on OpenCL backend of FAST.
      FAST will return (af_)features instead of (af_)features *
      Merge pull request #323 from pavanky/ocl
      Changed CUDA convolve to avoid issues with constant memory.
      Added ORB API.
      Added ORB CPU backend.
      Added ORB CUDA backend.
      Changed thread variable names of some OpenCL functions.
      Added ORB OpenCL backend.
      Added test helper to read feature/descriptor test data.
      Added ORB unit tests.
      Added missing STL algorithm include to CUDA math.hpp.
      Added pi definition to fix ORB on Windows.
      Added check before freeing Gaussian filters in ORB OpenCL backend.
      ORB to return empty arrays when no features exist.
      Merge pull request #355 from pavanky/index_fixes
      Added missing shared_ptr deleter in OpenCL backend.
      Added missing destructor for features class.
      Fixed FAST C++ API, added proper destructor calls.
      Fixed ORB C++ API to properly destroy af_features
      Fixed FAST memory leaks on CPU backend
      Fixed ORB memory leaks on CPU backend
      Fixed FAST memory leaks on CUDA backend
      Fixed ORB memory leaks on CUDA backend
      Fixed FAST memory leaks on OpenCL backend
      Fixed ORB memory leaks on OpenCL backend
      Added missing memory deletions on FAST unit test.
      Merge branch 'devel' into orb_fixes
      Passing argument as reference to features operator=
      Renamed feature.cpp to features.cpp to match class name
      Fixed FAST CUDA backend case when no features are found
      Fixed FAST CPU backend case when no features are found
      Added image blur argument to ORB API
      Added image blurring to ORB CPU backend
      Added image blurring to ORB CUDA backend
      Added image blurring to ORB OpenCL backend
      Added image blurring argument to ORB unit tests
      Improved ORB performance and memory usage on CUDA backend
      Improved FAST performance on CUDA backend
      Added argument to define length of edge discard in FAST.
      Changed the way FAST handles different datatypes internally
      Removed cudaMemset from FAST
      Added documentation for FAST
      Moved FAST description to docs directory.
      Fixed FAST edge assertions
      Added ORB documentation
      Updated test data
      Made FAST CPU results match CUDA results
      Made FAST OpenCL results match CUDA results
      Merge pull request #485 from pavanky/examples
      Added Harris corner detector example
      Fixed FAST type comparison mismatch warning
      Merge pull request #500 from bkloppenborg/cmake_packaging
      Merge pull request #504 from 9prady9/TemplateFunction
      Merge pull request #509 from pavanky/docs

Pradeep (228):
      af_transpose and corresponding unit tests
      Style changes in transpose
      Invalid arguments unit test for transpose
      BUG Fix: af_print in CUDA backend was directly using device pointer.
      CUDA backend transpose implementation
      Changes to transpose kernel
      changes to include all cl kernels in build
      type fixes for opencl
      af_print implementation for opencl backend
      opencl buffer read/write fixes in Array
      Added traits specialization for size_t
      opencl backend implementation for af_transpose
      Macro fix in transpose opencl backend
      Reverted dim_type to long long
      af_histogram cpu backend implementation
      Changed readTests helper function to accept multiple input arrays
      cpu implementations for af_[erode|dilate] and af_[erode3d|dilate3d]
      Added readImageTests and compareArraysRMSD helpers for unit tests
      af_bilateral API and cpu backend implementation
      Adding missing namespace qualifiers
      exp equation modification in bilateral cpu backend
      BugFix: type issue fix in compareArraysRMSD
      Disable unit test for int type in bilateral
      morph cuda backend
      cuda backend implementation for [af_erode3d|af_dilate3d]
      erode/dilate unit tests using images
      morph cuda kernel optimizations
      opencl morph implementation
      opencl backend for volumetric morphological ops
      cuda backend bilateral
      bilateral opencl backend implementation
      morph kernel optimization for supported window sizes
      histogram cuda backend
      histogram opencl backend
      Bug Fix in histogram cuda kernel
      min call in histogram kernel was ambiguous for intel compilers
      opencl device selection feature
      Replaced member funcs with friend funcs in opencl::DeviceManager
      added opencl kernel caching for transpose
      Removed cl.hpp from af/opencl.h
      Modified transpose tests to run for all devices for opencl backend
      enabled ocl kernel caching in transform
      enabled ocl kernel caching for all existing functions
      style changes in ocl transpose
      unify kernel params changes for opencl morph
      Renamed CL_FINISH to CL_DEBUG_FINISH
      unify kernel param changes to opencl bilateral
      unify kernel param changes to opencl histogram
      corrected typo in cpu_err header
      Proper error handling added to transpose
      Proper error handling added to erode/dilate
      added error handling for bilateral
      added error handling for histogram
      median filter cpu backend and cuda/opencl placeholders
      modified symmetric pad equation in medfilt cpu backend
      median filter implementation in cuda backend
      median filter opencl backend implementation
      fft/ifft functions in cpu backend
      fft framework changes
      fft/ifft cuda backend
      fft/ifft opencl backend
      meanshift API and cpu backend
      meanshift cuda backend implementation
      meanshift opencl backend
      createPaddedArray optimizations for cuda and opencl backend
      BUGFIX: copy kernel
      convolve cpu backend
      convolve cuda backend
      convolve opencl backend
      renamed ConvolveBatchKind variables
      Changed output array type for bilateral function
      subscript assignment feature for cpu, cuda and opencl backends
      cmake changes for opencl backend
      C++ wrappers for functions, includes a bugfix as well
      C++ wrappers for image and indexing functions
      Merge branch 'cpp' of ssh://mule/area51/arrayfire into cpp
      Bugfix: corrected array handle check in destructor
      Added index support
      Adding assign operator overloading in CPP
      Merge branch 'origin/cpp' to cpp
      Fixing copy assignment operator
      convenience member functions for array indexing
      changed separable convolve cpp API
      changed gradient cpp API
      additional unit tests for cpp wrapper
      Bugfix: af_assign
      Moved cpp wrapper functions to appropriate files
      regions cpp wrapper
      cpp wrapper unit tests
      Merge fft & convolve headers
      convolve API changes
      Added new cpp wrappers for ffts
      windows fixes for cuda backend
      windows fixes for opencl backend
      fix for google test build command
      Visual Studio File Grouping for Projects
      windows and *nix OS compatibility fixes
      windows fixes for cpu backend
      boost compute fixes for windows, had to undef min and max macros
      Merge branch 'master' into win_fixes
      undef min,max macros before boost/compute headers inclusion
      Commenting out cpu blas functions temporarily on windows
      Additional fixes in cpu backend for windows platform
      Additional cpp convenience functions for moddims
      added compatibility APIs
      Added NOMINMAX definition for windows platform
      Removing PIC compiler flag for windows platform
      Removing undef min, max as NOMINMAX is added for windows
      Merge remote-tracking branch 'origin/master' into ocl_win_fixes
      Corrected gtest library path for debug mode
      Added missing template specializations for copy
      Corrected visual studio link libraries for  test build process
      Merge remote-tracking branch 'upstream/devel' into ocl_win_fixes
      Updated template specializations for copy in cuda/cpu backends
      Merge remote-tracking branch 'origin/ocl_win_fixes' into devel
      add formatting to array print functions
      Windows compatibility changes for BLAS on cpu backend
      Merge branch 'devel' into win_cblas_fixes
      windows compatibility fixes
      style changes in cpu blas functions
      typo corrections
      changed dim_type typedef to int from long long
      Fix for copy function in cpu backend
      indexing unit tests for 3d and 4d arrays
      bugfix for cuda on windows
      changing variables names in reduce kernel for cuda backend
      cmake changes for windows MSVC Projects
      correcting test data commit number
      correcting test data commit number
      Changed setContext function scope
      added isDoubleSupported func for opencl backend
      Added double precision checks in opencl
      function to check double precision availability
      handle double precision in opencl transpose
      adding missing header in testHelper hpp
      mean function
      added getDevice internal function for opencl
      modified buildProgram opencl helper function
      moved ocl kernel resources from stack to heap
      Moved cl_khr_fp64 extension
      Removed cl_khr_fp64 from individual cl files
      opencl device sorting
      Merge branch 'devel' into statistics
      2d convolve performance improvements
      feature: indexing array using array
      cuda backend for indexing array using array
      opencl backend for indexing array using array
      cpp wrapper for array based index
      Removed indices size check in array-index
      using 0 as default for dim to array index cpp wrapper
      bugfix: fixes complex types for mean on cuda/opencl backend
      Merge branch 'devel' into statistics
      Merge branch 'devel' into statistics
      feature: match template
      cpp wrapper for match template
      Corrected typo in median filter opencl kernel wrapper
      bugfix: match template cpp unit test
      Moved match template c api to apt location
      changed shared mem access pattern for conv3d
      Removed long long numeric qualifier for constants
      perf: minor performance improvements for bilateral
      Removed an obsolete condition in af_assign
      Merge branch 'devel' into array_idx
      perffix: 3d separable convolve
      Merge branch 'devel' into statistics
      feature: af_sobel_dxdy
      af_sobel_dxdy CUDA backend
      af_sobel_dxdy OpenCL backend
      cpp wrapper for sobel derivatives
      Changed c api for sobel operator
      Merge branch 'devel' into statistics
      Corrected test data hash tag
      Multiple func definition fix for arith operations: mod and rem
      BUGFIX: added same complex type cast noop
      FEATURE: convenience functions for weighted mean
      FEATURE: variance
      FEATURE: standard deviation
      BUGFIX: added static qualifier for helper arithmetic functions
      FEATURE: RGB to GRAY and vice versa color space conversion
      FEATURE: covariance
      Code cleanup for mean, var, stdev
      FEATURE: median function
      FEATURE: correlation coefficient function
      BUGFIX: corrected scalar constant typo in median
      type correction in median removes warnings
      Merge branch 'devel' into statistics
      Code cleanup mean, median, stdev
      BUGFIX: windows fix for division helper function
      BUGFIX: fixed multiple definition error for unaryName function
      FEATURE: histogram equalization for images
      BUGFIX: increased filter/mask length for convolve kernels
      BUGFIX: modified default normalization factor
      PERFFIX: convolution perf improved by 2-4%
      PERFFIX: improved 2d convolve perf in cuda by 33%
      Renamed separable conv cuda kernel file
      Merge branch 'devel' into perf_conv
      PERFFIX: improved opencl 2d convolution performance by 4%
      modified expand param to default to false for convolution
      BUGFIX: 2d separable convolution
      FEATURE: hsv to rgb and vice versa conversion functions
      FEATURE: colorspace function
      Reduced convolution compilation time
      BUGFIX: added type check for tests on opencl backend
      Adding copyright to examples
      namespace fix in machine learning examples
      Renamed af_array_index backend files to match new name af_lookup
      Documentation for colorspace conversion functions
      Documentation for histogram & histequal
      Moved repeat function docs content to common location for image.h
      Reuse unit tests to write documentation examples
      Removed duplicate lines in mean & var tests
      BUGFIX: fix in af_mean_all for cdouble type
      Removed USE_SYSTEM_GTEST cmake option
      BUGFIX: corrected conv2 filter length constant
      Documented code related to 'How to add function to ArrayFire' wiki
      Style and typo corrections in exampleFunction
      Regions documentation and code example
      Renamed image processing titles for morph & filters subgroups
      Documentation for gaussian kernel functions
      Documentation for Sobel Operator functions
      Documentation for matchTemplate function
      Documentation for medfilt function
      Documentation for meanshift & bilateral functions
      Documentation for Morphological Operator functions
      Documentation for Convolution functions
      Documentation for fft & ifft functions
      Documentation for approx1 & approx2 functions
      Documentation corrections

Pradeep Garigipati (43):
      Read me redirection to repository wiki.
      Basic contribution guidelines for pull requests
      Merge pull request #83 from pavanky/readme
      Adding unit tests related info
      Merge pull request #146 from pavanky/gtest
      Merge pull request #152 from shehzan10/devel
      Merge pull request #162 from pavanky/reorg
      Merge pull request #207 from pavanky/memory
      Merge pull request #222 from shehzan10/devel
      Merge pull request #226 from pavanky/cplx
      Merge pull request #244 from pavanky/jit_fixes
      Merge pull request #248 from shehzan10/devel
      Merge pull request #256 from pavanky/iota
      Merge pull request #257 from shehzan10/devel
      Merge pull request #297 from shehzan10/devel
      Merge pull request #315 from bkloppenborg/devel
      Merge pull request #330 from pavanky/ocl_jit_fix
      Merge pull request #375 from pavanky/jit_fixes
      Merge pull request #379 from shehzan10/devel
      Merge pull request #380 from shehzan10/devel
      Merge pull request #383 from pavanky/clcontext
      Merge pull request #384 from pavanky/random
      Merge pull request #386 from pavanky/ireduce
      Merge pull request #407 from arrayfire/memory
      Merge pull request #415 from umar456/cxx_fix
      Merge pull request #417 from pavanky/gausskern
      Merge pull request #435 from bkloppenborg/warning_fix
      Merge pull request #439 from bkloppenborg/array_indexing
      Merge pull request #447 from pentschev/improve_orb_perf
      Merge pull request #448 from pentschev/improve_fast_perf
      Merge pull request #450 from pentschev/fast_edge
      Merge pull request #455 from bkloppenborg/remove_unneeded_chars
      Merge pull request #460 from bkloppenborg/get_non-zero_dims
      Merge pull request #459 from ogreen/MRead
      Merge pull request #463 from pavanky/minor_fixes
      Merge pull request #462 from shehzan10/devel
      Merge pull request #466 from pentschev/doc_fast
      Merge pull request #471 from pentschev/doc_orb
      Merge pull request #478 from pavanky/bug_fixes
      Merge pull request #495 from pentschev/fix_fast_warning
      Merge pull request #497 from shehzan10/devel
      Merge pull request #502 from bkloppenborg/standalone_examples
      Merge pull request #505 from pavanky/docs

Shehzan Mohammed (269):
      Added af_diff1 function with cpu backend implementation.
      Fixing typo in cuda/opencl placeholders for diff1
      Added af_diff2 function with cpu backend implementation.
      Added randu and randn functions to cpu
      Added AF_<backend> definitions to test
      Added CUDA backend for diff1 and diff2
      Optimized diff to use just two kernels
      Change launch configuration when inputs are just vectors
      Added OpenCL backend for diff1 and diff2
      Fixing ostream << operator for uchar to print numbers
      Added image IO functions to all backends (code is independent of backend)
      Removed flags from CMAKE for cuda build
      Using static channel_split in imageio
      Created image.h header file
      Added CPU backend for resize
      Added CUDA backend for resize
      Added OpenCL backend for resize
      Merge branch 'master' into resize
      Updated OpenCL and CPU with offset changes
      Updated diff CPU with offset changes
      Code cleanup for Resize (all backends)
      Large tests for resize. Minor type fixes for resize.
      Added Transform and Rotate for CPU, CUDA and OpenCL backends
      Merge branch 'master' into transform
      Merge branch 'master' into transform
      Added wrappers for translate, scale and skew. Added tests for rotate
      Added helper functions to ArrayInfo
      Using failure count for rotate tests. Minor type corrections in resize.
      Added == and != operators for dim4
      Added base_type to traits
      Added Approx1 and Approx2 to all backends
      Kernel code cleanup for Approx1,2 linear interp
      Make random test deterministic
      Using Params in cuda kernels
      Added tile to CPU, CUDA, OpenCL backends
      Performance improvement to tile in CUDA, OpenCL
      Fix buildProgram multiple definition error
      Unified kernel arguments for approx
      Unified kernel arguments for diff
      Unified kernel arguments for resize
      Unified kernel arguments for transform
      Unified kernel arguments for tile
      Added error framework to approx
      Added error framework to diff
      Added error framework to resize
      Added error framework to tile
      Added error framework to transform
      Device management for CUDA
      Moved dimension checks for matmul to src/backend
      Added reorder to all 3 backends
      Added circular shift to all backends
      Create af_create_handle wrapper for createEmptyArray
      Added gradient to all backends
      Added empty wrappers for sort
      Added CUDA and OpenCL backends for Sort on dim0
      Added multi-dimensional support for sort on dim 0
      Fixed cudaGetDriverVersion for Mac and ARM
      BUGFIX const correctness in ArrayInfo functions
      Separated sort into sort (only values) and sort_index (values and indices)
      Merge remote-tracking branch 'origin/master' into sort
      Increase tolerance for rotate test
      API change for sort
      Fix for FindCBlas.cmake on debian based OS
      Fixing blas uninitialized warnings
      Optimizing loops in sort
      Added sort_by_key to all backends
      Merge remote-tracking branch 'origin/master' into sort
      Added Boost 1.48 requirement to OpenCL CMake (for compute)
      Fixing warnings for ostream in opencl/kernel
      BUGFIX for cuda random number generation on multiple devices
      Update for device manager for OpenCL and CPU
      Added test to print af_info
      Fix for erroring out if Boost.Compute not found for OpenCL
      Split sort* functions into separate files
      Updated README.md
      Separated functions into header files
      Disable large sort tests
      Added af_copy_array function for deep copy
      Added helper functions to c++ wrapper
      Added implementation for af::print
      Added examples folder
      Merge branch 'cpp' of mule:area51/arrayfire into cpp
      Merge remote-tracking branch 'origin/master' into cpp
      Added CPP wrappers for blas, device, data, reduce header files
      Added Version.h functionality. Updated info()
      Added c-api wrapper for weak copy
      Updated operator= for array class
      Added constant function to create complex arrays
      Added wrapper for unary and binary operations
      Added operator overloading for arithmetic and relational operations
      Moved operator overloading to src/array/arith.cpp
      Fixed af_print to print regular array (not transposed)
      Added iota function to C and C++ API
      Merge branch 'cpp' into 'master'
      Fix for isnan, isinf compilation error
      Overload sort C++ API
      Change dim4 from struct to class
      Added cmake code for CUDA Compute variables
      Removed usage of ArrayInfo from C++ API
      Merge pull request #40 from arrayfire/info_cpp_fixes
      OSX Compilation Fixes
      Using *operator instead of pow
      Merge pull request #41 from arrayfire/osx_fixes
      BUGFIX Fix sort on CPU
      Added seq class for C++ API
      Updating helloworld example
      Merge pull request #44 from bkloppenborg/cuda_6_0_compile_fix
      Make FreeImage library optional
      Change FREEIMAGE_FOUND to WITH_FREEIMAGE
      Added AF_ERR_NOT_CONFIGURED, added to imageio
      Merge pull request #49 from arrayfire/freeimage-optional
      Upgrading C++11 flag from 0x to 11
      Added #if __cplusplus around utils
      BUGFIX op= in array class
      Added timing code
      Fixed CPU random generator
      Merge pull request #73 from kylelutz/reduce-header-includes
      Fixed whitespaces, unsigned warning, removed printf
      Merge pull request #80 from bkloppenborg/cmake_additional_install
      Merge pull request #82 from bkloppenborg/cmake_additional_install
      Update README.md with Jenkins Build tags
      Updated README.md
      Choose CUDA default device using AF_CUDA_DEFAULT_DEVICE
      Choose OPENCL default device using AF_OPENCL_DEFAULT_DEVICE
      Merge pull request #89 from arrayfire/default_device
      Added round, floor, ceil and abs instances to CPU
      Commit for CUDA PTX update fix for abs
      Merge pull request #90 from pavanky/device
      Merge pull request #93 from pavanky/bugfixes
      Merge pull request #95 from pavanky/bugfixes
      Merge pull request #100 from 9prady9/readme
      Remove use of cmake variable CUDA_DRIVER_LIBRARY
      Merge pull request #110 from bkloppenborg/add_cmake_find_script
      Added option to not run CUDACmputeCheck.cmake
      Formatting changes in cuda/CMakeLists.txt
      Removing -DWINDOWS_REMOTE and just checking CUDA_COMPUTE_CAPABILITY
      Improved performance of transpose
      Updated README.md with windows build tag
      Update README.md
      Remove cmake whitespace warning
      Merge pull request #121 from 9prady9/ocl_perf_fix
      Update README.md
      Merge pull request #123 from pavanky/jit_fixes
      Updated README.md
      Increase tolerance for approx tests
      Compilation fix for null pointer in constructor
      Fix for get device ptr
      Merge pull request #147 from pentschev/regions_tests_fix
      Update submodules
      Added deviceprop functionality to CUDA and OpenCL.
      Changed REVISION to AF_REVISION
      Merge pull request #153 from arrayfire/update_test_submodule
      Update submodules
      Added deviceprop functionality to CUDA and OpenCL.
      Changed REVISION to AF_REVISION
      Merge pull request #154 from pavanky/arith
      Calling af_init to initialize contexts
      Merge pull request #158 from bkloppenborg/platform_format_fix
      Added linear interpolation for transform (adds for rotate, scale, skew etc)
      Added tests for bilinear transforms (rotate)
      Fixes for rotate bilinear tests
      Removed device selection from transpose test
      Use & operator for array in resize definition
      Merge pull request #177 from pentschev/fast_opencl_fix
      Merge pull request #180 from pavanky/perf
      Fix cmake error caused by same filename in test and example
      PERF Improvements to transform and rotate
      Wrappers for resize
      Add missing license headers to transform_interp
      Added tests for iota
      Added placeholders for join
      Added CPU backend for join
      Added CUDA backend for join
      Added OpenCL backend for join
      Added tests and linking test data for join
      Better error handling in ImageIO
      PERF Batching + Blocks images in rotate and transform
      Fixed tests for imageio after error changes
      Split sort_by_key instantiation into multiple files
      Changed dir to isAscending in all sort functions
      Change CMAKE_SOURCE_DIR/common to CMAKE_MODULE_PATH
      Moved common to CMakeModules
      Default arguments for medfilt
      Change int to dim_type in cuda transpose
      PERF Improvement to transpose opencl
      Added conjugate option to transpose
      Added transpose .T() and conjugate transpose .H() to array class
      Changed minor version from .200 to .beta
      Merge pull request #220 from pavanky/perf
      Adding noDoubleTests condition to all tests
      Compilation fix for cpu complex.hpp
      Added pinned memory functionality
      Added warning messages for when submodules are not cloned
      Merge pull request #240 from pavanky/atan2
      Merge pull request #241 from pavanky/test
      Merge pull request #242 from pavanky/hypot
      BUGFIX Fixed memory leak in image io, performance improvements
      Merge pull request #246 from pavanky/new_funcs
      Added dim checks for binary element wise ops
      BUGFIX Fix segfault in copy array
      Merge branch 'memory' of github.com:pavanky/arrayfire into devel
      Fixed references to shared_ptr for cpu and opencl backend
      Fixed compilation fix for identity
      Correcting typo in FindFreeImage
      Added memory manager for pinned memory
      Using pinned memory in imageio
      Changed API for iota
      BUGFIX Fixed index-based array operators
      Removing boost chrono required from opencl
      Merge pull request #277 from pavanky/constants
      Merge pull request #278 from pavanky/api
      Merge pull request #279 from umar456/clang_warn
      Merge pull request #283 from pavanky/minor_fixes
      BUGFIX Reorder condition fix
      BUGFIX for generating array from seq using negative step
      Merge pull request #299 from pavanky/gfor
      Compilation fix for Windows
      Merge pull request #304 from pavanky/bug_fixes
      Call init from pinned memory and load image
      Added 4th dimension support to resize
      Changed tiling block from y to x in rotate kernels
      Added 4th dimension support to rotate
      Merge pull request #310 from 9prady9/array_idx
      Added 4th dimension support to transpose
      BUGFIX in set device for cuda
      BUGFIX Fixes seq generation for positive numbers
      Merge branch 'devel' of github.com:arrayfire/arrayfire into devel
      Merge pull request #316 from bkloppenborg/devel
      Split sort_by_key instantiation into multiple files for CUDA backend
      Moved sort_by_key instantiation files into directory
      Merge pull request #345 from pavanky/more_fixes
      Compilation fix for windows
      Fixed clBLAS/clFFT libs install for Windows and OSX
      Merge pull request #371 from pavanky/fixes
      Updated README.md
      Fix windows compile issue for opencl context handling
      BUGFIX Fixed sobel output types
      Sobel returns int for integer types instead of float
      Merge pull request #382 from pavanky/ocl_build
      Fixed backend API for join
      Merge pull request #392 from pavanky/flat
      Disable key testing for sort_index and sort_by_key for OpenCL
      Merge pull request #399 from pavanky/copy_fixes
      Merge pull request #401 from 9prady9/conv_fixes
      Add pragma once to copy.hpp
      Better backend API for iota (allow default argument for reps)
      BUGFIX seq ops
      Changed DIRECTORY to PATH in examples/CMakeList.txt
      Add warning for not cloning gtest submodule
      Remove GIT_SUBMODULES from build_gtest. Not supported on older Cmake
      Merge pull request #438 from umar456/double_test
      BUGFIX for indexing after JIT ops
      Fixes to save image
      Merge pull request #452 from pavanky/jit_fixes
      Fixes and code optimization to join
      Merge pull request #458 from 9prady9/win_fix
      API Change iota to range
      Update test data for orb
      Merge pull request #465 from pentschev/remove_fast_memset
      BUGFIX Fix windows is_same ambiguity
      Merge pull request #481 from pentschev/match_fast_results
      Merge pull request #490 from umar456/mnist
      API Change order of data that range generates
      Added operators for dim4
      Added new functionality iota
      Removing test/range. Will add back when corrected for new functionality
      BUGFIX Fix offsets and strides when using moddims
      Merge pull request #506 from bkloppenborg/example_fix
      Merge pull request #511 from umar456/assets

Umar Arshad (101):
      Initial Commit
      FIX: Don't build test files if BUILD_TEST is OFF
      Removed MESSAGE from opencl CMakeLists.txt
      Fixed gcc47 parsing error
      Removed tagged struct initialization to support older compilers
      Fixed #1: Building OpenCL library fails on Linux
      Fix #16: Patch command need to be used instead of svn patch
      Merge branch '16' into 'master'
      Merge branch 'data' into 'master'
      Automatic download of arrayfire test data
      forgot GetTestData.cmake
      Only run tests whose binaries are being built
      Fixed errors which popped up on Ubuntu related to pthreads
      Merge branch 'ubuntu' into 'master'
      Merge branch 'datatypes' into 'master'
      Merge branch 'test_definition' into 'master'
      Consistent naming for headers in src/backend
      Cleanup rand functions. Remove macros
      Instantiate Array destructors
      Merge branch 'master' into rand
      Cleanup. Improve readability in random.
      Merge branch 'ocl_cmake' into 'master'
      Merge branch 'random' into 'master'
      Fix various things to get clang working on OSX
      Merge branch 'master' into clang_fix
      Compile on Linux
      Replace operator overload with ToNum.
      Merge branch 'fix_print_uchar' into 'master'
      Merge branch 'simple_index' into 'master'
      CPU GEMM, GEMV, and DOT
      Tests for GEMM and GEMV
      CUDA GEMM, GEMV and DOT
      Merge branch 'cuda_reduce' into 'master'
      Merge branch 'cpuscan' into 'master'
      Merge branch 'testHelper_fix' into 'master'
      BLAS on OpenCL using clBLAS library.
      Cleanup CMake files(i.e. remove messages)
      Changes to compile on Linux. Fix warnings on g++.
      Merge branch 'clrand' into 'master'
      A more robust FindCLBLAS.cmake file
      Removed unnecessary code from tests.
      Merge branch 'master' into blas
      Formatting changes. Fix leak in CPU. Enable dot
      Initial Error commit
      Updated interface for error checking.
      Created a common exception handling format
      Merge pull request #140 from 9prady9/cuda_fix
      Merge pull request #142 from 9prady9/msvc_filters
      Fixes #167: Check if driver is unloaded when freeing array
      Fix warnings in CPU backend on clang
      Added the __ANSI_STRICT definition for OSX
      Compile using g++ on OSX
      Remove unnecessary instantiations of morph in OpenCL
      Merge pull request #267 from mlloreda/macosx_rpath_fix
      Fix g++ warnings on OSX
      Fix linker warning on OSX
      Fixed warning due to if/switch
      Merge branch 'clang_fixes' into clang_warn
      Merge pull request #327 from pavanky/bugfix
      Merge pull request #328 from pavanky/gfor_seq
      Merge pull request #340 from pavanky/bug_fixes
      Merge pull request #356 from pavanky/math_funcs
      Merge pull request #361 from pentschev/orb_fixes
      Merge pull request #359 from jramapuram/devel
      Merge pull request #370 from arrayfire/ptxgen
      Initial documentation for ArrayFire 3.0
      Simple description for array constructor and BLAS
      Merge pull request #390 from shehzan10/devel
      Fixed incorrect use of std::map::erase in OpenCL
      Merge pull request #398 from pavanky/64bit
      Merge pull request #402 from pavanky/cleanup
      Merge pull request #405 from 9prady9/fft_fix
      Remove C++11 conditional from src/api/c.
      Moved assets folder/submodule to the root dir.
      Added basic C interface functions.
      Merge branch 'devel' into docs
      Merge pull request #422 from arrayfire/examples
      Use the chromium repository to build gtest.
      Merge pull request #426 from pavanky/bug_fixes
      Fix unused variable warnings in convolve_separable
      Merge pull request #434 from pavanky/examples
      Double precision checks in testing
      Merge pull request #440 from pavanky/dbn
      Merge pull request #442 from pentschev/orb_blurring
      Merge pull request #446 from pavanky/rbm
      Merge pull request #457 from pentschev/change_fast_datatype_internals
      Merge pull request #443 from 9prady9/cspace_hist_docs
      Faster DBN convergence. Test updates
      Merge branch 'devel' of github.com:arrayfire/arrayfire into dbn_rand
      Merge pull request #467 from pentschev/fix_fast_edge_assert
      Make tests C++03 compliant.
      Merge branch 'devel' into clang
      Remove additional c++11 features from test
      Make tests using libstdc++ for clang builds on OSX
      Removed messages and unnecessary functions
      Merge pull request #484 from glehmann/fix-gtest-byproducts
      Merge pull request #489 from pavanky/perf
      Added ASCII display function for MNIST.
      Merge pull request #496 from pavanky/dox
      Merge pull request #508 from 9prady9/image_docs
      Fix ASSET_DIR path

chris (3):
      opencl build program fixes for osx
      optional double support for jit.cl
      fix double support in jit.cl

easuter (2):
      Use "Subversion_SVN_EXECUTABLE" explicitly instead of "svn patch".
      Correction to commit e7ab4f6

firemanphil (1):
      Fix "Could not read from remote repository" issue

greenman (2):
      Fixed typo
      Modified readme file with additional ArrayFire contact info

jramapuram (3):
      Update CMakeLists.txt
      allow for cuda7+ and backwards changes
      comma to space

mlloreda (9):
      Initial 1D sort implementation for cpu.
      Sorting across first dimension
      Merged origin sort to local
      no more async; non-global indexing
      test/sort.cpp: test on working types
      src/backend/sort.cpp: disable s8 for sort
      src/backend/sort.cpp: modify dim assertion for 2D
      test/sort.cpp: 2D test
      CMakeLists.txt: Update OSX RPath settings

ogreen (2):
      Update README.md
      Update README.md

orbitcowboy (1):
      Fixed potential memory leaks. Each array allocated with new [n] must be deallocated using delete [].

pavan at arrayfire.com (3):
      BUGFIX: Buffer nodes from subArrays now use the parent ptr and offsets
      Tests use google test present on system if BUILD_GTEST=OFF
      Adding a new option to use system GTEST

pradeep (3):
      Moved helper functions to common source file to enable reuse
      af_moddims function
      Bugfix for opencl resource cleanup on windows

-----------------------------------------------------------------------

No new revisions were added by this update.

-- 
Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/debian-science/packages/arrayfire.git