[arrayfire] annotated tag v3.0beta created (now be64535)
Ghislain Vaillant
ghisvail-guest at moszumanska.debian.org
Fri May 22 10:19:48 UTC 2015
This is an automated email from the git hooks/post-receive script.
ghisvail-guest pushed a change to annotated tag v3.0beta
in repository arrayfire.
at be64535 (tag)
tagging a4a26ce4eabe29a84e3c81a456d044854798c5d0 (commit)
tagged by Pavan Yalamanchili
on Mon Mar 16 02:30:07 2015 -0400
- Log -----------------------------------------------------------------
Beta release of 3.0
Brian Kloppenborg (48):
CMakeLists.txt does not need to be executable.
Add definition and directive to fix BOOST_INLINE not being defined on nvcc / CUDA < 6.5. Move FIND command for CUDA into backend/cuda/CMakeLists.txt
Restore default compile state.
Merge branch 'cuda_6_0_compile_fix'
Use ArrayFire's CUDA_VERSION instead of CMake-specific detection.
Add install steps.
Move macro to the top of the file.
Added description, renamed Requirements to Prerequisites, put clBLAS and clFFT subheadings as h4 under OpenCL backend, point clFFT/clBLAS to ArrayFire fork, formatting.
Use standard package handling, search in both system and local paths for clBLAS, add CLBLAS_ROOT_DIR hint.
Add install steps for CUDA and OpenCL. Add workaround for clFFT and clBLAS installers.
Move OpenCL into backend CMakeLists.txt
Prefer system libraries.
Leave CUDA and OpenCL off by default.
Adding install instructions for tests and examples.
Only install .h and .hpp files. Exclude the .gitignore file from the installation step.
Include .hpp files.
Remove installation of tests and examples.
Add CMake find script from arrayfire_benchmark.
Fixes #107
- Add blas as CPU backend dependency
Fixed merge conflicts.
Strip whitespace from OpenCL device information.
Merge pull request #161 from easuter/devel
Add FindSubversion.cmake to ensure Subversion_SVN_EXECUTABLE is set.
Quote user-supplied paths. Search CMAKE_INSTALL_PREFIX last to prefer local installs.
Fix signed to unsigned comparison warning.
Fix signed vs. unsigned integer comparison.
Bugfix for signed vs. unsigned comparison error.
Fix no return from non-void function.
Fix out-of-bounds memory access in array-based indexing.
Fix pedantic compiler warning.
Add function to test/find first non-zero dimension
Return user-specified dimension.
Install ArrayFire version file (version.h)
Fix documentation source directory for install
Merge with upstream, fix conflict.
Merge pull request #470 from glehmann/cmake-config
Remove execution bit.
Create .tar.gz package for libraries and documentation using CPack
Make example CMakeList standalone
Merge branch 'devel' into cmake_packaging
Fix missing asset definition.
Package examples
Fix incorrect reference to ArrayFire libraries from FIND script.
Restore example naming convention and output directories.
Add missing includes for stand-alone compilation of examples.
Add copyright to header.
Install example assets along with examples.
Casey Goodlett (1):
Fix googletest build on other cmake build types
Gaëtan Lehmann (4):
fix gtest build with ninja and simplify gtest external project
fix gtest byproducts
install arrayfire cmake configuration and version files
display a warning when the assets can't be found in the source dir
Gallagher Pryor (2):
fix for building gtest on systems w/ svn < v1.8
Merge branch 'test' into 'master'
John Melonakos (1):
Adding the license info for all source files in ArrayFire
Kumar Aatish (1):
Changed OpenCL library search path order
Kyle Lutz (3):
Add setUnique()/setUnion()/setIntersection() for OpenCL
Use OpenCL error codes from Boost.Compute
Reduce Boost.Compute header includes
Miguel Lloreda (1):
fixed formatting
Nathan Jackson (4):
Added mean calculation support for complex types.
Added tests for computing the mean of complex values.
Moved division function into math utilities. Fixed mean function.
Added variance interface and var_all implementation.
Pavan Yalamanchili (649):
Renaming the helper files
API changes for data transfer functions
Create context using DEFAULT instead of CPU for OpenCL backend
Data type changes to make the backends self contained
Merge branch 'testHelper' into 'master'
BUG fix: Fixing CalcBaseStride for greater than 2 dimensions
Merge branch 'cpu_diff1' into 'master'
Merge branch 'moddims'
Fixing build errors from diff branch
Fixing warnings from the new tests
Cleanup of src/backend
Merge branch 'transpose' into 'master'
Fixing warnings in transpose tests
Merge branch 'random'
Fixing constructors for CUDA and OpenCL
Cleaning up CMAKE files to automatically pick up source files
CMAKE Fix: Explicitly state source extensions
Fixing the formula for baseoffset.
FEAT: Reductions for CPU added
Updating the formula to work with negative and strided offsets
Cleaning up diff kernels to have unified functions
Merge branch 'diff_cuda' into 'master'
Merge branch 'cuda_trs' into 'master'
Merge branch 'consistant_headers'
Changed the headers from earlier merges to be consistent as well
Merge branch 'reduce'
Merge branch 'rand'
Adding explicit methods to modify ArrayInfo
Removed all references to af_array inside src/backend/*/.
Make all Array<T> constructors private
FEAT: randu and randn for CUDA implemented
Merge branch 'ocl_fixes' into 'master'
Changing loop iterators to proper type
A better way to handle the template specialization of size_t
Merge remote-tracking branch 'origin/diff_opencl'
BUG Fix: calcStride was accessing out of bounds.
Fixing the bugfix to calcStrides
Separate out reduce and transform functors
Merge remote-tracking branch 'origin/ocl_transpose'
Using dim_type in diff for opencl instead of size_t
Removing trailing whitespaces
Getting rid of .cu files in src/backend/cuda/kernel/
Remove the unnecessary template instantiations
Minor changes post merge to imageio.cpp
Merge remote-tracking branch 'origin/cpu_histogram'
Fixes for tests to compile properly by adding std:: prefix
Style change to fix the compiler warnings on gcc 4.9
Indexing support for CUDA backend
Indexing support for OpenCL
Removing unnecessary print from test/index.cpp
Enabling diff tests and minor fix to work with indexing
Enabling transpose tests and fixes to make transpose pass the tests
Merge remote-tracking branch 'origin/resize'
Merge remote-tracking branch 'origin/cpu_morph'
Merge remote-tracking branch 'origin/cpu_bilateral'
Changing variable name to be consistent with the rest of the file
Moving the functions in cuda/complex.hpp to global namespace
Changing ops.hpp and */backend.hpp to work nicely with NVCC
Removing unnecessary include file
Reductions for CUDA backend
Accum implementation for CPU backend
Merge remote-tracking branch 'origin/info_helpers'
Merge remote-tracking branch 'origin/cuda_morph'
Bug fix to random number generation in CUDA
Adding random number generation support to OpenCL backend
Merge remote-tracking branch 'origin/ocl_morph'
Merge remote-tracking branch 'origin/transform'
Merge remote-tracking branch 'origin/blas'
Merge remote-tracking branch 'origin/cuda_bilateral'
Merge remote-tracking branch 'origin/ocl_bilateral'
Merge branch 'ocl_morph_opt' into 'master'
Merge remote-tracking branch 'origin/cuda_histogram'
Merge remote-tracking branch 'origin/approx'
Merge remote-tracking branch 'origin/ocl_histogram'
Merge remote-tracking branch 'origin/random'
Changing the location of the data repository
Added Param and CParam structs that can be passed to the GPU
Renaming helper functions and functors
Unified print function for all backends
Cleaning up header files
Merge remote-tracking branch 'origin/header'
Merge remote-tracking branch 'origin/master' into unify
Merge remote-tracking branch 'origin/intel_histfix'
Adding Param<T> to the remaining functions in CUDA backend
Merge remote-tracking branch 'origin/unify'
Merge branch 'ocl_dselector' into 'master'
Adding a missing std:: in opencl/platform.cpp
Adding __CL_ENABLE_EXCEPTIONS to the build process
Merge branch 'origin/ocl_kernel_caching'
Merge remote-tracking branch 'origin/tile'
Add caching support for tile in OpenCL
Fixing copy paste error
Introduced a common struct and build function for OpenCL kernels
Changing beta == 0 instead of memsetting C to 0 in gemm
Merge branch 'origin/blas_fix'
Merge remote-tracking branch 'origin/unify'
extended CATCHALL to include Type and Support errors
AfError now supports line numbers and user specified af_errs
Added *_NOT_SUPPORTED macros for each backend
Added macro CUDA_CHECK that checks for cudaError and throws AfError
change cldebug to debug_opencl
Added POST_LAUNCH_CHECK to CUDA backend
Added new error type --> ArgumentError
Changed backend/reduce.cpp to include the new error mechanisms
Changed backend/diff.cpp to use new error checks
Changed backend/morph.cpp to use new error checks
Changed SHOW_CL_ERROR() to CL_TO_AF_ERROR() in opencl backend
Fixing a minor bug for ArgumentError
Fixing the dimension checks for backend/morph.cpp
Fixing the morph tests to check for correct errors
Moving ARG_ASSERT to within try catch blocks
Merge remote-tracking branch 'origin/error'
Cleaning up a couple of lines
buildProgram now accepts multiple source files
added iscplx to backend/opencl/traits.hpp
Reductions backend for OpenCL
Merge branch 'clreduce' into 'master'
Merge remote-tracking branch 'origin/cuda_device'
Merge remote-tracking branch 'origin/matmul_fixes'
Merge remote-tracking branch 'origin/reorder'
Fixing issues with CUDA reductions
Fixing typo in scan tests
Cleaning up reductions code a bit more
Scan algorithm for CUDA implemented
Change to make sure autogenerated string headers are only included once
Style clean up of OpenCL reduction code
Scan algorithm for OpenCL backend
Merge remote-tracking branch 'origin/shift'
Merge remote-tracking branch 'origin/scan'
Merge remote-tracking branch 'origin/gradient'
Merge remote-tracking branch 'origin/cpu_medfilt'
Updating gitignore to include unwanted emacs files
Merge remote-tracking branch 'origin/cuda_medfilt'
Merge remote-tracking branch 'origin/ocl_medfilt'
Bug fix to gradient in CUDA and OpenCL backends
Merge remote-tracking branch 'origin/intel_scan_fix'
Cleaning up buggy strided dimensions in scan for CUDA and OpenCL
Merge branch 'scan_bugfix'
Launch configuration fix for AMD GPUs
Merge remote-tracking branch 'origin/cpu_fft'
Cleaning up backend/scan.cpp to include proper error checks
Adding support for where for CPU backend
Fixing corner cases in scan algorithm for CUDA and OpenCL backends
Exceptions now display file names instead of function names
Added a new function to create Array<T> from Param<T>
Where implemented for CUDA backend
Making the double buffering in OpenCL backend more explicit
Change scan tests to run on OpenCL devices available on the system
Adding support to create Array<T> from Param in OpenCL backend
Changing where in CUDA backend to pass out by reference
Style changes to OpenCL scan function
Tentative support for where in OpenCL backend
Merge remote-tracking branch 'origin/fft'
Modifying FindclFFT.cmake to look in clFFT build directory
Change required to suppress comparison warnings
Removing unnecessary variadic templates
Removing "static" from template specializations
Changes to cuda and OpenCL backends to improve parallel compiles
Merge remote-tracking branch 'origin/compile'
Making sure ret in imageio is initialized before returning
Adding dl libs explicitly to the OpenCL backend
Merge branch 'cpu_where'
Fix header locations to fix compilation in debug mode
Merge remote-tracking branch 'origin/meanshift'
Merge remote-tracking branch 'origin/pad'
Merge remote-tracking branch 'origin/master' into where
BUG: Fixed boundary checks for scan_first in CUDA and OpenCL
Passing Params as references to where_* in OpenCL backend
Merge remote-tracking branch 'origin/where'
Added simple JIT kernel generation for OpenCL backend
Updated the OpenCL backend to have simpler kernel name generation
Kernel compilation and Launching added for OpenCL JIT backend
Reorganizing files in src/backend
Cleaning up jit.cpp
Adding logical functions to the external API
Adding the last few binary functions
Adding cast function to OpenCL JIT backend
Unary functions added to OpenCL JIT backend
Adding new binary functions to OpenCL JIT backend
Adding support for ScalarNodes in OpenCL JIT backend
ndims() now returns at least 1 instead of 0.
Added proper error checking to af_print
Fixing the API of af_cast
Adding cache support for OpenCL JIT kernels
Merge branch 'bug_fixes' into 'master'
Changing the implicit cast behavior to mimic c/c++
Adding CUDA/CPU_NOT_SUPPORTED macros for elementary operations
JIT kernel generation support for OpenCL backend
Removing unnecessary member variables from BufferNode
BUG fix in JIT kernel generation in OpenCL backend
Merge branch 'jit'
Change required to make blas compile on centos 6
Changing the cpu blas to depend on CBLAS instead of blas
Merge remote-tracking branch 'origin/blas_fix'
Fixes for OpenCL backend for gcc 4.7.2
Updating the README.md
Merge remote-tracking branch 'origin/ocl_fix'
Adding FindMKL.cmake to ArrayFire repo
Merge branch 'rotate_fix' into 'master'
Merge branch 'fft_fix' into 'master'
Bug fix in specializations for max<cfloat> and max<cdouble>
Enabling double precision support for JIT kernels
Fixing typo in opencl/jit.cpp
Fixing the initial value for max on complex numbers
Merge branch 'sort' into 'master'
Merge branch 'warning-fixes' into 'master'
Merge branch 'random_fix' into 'master'
Merge branch 'platform_fixes' into 'master'
Update README.md to have better formatting.
Merge branch 'conv' into 'master'
Stripping end of line characters from README.md
Removing unnecessary line from cuda/CMakeLists.txt
First draft of CUDA JIT
Added FindNVVM.cmake
Adding libcuda as a dependency
Changes to make nvvm code to compile and execute
Making the child nodes decide the types when calling functions
Removing untracked folder from the repository
Removing untracked folder from the repository
Adding support for CAST and COMPLEX operations in CUDA backend
Adding back tests in `basic.cpp` for CUDA backend
Merge branch 'sort_split' into 'master'
Merge branch 'bilateral_fixes' into 'master'
Merge branch 'compute_cmake_fix' into 'master'
Merge branch 'subref_assign' into 'master'
Merge remote-tracking branch 'origin/jit'
Element wise support for CPU backend
Merge branch 'header-files' into 'master'
Merge remote-tracking branch 'origin/TNJ'
CPU backend now uses std::shared_ptr for holding data
Using boost::shared_ptr for reference counting in CUDA backend
Renaming files in CUDA backend
Adding support for weak copy in src/backend/*.cpp
Merge branch 'ocl_cmake_changes' into 'master'
Merge branch 'ref'
Bug fix for OpenCL backend when creating empty Arrays
Adding basic functions to the C++ API
Merge branch 'cuda_limit' into 'master'
BUG_FIX: bin2cpp now adds NULL character towards the end of string
bin2cpp now adds newline for CUDA but does not for OpenCL
Merge branch 'regions' into 'master'
Adding the license file to the repo
Merge remote-tracking branch 'origin/master'
Updating README.md to include clone command and fftw dependency
Updating the arrayfire_data repo URL
Updating README.md
Merge pull request #25 from arrayfire/cpp_tests
Merge pull request #28 from arrayfire/nan_inf_fix
Merge pull request #33 from arrayfire/sort_cpp
Merge pull request #31 from arrayfire/cmake_cuda_compute
Merge pull request #35 from arrayfire/api_changes
Merge data.h and reduce.h into algorithm.h
Moving constant, randu and randn into af/data.h
Moving approx1 and approx2 to af/signal.h
Merge pull request #38 from arrayfire/header
Moving important utility functions from data.cpp to handle.hpp
Cleaning up the sort functions
Adding set functions for the CPU backend
Merge pull request #42 from kylelutz/opencl-set
Fixing the iterators for union and intersect in OpenCL backend
Adding set operations for CUDA backend
Reducing the memory footprint for set_intersect
Style changes
Merge pull request #43 from arrayfire/set
Merge pull request #48 from arrayfire/seq
Making arrayfire_data a submodule
Scoping out unimplemented code
Adding af_eval() and array::eval()
Adding af_get_device and af_sync to all backends
Merge pull request #56 from arrayfire/data
Merge pull request #58 from arrayfire/eval
Merge pull request #59 from arrayfire/win_fixes
Merge pull request #64 from arrayfire/timer
Merge pull request #65 from 9prady9/additional_api
Fixing a bug with +=, -=, *=, /=
BUG fix: evaluate array before assignment operator
Cleaning up moddims
Changing enum so it does not clash with functions
Merge pull request #66 from pavanky/misc
Update CONTRIBUTING.md
Merge pull request #69 from arrayfire/devel
Merge pull request #72 from kylelutz/opencl-error-codes
Merge pull request #74 from bkloppenborg/cmake_install
Merge pull request #76 from arrayfire/minor-fixes
Fixing compilation warnings in CPU and CUDA backends
Merge pull request #79 from pavanky/warnings
Merge pull request #75 from arrayfire/devel
Reorganizing helloworld.cpp
Updating README.md
Update README.md
Merge pull request #85 from 9prady9/format_array_print
Merge pull request #87 from arrayfire/devel
Adding new functions to src/backend
Adding new methods to af::array class
Merge pull request #91 from shehzan10/unary-fix
Updating the commit hash of test/data submodule
Destroy temporary variables from binary.cpp
Make sure functions are not being declared more than once in NVVM IR
Fixing a bug in CUDA backend to reset flags properly
Merge pull request #94 from 9prady9/win_cblas_fixes
Merge pull request #96 from arrayfire/devel
Update README.md
Merge pull request #98 from pavanky/readme
Merge pull request #97 from kaatish/ocl_cmake_changes
Merge pull request #99 from mlloreda/patch-1
Fixed formatting
Merge pull request #104 from firemanphil/master
Merge pull request #102 from gcasey/buildfixes
Merge pull request #105 from shehzan10/devel
Merge pull request #112 from shehzan10/cuda_build_fix
Merge branch 'issue_107' of https://github.com/bkloppenborg/arrayfire into devel
Merge pull request #116 from shehzan10/transpose_perf
Fixed bugs in ScalarNode for CUDA and OpenCL JIT backends
Unary math functions convert arrays to floating point arrays
Merge pull request #124 from 9prady9/index_tests
Bug fix for reductions in CUDA backend
Bug fix to random number generation in CUDA backend
Removing deprecated files from the repo
Suppress them warnings
Fixing leaks in CUDA JIT backend
Fixing Leaks in OpenCL JIT backend
Fixing Memory leaks in CPU TNJ
Merge branch 'pavanky/jit_fixes' into bugfixes
Merge pull request #131 from pavanky/bugfixes
Moving src/array to src/frontend/cpp
Adding support for binary functions with scalar inputs
Removed s8. Changed b8 to be of type char
cast to b8 now results in arrays made up of 1s or 0s
Add a debug version of CU_CHECK
updated math operations for all backends
Merge remote-tracking branch 'origin/arith' into devel
Fixing complex function support in arrayfire
Adding data check functions: isNaN, isInf, iszero
Unifying af_constant_c32/c64 into af_constant_complex
Support for global reductions in CPU backend
Merge pull request #150 from bkloppenborg/findarrayfire_fixes
Merge pull request #157 from shehzan10/devel
Support for global reductions in CUDA backend
Global reduction support for OpenCL backend
changing reduce_global --> reduce_all
Merge remote-tracking branch 'pavanky/algos' into devel
Reorganizing the directory structure
Unified the way complex numbers are printed
Merge pull request #163 from shehzan10/transform_linear
Merge pull request #165 from shehzan10/devel
PERF: improvements to random number generation in CPU backend
Wrapping af_get functions in AF_CHECK macro
PERF: improvements to random number generation in CUDA backend
Merge pull request #170 from umar456/devel
PERF: improvements to random number generation in OpenCL backend
Merge pull request #173 from bkloppenborg/FindclFFTImprovements
Merge pull request #174 from shehzan10/devel
Merge pull request #175 from pentschev/fast
Merge pull request #179 from pentschev/fast_return_fix
Merge branch 'devel' into perf
PERF: improvements to CUDA JIT when memory is linear
PERF: improvements to OpenCL JIT when memory is linear
EXAMPLE: Monte Carlo estimation of PI
BUGFIX: in JIT for CUDA backend
Merge branch 'devel' into ocl_win_fixes
correctly adding USE_DOUBLE to OpenCL JIT
Merge pull request #188 from arrayfire/ocl_win_fixes
Fixing the commit id of test/data submodule
Merge pull request #190 from shehzan10/devel
Merge pull request #191 from 9prady9/ocl_dev_sort
PERF: Added memory manager for CUDA backend
PERF: Added memory manager for CPU backend
Merge pull request #192 from shehzan10/join
Merge pull request #200 from shehzan10/imageio_fixes
Merge pull request #202 from shehzan10/devel
Merge pull request #205 from shehzan10/sort_fixes
BUG: Fix in memory manager for CUDA backend with multiple devices
Changing new/delete to malloc/free for CPU backend
Adding variable names for MAX_BUFFERS and MAX_BYTES
Changing Array.data from cl::Buffer to cl::Buffer *
Fixing memory leak inside af_print_array
PERF: Added memory manager for OpenCL backend
Adding C api calls for malloc and free
Changing the error message for pinned memory alloc / free
Merge branch 'devel' into memory
Fixing typo / bug in implicit.cpp
Merge pull request #209 from shehzan10/devel
Merge pull request #219 from shehzan10/devel
PERF: using cuda::mem{Alloc,Free} instead of cuda{Malloc,Free}
PERF: Improvements to reductions in CUDA and OpenCL
BUG: Fixed issues with binary operations with scalar on LHS
Updating math_ptx submodule
Adding abs support for complex numbers
Merge pull request #227 from 9prady9/conv2d_perf_fixes
Merge pull request #229 from shehzan10/devel
Updating CONTRIBUTING.md
Merge pull request #238 from shehzan10/devel
Merge pull request #239 from bkloppenborg/devel
BUG: Fixed issues with atan2 in CUDA and OpenCL backends
TEST: Adding global reduction tests
Changing af::af_cfloat to af::cfloat for C++ API
Properly catching and returning errors from af_sort*
TESTS: Adding tests for math functions
TEST: Adding tests for binary functions
FEAT: Adding support for hypot
TEST: Adding tests for complex binary functions
FEAT: Adding identity function for all backends
BUG: Fixed problem in cast for OpenCL backend
BUG: Fixed a problem when casting complex numbers
FEAT: Adding diag for all backends
BUG: Fixed memory leak in C++ API when doing indexing
SubArrays now contain reference to shared_ptr instead of parent
Merge pull request #252 from shehzan10/devel
Fixing problems with isOwner() in all backends
Adding support for casting seq to array
Default constructor now creates array of size (0,0,0,0)
Minor changes to API
Merge pull request #258 from mcclanahoochie/osx_fixes
BUGFIX: Hotfix for cast in opencl backend
Adding proper checks to tests
Merge pull request #270 from umar456/clean_ocl_morph
Merge pull request #271 from umar456/osx_build
Fixing compilation errors
Merge pull request #276 from shehzan10/devel
adding math constants to ArrayFire
Changing api of few functions to match v2.1
BUG: Fixed issues with metadata while indexing
cleaning up bugs created by previous commit
Remove warnings when running fft in OpenCL backend
Initial commit with gfor support
Merge pull request #285 from shehzan10/devel
Adding dimension checks for cplx2
Binary functions in C API now have batchMode parameter
binaryNode now accepts output dimension size
Adding support for batch mode in all backends
Merge remote-tracking branch 'upstream/devel' into gfor
Adding proper error checking macros to src/api/c/index.cpp
Adding batchFunc support for CPP backend
FEAT: Adding GFOR support with for indexing
EXAMPLE: Adding vectorize example to arrayfire
Changing batchMode to batch
Merge pull request #303 from 9prady9/match_template
Cleaning up error handling in src/api/c/
Adding error messages when necessary for CPP API functions
Adding bounds checks for index and assign
Merge pull request #307 from shehzan10/devel
Merge pull request #311 from shehzan10/devel
Merge pull request #312 from 9prady9/perf_fixes
Merge pull request #317 from shehzan10/devel
Exposing ArrayFire OpenCL internals for interoperability
Fixing compile issues in OSX when using af/opencl.h
BUGFIX: Fixing GFOR bug during assign
Cleaning up the error checking in api/c/binary.cpp
BUGFIX: seq --> array inside GFOR creates batched array
BUGFIX: Fixed OpenCL JIT bug when variables were going out of scope
Fixing typo in ToNum()
Merge pull request #334 from pentschev/devel
Merge pull request #337 from pentschev/fix_windows_cuda_math
Merge pull request #335 from pentschev/devel
Fixing warnings in ORB implementation and tests
BUGFIX: Fixing data access patterns in OpenCL backend for diag
BUGFIX: Fixing data access patterns in OpenCL backend for identity
Fixing commit id for test/data
BUGFIX: dims() now gets dimensions properly after indexing
BUGFIX: Fixing issues with indexing after JIT operation
BUGFIX/FEAT: Adding support for more 4d indexing operations
FEAT: Adding support for negative offsets from end in CPP API
BUGFIX: Fixed memory leak in af_copy_array
Merge branch 'sobel' of https://github.com/9prady9/arrayfire into devel
Merge pull request #344 from pentschev/fix_windows_orb
BUGFIX: Fixing indexing to support reverse indexing
Merge pull request #354 from pentschev/orb_fixes
BUGFIX: Assignment operators now properly implement copy on write
TEST: Adding additional tests for CPP indexing
TEST: Adding new tests for CPP assign operators
FEAT: Added support for bitand, bitor and bitxor for all backends
FEAT: Adding preliminary support for 64 bit integers
FEAT: reorder, transpose, moddims support for 64 bit ints
FEAT: Adding binary function support for 64 bit ints
BUGFIX: for numeric operations on integer types in OpenCL backend
FEAT: CUDA backend support for numerical operations on 64 bit ints
BUGFIX: Enabling mod / rem for integer types
BUGFIX: Changing % to mean remainder instead of modulus
Cleaning up mod and rem for integer types
FEAT: Adding bitshiftl, bitshiftr
TEST: Adding tests for 64 bit ints and bit shift functions
Compile fix for windows
Merge pull request #360 from pentschev/fix_missing_deleter
BUGFIX: Adding target triple for when generating NVVM IR
Merge pull request #366 from pentschev/fix_fast_zerofeat
Merge pull request #368 from shehzan10/devel
Merge pull request #367 from arrayfire/cuda7
Removing math_ptx submodule as a dependency
Fixing dependency issues during ptx generation
Bugfix: fixed improper caching when casting in CUDA backend
Bugfix: fixed improper caching when casting in OpenCL backend
Merge remote-tracking branch 'upstream/devel' into ptxgen
Merge pull request #369 from shehzan10/devel
Changing std::string inputs to be references
Changing DeviceManager in OpenCL backend to use one context per device
Fixing copy paste error in sobel kernels in OpenCL backend
Cleaning up af::info for OpenCL backend
Sanitizing af::array class and constructor
BUG: Fixed problem with JIT caching in CUDA backend
BUG: Fixed problem with JIT caching in OpenCL backend
TEST: Adding preliminary test for JIT
STYLE: Removing unnecessary include files
Renaming tests in test/jit.cpp
Hashing the kernel names for CUDA and OpenCL
BUILD: auto generated PTX files are copied instead of renaming them
Use decimal notation instead of hex for OpenCL JIT names
BUGFIX: Enable double precision support properly in OpenCL backend
BUGFIX: Fixing randu for complex numbers in OpenCL backend
Cleaning up opencl/kernel/random.cl
FEAT: Adding support for randu(.., b8)
Disabling OpenCL CPU and Accelerator support for OSX
Adding skeleton code for indexed min and max
FEAT: Indexed min and max for CPU backend
FEAT: Indexed min and max for CUDA backend
Removing unnecessary files from OpenCL backend
FEAT: Indexed min and max for OpenCL backend
Reorganizing features.cpp
Adding proper checks in src/api/c/gradient.cpp
Bit operations now supported for scalar integers and bools
BUG: Fixed kernel compile issues with ireduce_dim.cl
BUG: Fixed typo in ireduce_dim.cl
STYLE: Fixed typos in test/reduce.cpp
TEST: Adding tests for indexed min and max
Fixing issues with min and max on boolean arrays
Merge pull request #387 from 9prady9/colorspace
Merge pull request #389 from 9prady9/statistics
FEAT: Adding flat for all backends
TEST: Adding tests for flat
Enable scalar(real, imag) in all backends
Changing overloaded createHandle to appropriate function names
Moving AF_THROW(af_init()) inside try/catch blocks
af_constant_complex does not use temporary variables anymore
FEAT: constant(val,...) now accepts val from all types
TEST: Adding tests for constants of various types
Merge pull request #396 from 9prady9/histeq
Merge pull request #395 from shehzan10/devel
FEAT: Adding binary operations for each type
BUGFIX: memcopy kernel was creating indices incorrectly
Adding isLinear() to ArrayInfo
PERF: moddims no longer performs a copy if Input is Linear
Code clean up in FAST and ORB for all backends
Cleaning up the CPP features class
Cleaning up memory.cpp in cuda backend
Reverting a dumb commit I made to the code
Destroy af_array at the end of tests
Changing the internal API
Making assign exception safe
Destroying af_arrays properly in reduce and scan tests
Organizing the examples directory
Adding back examples from arrayfire_examples repo
FEAT: Adding gaussian kernel to all backends
Merge pull request #416 from shehzan10/devel
Merge remote-tracking branch 'upstream/devel' into examples
Merge pull request #408 from 9prady9/perf_conv
Merge remote-tracking branch 'upstream/devel' into examples
Changing the API of separable convolution to match 2.1
Fixing the dimensions of separable convolution
Fixing dim checks for separable convolve in CUDA and OpenCL backends
Fixing convolve example
Fixing the rainfall example
FEAT: Adding "product" for all backends
FEAT: Adding flip for all backends
Enabling commented parts of integer.cpp and monte_carlo_options.cpp
Merge pull request #419 from 9prady9/sep_conv_fixes
Changing the order of dimensions for monte carlo example
Merge branch 'examples' of github.com:arrayfire/arrayfire into examples
BUGFIX: in moddims when input is a jit node
Merge pull request #423 from umar456/docs
Merge pull request #424 from umar456/gtest
Merge pull request #425 from shehzan10/devel
BUGFIX for cascaded indexing.
TEST: Adding cascaded indexing tests
TEST: Adding back commented out tests from flip
Merge pull request #427 from 9prady9/hsv_rgb
Merge branch 'devel' into docs
Merge pull request #428 from 9prady9/colorspace
Fixing path of arrayfire/assets
Build docs when docs is enabled and "make all" is used
Merge pull request #430 from umar456/devel
FEAT: Adding lookup
Adding new instantiations for reductions
STYLE: Making the function "where" more explicit in C API
Changing the dimension checks for index in C API
EXAMPLES: All machine learning examples now compile
BUGFIX: in ArrayIndex aka lookup for CUDA backend
Merge pull request #432 from 9prady9/conv_changes
BUGFIX, EXAMPLE, Fixing a mistake in mnist_common
Merge pull request #433 from 9prady9/ocl_fix
Adding deep belief net example to ArrayFire
Changing neural network example to use batches and epochs
Merge pull request #441 from 9prady9/lookup_fixes
EXAMPLE: Cleaning up DBN and ANN examples
Adding new functions matmulNT, matmulTN, matmulTT
Cleaning up DBN example to use new matmul functions
Adding RBM example for ArrayFire
Merge pull request #449 from shehzan10/devel
PERF: Break large JIT trees into smaller nodes
Fixing test names in complex.cpp
Merge pull request #456 from shehzan10/devel
STYLE: Changing cast operations in all backends
FEAT: filter in convolutions is cast to the accum type
BUILD: Adding /usr/local/include and /usr/include to FindOpenCL
Merge branch 'gtest-ninja' into devel
Merge pull request #472 from umar456/clang
Renaming logit to logistic_regression
BUGFIX: corrected the dimensions passed to gemv for transpose(A)
BUGFIX: var and stdev now use the getFNSD from common.hpp
EXAMPLE: Cleaning up rbm example
Example: Naive bayes example now uses prior probabilities
Merge pull request #475 from bkloppenborg/cmake_install
BUILD: Changes to suppress warnings in tests
Example: clean up logistic regression
Example: Adding comments to naive bayes
Example: Adding new example to demo perceptron
PERF: Making isLinear() only look up to ndims()
PERF: Perform an async copy when data is linear
Merge pull request #491 from pentschev/example_harris
Removing OPENCL_LIBRARIES from CLBLAS_LIBRARIES in FindCLBLAS.cmake
Merge pull request #494 from bkloppenborg/cmake_packaging
Linear indexing now flattens the arrays before the operation
Changing the layout of the documentation
cleaning up the groups structure
Merge pull request #503 from shehzan10/devel
Removing empty file reduce.h
Minor tweaks to blas documentation
Added documentation for reductions
Adding doxygen briefs for image processing functions
Function groups organized
Adding the remaining documentation for functions in algorithm.h
Added documentation for part of arith.h
Merge pull request #507 from glehmann/assets-submodule-msg
DOCS: documentation for statistics.h
DOCS: Adding brief descriptions for all documented functions
DOCS: Adding documentation for remaining functions in image.h
DOCS: Remove src/api/c from header path
DOCS: Fixing code in getting_started
DOCS: Fixing the formatting in image.h
DOCS: Adding documentation for all functions in arith.h
Merge pull request #510 from 9prady9/signal_docs
DOCS: Fixing warnings
DOCS: Adding examples tab to the generated documentation
DOCS: Adding documentation for device.h and array.h
DOCS: Adding documentation for manip_mat in index.h
DOCS: Adding documentation for data.h
DOCS: Fixing documentation errors for arith functions
DOCS: Adding documentation for arith and logical operators in array.h
DOCS: Adding documentation for indexing operations
DOCS: Fixing links in the documentation landing page
DOCS: Adding download links for arrayfire
Peter Andreas Entschev (83):
Bug fix in OpenCL scan for Intel.
Added regions API and CUDA backend.
Added regions CPU backend as not supported.
Added regions OpenCL backend as not supported.
Added unit tests for regions.
Improved regions for CUDA, faster on large regions.
Merge branch 'master' into regions
Added OpenCL implementation of regions.
Added CPU implementation of regions.
Fixed template on CUDA regions.
Fixed limits of double type on CUDA backend.
Minor improvements to CPU regions.
Added enum for regions connectivity type.
Fixed regions unit tests
Added struct af_features to store image features (aka keypoints).
Added features class to manage af_features structs.
Added FAST feature detector frontend.
Added FAST feature detector CPU backend.
Added FAST feature detector CUDA backend.
Added FAST feature detector OpenCL backend.
Added handlers for array type in features class.
Fixed failing abs() call for int/unsigned types on CUDA backend of FAST.
Added test reader for image input with array output.
Added FAST unit tests.
Merge remote-tracking branch 'upstream/devel' into fast
Fixed FAST files to comply with new directory structure.
Fixed data filename on FAST unit test.
Updating test/data submodule
Fixed wrong memory type allocation on OpenCL backend of FAST.
FAST will return (af_)features instead of (af_)features *
Merge pull request #323 from pavanky/ocl
Changed CUDA convolve to avoid issues with constant memory.
Added ORB API.
Added ORB CPU backend.
Added ORB CUDA backend.
Changed thread variable names of some OpenCL functions.
Added ORB OpenCL backend.
Added test helper to read feature/descriptor test data.
Added ORB unit tests.
Added missing STL algorithm include to CUDA math.hpp.
Added pi definition to fix ORB on Windows.
Added check before freeing Gaussian filters in ORB OpenCL backend.
ORB to return empty arrays when no features exist.
Merge pull request #355 from pavanky/index_fixes
Added missing shared_ptr deleter in OpenCL backend.
Added missing destructor for features class.
Fixed FAST C++ API, added proper destructor calls.
Fixed ORB C++ API to properly destroy af_features
Fixed FAST memory leaks on CPU backend
Fixed ORB memory leaks on CPU backend
Fixed FAST memory leaks on CUDA backend
Fixed ORB memory leaks on CUDA backend
Fixed FAST memory leaks on OpenCL backend
Fixed ORB memory leaks on OpenCL backend
Added missing memory deletions on FAST unit test.
Merge branch 'devel' into orb_fixes
Passing argument as reference to features operator=
Renamed feature.cpp to features.cpp to match class name
Fixed FAST CUDA backend case when no features are found
Fixed FAST CPU backend case when no features are found
Added image blur argument to ORB API
Added image blurring to ORB CPU backend
Added image blurring to ORB CUDA backend
Added image blurring to ORB OpenCL backend
Added image blurring argument to ORB unit tests
Improved ORB performance and memory usage on CUDA backend
Improved FAST performance on CUDA backend
Added argument to define length of edge discard in FAST.
Changed the way FAST handles different datatypes internally
Removed cudaMemset from FAST
Added documentation for FAST
Moved FAST description to docs directory.
Fixed FAST edge assertions
Added ORB documentation
Updated test data
Made FAST CPU results match CUDA results
Made FAST OpenCL results match CUDA results
Merge pull request #485 from pavanky/examples
Added Harris corner detector example
Fixed FAST type comparison mismatch warning
Merge pull request #500 from bkloppenborg/cmake_packaging
Merge pull request #504 from 9prady9/TemplateFunction
Merge pull request #509 from pavanky/docs
Pradeep (228):
af_transpose and corresponding unit tests
Style changes in transpose
Invalid arguments unit test for transpose
BUG Fix: af_print in CUDA backend was directly using device pointer.
CUDA backend transpose implementation
Changes to transpose kernel
changes to include all cl kernels in build
type fixes for opencl
af_print implementation for opencl backend
opencl buffer read/write fixes in Array
Added traits specialization for size_t
opencl backend implementation for af_transpose
Macro fix in transpose opencl backend
Reverted dim_type to long long
af_histogram cpu backend implementation
Changed readTests helper function to accept multiple input arrays
cpu implementations for af_[erode|dilate] and af_[erode3d|dilate3d]
Added readImageTests and compareArraysRMSD helpers for unit tests
af_bilateral API and cpu backend implementation
Adding missing namespace qualifiers
exp equation modification in bilateral cpu backend
BugFix: type issue fix in compareArraysRMSD
Disable unit test for int type in bilateral
morph cuda backend
cuda backend implementation for [af_erode3d|af_dilate3d]
erode/dilate unit tests using images
morph cuda kernel optimizations
opencl morph implementation
opencl backend for volumetric morphological ops
cuda backend bilateral
bilateral opencl backend implementation
morph kernel optimization for supported window sizes
histogram cuda backend
histogram opencl backend
Bug Fix in histogram cuda kernel
min call in histogram kernel was ambiguous for intel compilers
opencl device selection feature
Replaced member funcs with friend funcs in opencl::DeviceManager
added opencl kernel caching for transpose
Removed cl.hpp from af/opencl.h
Modified transpose tests to run for all devices for opencl backend
enabled ocl kernel caching in transform
enabled ocl kernel caching for all existing functions
style changes in ocl transpose
unify kernel params changes for opencl morph
Renamed CL_FINISH to CL_DEBUG_FINISH
unify kernel param changes to opencl bilateral
unify kernel param changes to opencl histogram
corrected typo in cpu_err header
Proper error handling added to transpose
Proper error handling added to erode/dilate
added error handling for bilateral
added error handling for histogram
median filter cpu backend and cuda/opencl placeholders
modified symmetric pad equation in medfilt cpu backend
median filter implementation in cuda backend
median filter opencl backend implementation
fft/ifft functions in cpu backend
fft framework changes
fft/ifft cuda backend
fft/ifft opencl backend
meanshift API and cpu backend
meanshift cuda backend implementation
meanshift opencl backend
createPaddedArray optimizations for cuda and opencl backend
BUGFIX: copy kernel
convolve cpu backend
convolve cuda backend
convolve opencl backend
renamed ConvolveBatchKind variables
Changed output array type for bilateral function
subscript assignment feature for cpu, cuda and opencl backends
cmake changes for opencl backend
C++ wrappers for functions, includes a bugfix as well
C++ wrappers for image and indexing functions
Merge branch 'cpp' of ssh://mule/area51/arrayfire into cpp
Bugfix: corrected array handle check in destructor
Added index support
Adding assign operator overloading in CPP
Merge branch 'origin/cpp' to cpp
Fixing copy assignment operator
convenience member functions for array indexing
changed separable convolve cpp API
changed gradient cpp API
additional unit tests for cpp wrapper
Bugfix: af_assign
Moved cpp wrapper functions to appropriate files
regions cpp wrapper
cpp wrapper unit tests
Merge fft & convolve headers
convolve API changes
Added new cpp wrappers for ffts
windows fixes for cuda backend
windows fixes for opencl backend
fix for google test build command
Visual Studio File Grouping for Projects
windows and *nix OS compatibility fixes
windows fixes for cpu backend
boost compute fixes for windows, had to undef min and max macros
Merge branch 'master' into win_fixes
undef min,max macros before boost/compute headers inclusion
Commenting out cpu blas functions temporarily on windows
Additional fixes in cpu backend for windows platform
Additional cpp convenience functions for moddims
added compatibility APIs
Added NOMINMAX definition for windows platform
Removing PIC compiler flag for windows platform
Removing undef min, max as NOMINMAX is added for windows
Merge remote-tracking branch 'origin/master' into ocl_win_fixes
Corrected gtest library path for debug mode
Added missing template specializations for copy
Corrected visual studio link libraries for test build process
Merge remote-tracking branch 'upstream/devel' into ocl_win_fixes
Updated template specializations for copy in cuda/cpu backends
Merge remote-tracking branch 'origin/ocl_win_fixes' into devel
add formatting to array print functions
Windows compatibility changes for BLAS on cpu backend
Merge branch 'devel' into win_cblas_fixes
windows compatibility fixes
style changes in cpu blas functions
typo corrections
changed dim_type typedef to int from long long
Fix for copy function in cpu backend
indexing unit tests for 3d and 4d arrays
bugfix for cuda on windows
changing variables names in reduce kernel for cuda backend
cmake changes for windows MSVC Projects
correcting test data commit number
correcting test data commit number
Changed setContext function scope
added isDoubleSupported func for opencl backend
Added double precision checks in opencl
function to check double precision availability
handle double precision in opencl transpose
adding missing header in testHelper hpp
mean function
added getDevice internal function for opencl
modified buildProgram opencl helper function
moved ocl kernel resources from stack to heap
Moved cl_khr_fp64 extension
Removed cl_khr_fp64 from individual cl files
opencl device sorting
Merge branch 'devel' into statistics
2d convolve performance improvements
feature: indexing array using array
cuda backend for indexing array using array
opencl backend for indexing array using array
cpp wrapper for array based index
Removed indices size check in array-index
using 0 as default for dim to array index cpp wrapper
bugfix: fixes complex types for mean on cuda/opencl backend
Merge branch 'devel' into statistics
Merge branch 'devel' into statistics
feature: match template
cpp wrapper for match template
Corrected typo in median filter opencl kernel wrapper
bugfix: match template cpp unit test
Moved match template c api to apt location
changed shared mem access pattern for conv3d
Removed long long numeric qualifier for constants
perf: minor performance improvements for bilateral
Removed an obsolete condition in af_assign
Merge branch 'devel' into array_idx
perffix: 3d separable convolve
Merge branch 'devel' into statistics
feature: af_sobel_dxdy
af_sobel_dxdy CUDA backend
af_sobel_dxdy OpenCL backend
cpp wrapper for sobel derivatives
Changed c api for sobel operator
Merge branch 'devel' into statistics
Corrected test data hash tag
Multiple func definition fix for arith operations: mod and rem
BUGFIX: added same complex type cast noop
FEATURE: convenience functions for weighted mean
FEATURE: variance
FEATURE: standard deviation
BUGFIX: added static qualifier for helper arithmetic functions
FEATURE: RGB to GRAY and vice versa color space conversion
FEATURE: covariance
Code cleanup for mean, var, stdev
FEATURE: median function
FEATURE: correlation coefficient function
BUGFIX: corrected scalar constant typo in median
type correction in median removes warnings
Merge branch 'devel' into statistics
Code cleanup mean, median, stdev
BUGFIX: windows fix for division helper function
BUGFIX: fixed multiple definition error for unaryName function
FEATURE: histogram equalization for images
BUGFIX: increased filter/mask length for convolve kernels
BUGFIX: modified default normalization factor
PERFFIX: convolution perf improved by 2-4%
PERFFIX: improved 2d convolve perf in cuda by 33%
Renamed separable conv cuda kernel file
Merge branch 'devel' into perf_conv
PERFFIX: improved opencl 2d convolution performance by 4%
modified expand param to default to false for convolution
BUGFIX: 2d separable convolution
FEATURE: hsv to rgb and vice versa conversion functions
FEATURE: colorspace function
Reduced convolution compilation time
BUGFIX: added type check for tests on opencl backend
Adding copyright to examples
namespace fix in machine learning examples
Renamed af_array_index backend files to match new name af_lookup
Documentation for colorspace conversion functions
Documentation for histogram & histequal
Moved repeat function docs content to common location for image.h
Reuse unit tests to write documentation examples
Removed duplicate lines in mean & var tests
BUGFIX: fix in af_mean_all for cdouble type
Removed USE_SYSTEM_GTEST cmake option
BUGFIX: corrected conv2 filter length constant
Documented code related to 'How to add function to ArrayFire' wiki
Style and typo corrections in exampleFunction
Regions documentation and code example
Renamed image processing titles for morph & filters subgroups
Documentation for gaussian kernel functions
Documentation for Sobel Operator functions
Documentation for matchTemplate function
Documentation for medfilt function
Documentation for meanshift & bilateral functions
Documentation for Morphological Operator functions
Documentation for Convolution functions
Documentation for fft & ifft functions
Documentation for approx1 & approx2 functions
Documentation corrections
Pradeep Garigipati (43):
Read me redirection to repository wiki.
Basic contribution guidelines for pull requests
Merge pull request #83 from pavanky/readme
Adding unit tests related info
Merge pull request #146 from pavanky/gtest
Merge pull request #152 from shehzan10/devel
Merge pull request #162 from pavanky/reorg
Merge pull request #207 from pavanky/memory
Merge pull request #222 from shehzan10/devel
Merge pull request #226 from pavanky/cplx
Merge pull request #244 from pavanky/jit_fixes
Merge pull request #248 from shehzan10/devel
Merge pull request #256 from pavanky/iota
Merge pull request #257 from shehzan10/devel
Merge pull request #297 from shehzan10/devel
Merge pull request #315 from bkloppenborg/devel
Merge pull request #330 from pavanky/ocl_jit_fix
Merge pull request #375 from pavanky/jit_fixes
Merge pull request #379 from shehzan10/devel
Merge pull request #380 from shehzan10/devel
Merge pull request #383 from pavanky/clcontext
Merge pull request #384 from pavanky/random
Merge pull request #386 from pavanky/ireduce
Merge pull request #407 from arrayfire/memory
Merge pull request #415 from umar456/cxx_fix
Merge pull request #417 from pavanky/gausskern
Merge pull request #435 from bkloppenborg/warning_fix
Merge pull request #439 from bkloppenborg/array_indexing
Merge pull request #447 from pentschev/improve_orb_perf
Merge pull request #448 from pentschev/improve_fast_perf
Merge pull request #450 from pentschev/fast_edge
Merge pull request #455 from bkloppenborg/remove_unneeded_chars
Merge pull request #460 from bkloppenborg/get_non-zero_dims
Merge pull request #459 from ogreen/MRead
Merge pull request #463 from pavanky/minor_fixes
Merge pull request #462 from shehzan10/devel
Merge pull request #466 from pentschev/doc_fast
Merge pull request #471 from pentschev/doc_orb
Merge pull request #478 from pavanky/bug_fixes
Merge pull request #495 from pentschev/fix_fast_warning
Merge pull request #497 from shehzan10/devel
Merge pull request #502 from bkloppenborg/standalone_examples
Merge pull request #505 from pavanky/docs
Shehzan Mohammed (269):
Added af_diff1 function with cpu backend implementation.
Fixing typo in cuda/opencl placeholders for diff1
Added af_diff2 function with cpu backend implementation.
Added randu and randn functions to cpu
Added AF_<backend> definitions to test
Added CUDA backend for diff1 and diff2
Optimized diff to use just two kernels
Change launch configuration when inputs are just vectors
Added OpenCL backend for diff1 and diff2
Fixing ostream << operator for uchar to print numbers
Added image IO functions to all backends (code is independent of backend)
Removed flags from CMAKE for cuda build
Using static channel_split in imageio
Created image.h header file
Added CPU backend for resize
Added CUDA backend for resize
Added OpenCL backend for resize
Merge branch 'master' into resize
Updated OpenCL and CPU with offset changes
Updated diff CPU with offset changes
Code cleanup for Resize (all backends)
Large tests for resize. Minor type fixes for resize.
Added Transform and Rotate for CPU, CUDA and OpenCL backends
Merge branch 'master' into transform
Merge branch 'master' into transform
Added wrappers for translate, scale and skew. Added tests for rotate
Added helper functions to ArrayInfo
Using failure count for rotate tests. Minor type corrections in resize.
Added == and != operators for dim4
Added base_type to traits
Added Approx1 and Approx2 to all backends
Kernel code cleanup for Approx1,2 linear interp
Make random test deterministic
Using Params in cuda kernels
Added tile to CPU, CUDA, OpenCL backends
Performance improvement to tile in CUDA, OpenCL
Fix buildProgram multiple definition error
Unified kernel arguments for approx
Unified kernel arguments for diff
Unified kernel arguments for resize
Unified kernel arguments for transform
Unified kernel arguments for tile
Added error framework to approx
Added error framework to diff
Added error framework to resize
Added error framework to tile
Added error framework to transform
Device management for CUDA
Moved dimension checks for matmul to src/backend
Added reorder to all 3 backends
Added circular shift to all backends
Create af_create_handle wrapper for createEmptyArray
Added gradient to all backends
Added empty wrappers for sort
Added CUDA and OpenCL backends for Sort on dim0
Added multi-dimensional support for sort on dim 0
Fixed cudaGetDriverVersion for Mac and ARM
BUGFIX const correctness in ArrayInfo functions
Separated sort into sort (only values) and sort_index (values and indices)
Merge remote-tracking branch 'origin/master' into sort
Increase tolerance for rotate test
API change for sort
Fix for FindCBlas.cmake on debian based OS
Fixing blas uninitialized warnings
Optimizing loops in sort
Added sort_by_key to all backends
Merge remote-tracking branch 'origin/master' into sort
Added Boost 1.48 requirement to OpenCL CMake (for compute)
Fixing warnings for ostream in opencl/kernel
BUGFIX for cuda random number generation on multiple devices
Update for device manager for OpenCL and CPU
Added test to print af_info
Fix for erroring out if Boost.Compute not found for OpenCL
Split sort* functions into separate files
Updated README.md
Separated functions into header files
Disable large sort tests
Added af_copy_array function for deep copy
Added helper functions to c++ wrapper
Added implementation for af::print
Added examples folder
Merge branch 'cpp' of mule:area51/arrayfire into cpp
Merge remote-tracking branch 'origin/master' into cpp
Added CPP wrappers for blas, device, data, reduce header files
Added Version.h functionality. Updated info()
Added c-api wrapper for weak copy
Updated operator= for array class
Added constant function to create complex arrays
Added wrapper for unary and binary operations
Added operator overloading for arithmetic and relational operations
Moved operator overloading to src/array/arith.cpp
Fixed af_print to print regular array (not transposed)
Added iota function to C and C++ API
Merge branch 'cpp' into 'master'
Fix for isnan, isinf compilation error
Overload sort C++ API
Change dim4 from struct to class
Added cmake code for CUDA Compute variables
Removed usage of ArrayInfo from C++ API
Merge pull request #40 from arrayfire/info_cpp_fixes
OSX Compilation Fixes
Using *operator instead of pow
Merge pull request #41 from arrayfire/osx_fixes
BUGFIX Fix sort on CPU
Added seq class for C++ API
Updating helloworld example
Merge pull request #44 from bkloppenborg/cuda_6_0_compile_fix
Make FreeImage library optional
Change FREEIMAGE_FOUND to WITH_FREEIMAGE
Added AF_ERR_NOT_CONFIGURED, added to imageio
Merge pull request #49 from arrayfire/freeimage-optional
Upgrading C++11 flag from 0x to 11
Added #if __cplusplus around utils
BUGFIX op= in array class
Added timing code
Fixed CPU random generator
Merge pull request #73 from kylelutz/reduce-header-includes
Fixed whitespaces, unsigned warning, removed printf
Merge pull request #80 from bkloppenborg/cmake_additional_install
Merge pull request #82 from bkloppenborg/cmake_additional_install
Update README.md with Jenkins Build tags
Updated README.md
Choose CUDA default device using AF_CUDA_DEFAULT_DEVICE
Choose OPENCL default device using AF_OPENCL_DEFAULT_DEVICE
Merge pull request #89 from arrayfire/default_device
Added round, floor, ceil and abs instances to CPU
Commit for CUDA PTX update fix for abs
Merge pull request #90 from pavanky/device
Merge pull request #93 from pavanky/bugfixes
Merge pull request #95 from pavanky/bugfixes
Merge pull request #100 from 9prady9/readme
Remove use of cmake variable CUDA_DRIVER_LIBRARY
Merge pull request #110 from bkloppenborg/add_cmake_find_script
Added option to not run CUDACmputeCheck.cmake
Formatting changes in cuda/CMakeLists.txt
Removing -DWINDOWS_REMOTE and just checking CUDA_COMPUTE_CAPABILITY
Improved performance of transpose
Updated README.md with windows build tag
Update README.md
Remove cmake whitespace warning
Merge pull request #121 from 9prady9/ocl_perf_fix
Update README.md
Merge pull request #123 from pavanky/jit_fixes
Updated README.md
Increase tolerance for approx tests
Compilation fix for null pointer in constructor
Fix for get device ptr
Merge pull request #147 from pentschev/regions_tests_fix
Update submodules
Added deviceprop functionality to CUDA and OpenCL.
Changed REVISION to AF_REVISION
Merge pull request #153 from arrayfire/update_test_submodule
Update submodules
Added deviceprop functionality to CUDA and OpenCL.
Changed REVISION to AF_REVISION
Merge pull request #154 from pavanky/arith
Calling af_init to initialize contexts
Merge pull request #158 from bkloppenborg/platform_format_fix
Added linear interpolation for transform (adds for rotate, scale, skew etc)
Added tests for bilinear transforms (rotate)
Fixes for rotate bilinear tests
Removed device selection from transpose test
Use & operator for array in resize definition
Merge pull request #177 from pentschev/fast_opencl_fix
Merge pull request #180 from pavanky/perf
Fix cmake error caused by same filename in test and example
PERF Improvements to transform and rotate
Wrappers for resize
Add missing license headers to transform_interp
Added tests for iota
Added placeholders for join
Added CPU backend for join
Added CUDA backend for join
Added OpenCL backend for join
Added tests and linking test data for join
Better error handling in ImageIO
PERF Batching + Blocks images in rotate and transform
Fixed tests for imageio after error changes
Split sort_by_key instantiation into multiple files
Changed dir to isAscending in all sort functions
Change CMAKE_SOURCE_DIR/common to CMAKE_MODULE_PATH
Moved common to CMakeModules
Default arguments for medfilt
Change int to dim_type in cuda transpose
PERF Improvement to transpose opencl
Added conjugate option to transpose
Added transpose .T() and conjugate transpose .H() to array class
Changed minor version from .200 to .beta
Merge pull request #220 from pavanky/perf
Adding noDoubleTests condition to all tests
Compilation fix for cpu complex.hpp
Added pinned memory functionality
Added warning messages for when submodules are not cloned
Merge pull request #240 from pavanky/atan2
Merge pull request #241 from pavanky/test
Merge pull request #242 from pavanky/hypot
BUGFIX Fixed memory leak in image io, performance improvements
Merge pull request #246 from pavanky/new_funcs
Added dim checks for binary element wise ops
BUGFIX Fix segfault in copy array
Merge branch 'memory' of github.com:pavanky/arrayfire into devel
Fixed references to shared_ptr for cpu and opencl backend
Fixed compilation fix for identity
Correcting typo in FindFreeImage
Added memory manager for pinned memory
Using pinned memory in imageio
Changed API for iota
BUGFIX Fixed index-based array operators
Removing boost chrono required from opencl
Merge pull request #277 from pavanky/constants
Merge pull request #278 from pavanky/api
Merge pull request #279 from umar456/clang_warn
Merge pull request #283 from pavanky/minor_fixes
BUGFIX Reorder condition fix
BUGFIX for generating array from seq using negative step
Merge pull request #299 from pavanky/gfor
Compilation fix for Windows
Merge pull request #304 from pavanky/bug_fixes
Call init from pinned memory and load image
Added 4th dimension support to resize
Changed tiling block from y to x in rotate kernels
Added 4th dimension support to rotate
Merge pull request #310 from 9prady9/array_idx
Added 4th dimension support to transpose
BUGFIX in set device for cuda
BUGFIX Fixes seq generation for positive numbers
Merge branch 'devel' of github.com:arrayfire/arrayfire into devel
Merge pull request #316 from bkloppenborg/devel
Split sort_by_key instantiation into multiple files for CUDA backend
Moved sort_by_key instantiation files into directory
Merge pull request #345 from pavanky/more_fixes
Compilation fix for windows
Fixed clBLAS/clFFT libs install for Windows and OSX
Merge pull request #371 from pavanky/fixes
Updated README.md
Fix windows compile issue for opencl context handling
BUGFIX Fixed sobel output types
Sobel returns int for integer types instead of float
Merge pull request #382 from pavanky/ocl_build
Fixed backend API for join
Merge pull request #392 from pavanky/flat
Disable key testing for sort_index and sort_by_key for OpenCL
Merge pull request #399 from pavanky/copy_fixes
Merge pull request #401 from 9prady9/conv_fixes
Add pragma once to copy.hpp
Better backend API for iota (allow default argument for reps)
BUGFIX seq ops
Changed DIRECTORY to PATH in examples/CMakeList.txt
Add warning for not cloning gtest submodule
Remove GIT_SUBMODULES from build_gtest. Not supported on older Cmake
Merge pull request #438 from umar456/double_test
BUGFIX for indexing after JIT ops
Fixes to save image
Merge pull request #452 from pavanky/jit_fixes
Fixes and code optimization to join
Merge pull request #458 from 9prady9/win_fix
API Change iota to range
Update test data for orb
Merge pull request #465 from pentschev/remove_fast_memset
BUGFIX Fix windows is_same ambiguity
Merge pull request #481 from pentschev/match_fast_results
Merge pull request #490 from umar456/mnist
API Change order of data that range generates
Added operators for dim4
Added new functionality iota
Removing test/range. Will add back when corrected for new functionality
BUGFIX Fix offsets and strides when using moddims
Merge pull request #506 from bkloppenborg/example_fix
Merge pull request #511 from umar456/assets
Umar Arshad (101):
Initial Commit
FIX: Don't build test files if BUILD_TEST is OFF
Removed MESSAGE from opencl CMakeLists.txt
Fixed gcc47 parsing error
Removed tagged struct initialization to support older compilers
Fixed #1: Building OpenCL library fails on Linux
Fix #16: Patch command need to be used instead of svn patch
Merge branch '16' into 'master'
Merge branch 'data' into 'master'
Automatic download of arrayfire test data
forgot GetTestData.cmale
Only run tests whose binaries are being built
Fixed errors which popped up on Ubuntu related to pthreads
Merge branch 'ubuntu' into 'master'
Merge branch 'datatypes' into 'master'
Merge branch 'test_definition' into 'master'
Consistent naming for headers in src/backend
Cleanup rand functions. Remove macros
Instantiate Array destructors
Merge branch 'master' into rand
Cleanup. Improve readability in random.
Merge branch 'ocl_cmake' into 'master'
Merge branch 'random' into 'master'
Fix various things to get clang working on OSX
Merge branch 'master' into clang_fix
Compile on Linux
Replace operator overload with ToNum.
Merge branch 'fix_print_uchar' into 'master'
Merge branch 'simple_index' into 'master'
CPU GEMM, GEMV, and DOT
Tests for GEMM and GEMV
CUDA GEMM, GEMV and DOT
Merge branch 'cuda_reduce' into 'master'
Merge branch 'cpuscan' into 'master'
Merge branch 'testHelper_fix' into 'master'
BLAS on OpenCL using clBLAS library.
Cleanup CMake files(i.e. remove messages)
Changes to compile on Linux. Fix warnings on g++.
Merge branch 'clrand' into 'master'
A more robust FindCLBLAS.cmake file
Removed unnecessary code from tests.
Merge branch 'master' into blas
Formatting changes. Fix leak in CPU. Enable dot
Initial Error commit
Updated interface for error checking.
Created a common exception handling format
Merge pull request #140 from 9prady9/cuda_fix
Merge pull request #142 from 9prady9/msvc_filters
Fixes #167: Check if driver is unloaded when freeing array
Fix warnings in CPU backend on clang
Added the __ANSI_STRICT definition for OSX
Compile using g++ on OSX
Remove unnecessary instantiations of morph in OpenCL
Merge pull request #267 from mlloreda/macosx_rpath_fix
Fix g++ warnings on OSX
Fix linker warning on OSX
Fixed warning due to if/switch
Merge branch 'clang_fixes' into clang_warn
Merge pull request #327 from pavanky/bugfix
Merge pull request #328 from pavanky/gfor_seq
Merge pull request #340 from pavanky/bug_fixes
Merge pull request #356 from pavanky/math_funcs
Merge pull request #361 from pentschev/orb_fixes
Merge pull request #359 from jramapuram/devel
Merge pull request #370 from arrayfire/ptxgen
Initial documentation for ArrayFire 3.0
Simple description for array constructor and BLAS
Merge pull request #390 from shehzan10/devel
Fixed incorrect use of std::map::erase in OpenCL
Merge pull request #398 from pavanky/64bit
Merge pull request #402 from pavanky/cleanup
Merge pull request #405 from 9prady9/fft_fix
Remove C++11 conditional from src/api/c.
Moved assets folder/submodule to the root dir.
Added basic C interface functions.
Merge branch 'devel' into docs
Merge pull request #422 from arrayfire/examples
Use the chromium repository to build gtest.
Merge pull request #426 from pavanky/bug_fixes
Fix unused variable warnings in convolve_separable
Merge pull request #434 from pavanky/examples
Double precision checks in testing
Merge pull request #440 from pavanky/dbn
Merge pull request #442 from pentschev/orb_blurring
Merge pull request #446 from pavanky/rbm
Merge pull request #457 from pentschev/change_fast_datatype_internals
Merge pull request #443 from 9prady9/cspace_hist_docs
Faster DBN convergence. Test updates
Merge branch 'devel' of github.com:arrayfire/arrayfire into dbn_rand
Merge pull request #467 from pentschev/fix_fast_edge_assert
Make tests C++03 compliant.
Merge branch 'devel' into clang
Remove additional c++11 features from test
Make tests using libstdc++ for clang builds on OSX
Removed messages and unnecessary functions
Merge pull request #484 from glehmann/fix-gtest-byproducts
Merge pull request #489 from pavanky/perf
Added ASCII display function for MNIST.
Merge pull request #496 from pavanky/dox
Merge pull request #508 from 9prady9/image_docs
Fix ASSET_DIR path
chris (3):
opencl build program fixes for osx
optional double support for jit.cl
fix double support in jit.cl
easuter (2):
Use "Subversion_SVN_EXECUTABLE" explicitly instead of "svn patch".
Correction to commit e7ab4f6
firemanphil (1):
Fix "Could not read from remote repository" issue
greenman (2):
Fixed typo
Modified readme file with additional ArrayFire contact info
jramapuram (3):
Update CMakeLists.txt
allow for cuda7+ and backwards changes
comma to space
mlloreda (9):
Initial 1D sort implementation for cpu.
Sorting across first dimension
Merged origin sort to local
no more async; non-global indexing
test/sort.cpp: test on working types
src/backend/sort.cpp: disable s8 for sort
src/backend/sort.cpp: modify dim assertion for 2D
test/sort.cpp: 2D test
CMakeLists.txt: Update OSX RPath settings
ogreen (2):
Update README.md
Update README.md
orbitcowboy (1):
Fixed potential memory leaks. Each array allocated with new [n] must be deallocated using delete [].
pavan at arrayfire.com (3):
BUGFIX: Buffer nodes from subArrays now use the parent ptr and offsets
Tests use google test present on system if BUILD_GTEST=OFF
Adding a new option to use system GTEST
pradeep (3):
Moved helper functions to common source file to enable reuse
af_moddims function
Bugfix for opencl resource cleanup on windows
-----------------------------------------------------------------------
No new revisions were added by this update.
--
Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/debian-science/packages/arrayfire.git