[clblas] branch master updated (d16f7b3 -> 1f3de2a)
Ghislain Vaillant
ghisvail-guest at moszumanska.debian.org
Tue Jan 24 23:30:49 UTC 2017
This is an automated email from the git hooks/post-receive script.
ghisvail-guest pushed a change to branch master
in repository clblas.
from d16f7b3 Merge pull request #210 from TimmyLiu/master
new 1af16a8 Fixing issue with beta == 0 in AutoGemm kernels
new 969b5c6 Fixing integer divides to make clBLAS work when building with python3
new c41cc5d Trtri kernel build options were hard coded to 2.0
new c355d02 Merge pull request #202 from arrayfire/arrayfire-release-test
new 27653ce Merge pull request #209 from TimmyLiu/develop
new de196fe bump develop version number to 2.11.0
new a649bde avoid removing userGemmClKernels.cc with make clean
new aa637b6 Merge pull request #213 from TimmyLiu/develop
new 6041a3a fix some exception hanlers. now test-functional all pass
new 87bab9e Merge pull request #214 from TimmyLiu/develop
new 5ac6253 fix a hard coding bug
new 7385f68 put the numQueues to be 1
new 3ec45fd fix a bug in gflops count
new 627c654 Fixing issue with beta == 0 in UserGemm kernels
new 9c66a77 Fixing issues for when Beta == 0 in sgemm special cases
new d32081a Fix barriers in dtrsm specialized kernels
new ae53114 Merge pull request #221 from arrayfire/arrayfire-release-test
new 9d4c312 Protect pragma in preprocessor macro by using _Pragma. clang 3.7 will not allow compilation of the code otherwise (found on FreeBSD-CURRENT).
new e3c306a Merge pull request #216 from iotamudelta/develop
new c716d40 Only use the -m32 or -m64 compiler flags on x86.
new af30f4c Merge pull request #222 from dividiti/arm-support
new 1f63b34 Update .travis.yml and appveyor.yml
new 1c5ba46 Merge pull request #234 from haahh/update-travis
new 3e2c826 proposed fix for gemm thread safety; using thread-local storage for kernel map using pre-C++-11 syntax
new 1ab9efd re-submit after CI fix; removing dummy whitespace
new 2b56167 TLS for gcc 4.6
new 02cf387 fixing duplicate include and removing TODO note
new 7a74778 compiling kernels is now thread safe; not using global cl_kernel objects
new c590881 thread safety: no longer using global cl_kernel objects. thread safety is fixed pending customer verification
new 15548cf Merge pull request #235 from guacamoleo/develop
new ed8ee7e fix the compilation bug about c(z)dotc_
new c5d141d Merge pull request #242 from tingxingdong/test
new 2465662 fix the header accordingly
new 075dd8f Merge pull request #244 from tingxingdong/test
new 8491085 fixed compareMatrices to use GTEST_FLOAT_EQ
new 6df2f99 Merge pull request #245 from guacamoleo/develop
new 184bb07 fix error with missing KernelName variable
new 90ad0a2 Merge pull request #249 from hughperkins/missing-kernelname
new d103fee Add .pyc files to .gitignore
new f7c076b Merge pull request #254 from hughperkins/gitignore-add-pyc
new be56a61 Revert "fixed compareMatrices to use GTEST_FLOAT_EQ"
new 1e86e34 Adding detection for boost 1.60
new 5c0d759 Removed ::cerr wrt calling reference and clblas
new ac1854d Removing the printing of unit test parameters
new 2283077 Device selection for test-correctness and test-functional
new f682e98 Merge pull request #258 from kknox/unit-test-improvements
new da0fd1b Make installing source tree optional
new 7911b0e Merge pull request #252 from hughperkins/optional-install-src
new e0df18b Removing the pedantic flag from gcc compiles
new 96cae21 Commenting out further #pragma warning messages
new 162e779 Merge pull request #264 from kknox/fix-warnings
new d20977e Support for altivec on powerpc64 P8 systems (#262)
new 00a29c6 allow users to easily verify the gemm/trmm GPU results with the netlib cblas through client (#274)
new 8028868 fix #265 - spelling errors in comments and print statements (#276)
new c464ab9 Disable clang error on narrowing conversions.
new 11b0270 Merge pull request #281 from IvanVergiliev/narrow-conversions
new 1775a50 Point the CONTRIBUTING wiki links to the correct repository
new 9279831 Merge pull request #282 from IvanVergiliev/fix-wiki-links-develop
new 0a8a4fa add missing dependency to pthread (using rwlock functions)
new b96c1a0 Merge pull request #283 from exmakhina/develop
new 0e0c95c x offset stored in offb, not offa, determines vectorization
new 53d25ef syr2: Y uses incy, not incx
new 199b7c0 Merge pull request #290 from mgates3/develop
new 69d38d9 Adding additional trsm samples
new a71aa63 Bump version to 2.12.0
new b567cd4 Update README with release notes
new 88afc1d Merge branch 'master' into 2.12
new 1f3de2a Merge pull request #295 from kknox/2.12
The 68 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails. The revisions
listed as "adds" were already present in the repository and have only
been added to this reference.
Summary of changes:
.gitignore | 4 +-
.travis.yml | 16 +-
CONTRIBUTING.md | 4 +-
README.md | 19 +-
appveyor.yml | 17 +-
src/CMakeLists.txt | 79 +-
src/FindNetlib.cmake | 19 +
src/FindOpenCL.cmake | 143 ++-
src/clBLAS.h | 2 +-
src/client/CMakeLists.txt | 9 +-
src/client/clfunc_common.hpp | 100 ++-
src/client/clfunc_xgemm.hpp | 579 +++++++-----
src/client/clfunc_xsyrk.hpp | 6 +-
src/client/clfunc_xtrmm.hpp | 266 ++++--
src/client/client.cpp | 967 +++++++++++----------
src/include/defbool.h | 5 +
src/include/kerngen.h | 2 +-
src/library/CMakeLists.txt | 58 +-
src/library/blas/AutoGemm/Includes.py | 8 +-
src/library/blas/AutoGemm/KernelOpenCL.py | 12 +-
src/library/blas/AutoGemm/KernelParameters.py | 4 +-
.../dgemm_Col_NN_B0_MX048_NX048_KX08_src.cpp | 4 +-
.../dgemm_Col_NN_B1_MX048_NX048_KX08_src.cpp | 4 +-
.../dgemm_Col_NT_B0_MX048_NX048_KX08_src.cpp | 4 +-
.../dgemm_Col_NT_B1_MX048_NX048_KX08_src.cpp | 4 +-
.../dgemm_Col_TN_B0_MX048_NX048_KX08_src.cpp | 4 +-
.../dgemm_Col_TN_B1_MX048_NX048_KX08_src.cpp | 4 +-
.../sgemm_Col_NN_B0_MX032_NX032_KX16_src.cpp | 2 +-
.../sgemm_Col_NN_B0_MX064_NX064_KX16_src.cpp | 2 +-
.../sgemm_Col_NN_B0_MX096_NX096_KX16_src.cpp | 74 +-
...sgemm_Col_NN_B1_MX032_NX032_KX16_BRANCH_src.cpp | 72 +-
.../sgemm_Col_NN_B1_MX032_NX032_KX16_src.cpp | 4 +-
.../sgemm_Col_NN_B1_MX064_NX064_KX16_src.cpp | 2 +-
.../sgemm_Col_NN_B1_MX096_NX096_KX16_src.cpp | 2 +-
.../sgemm_Col_NT_B0_MX032_NX032_KX16_src.cpp | 2 +-
.../sgemm_Col_NT_B0_MX064_NX064_KX16_src.cpp | 2 +-
.../sgemm_Col_NT_B0_MX096_NX096_KX16_src.cpp | 74 +-
...sgemm_Col_NT_B1_MX032_NX032_KX16_BRANCH_src.cpp | 70 +-
...sgemm_Col_NT_B1_MX032_NX032_KX16_SINGLE_src.cpp | 35 +-
.../sgemm_Col_NT_B1_MX032_NX032_KX16_src.cpp | 2 +-
.../sgemm_Col_NT_B1_MX032_NX064_KX16_ROW_src.cpp | 22 +-
.../sgemm_Col_NT_B1_MX064_NX032_KX16_COL_src.cpp | 20 +-
.../sgemm_Col_NT_B1_MX064_NX064_KX16_src.cpp | 2 +-
.../sgemm_Col_NT_B1_MX096_NX096_KX16_src.cpp | 2 +-
.../sgemm_Col_NT_B1_MX128_NX128_KX16_src.cpp | 6 +-
.../sgemm_Col_TN_B0_MX032_NX032_KX16_src.cpp | 2 +-
.../sgemm_Col_TN_B0_MX064_NX064_KX16_src.cpp | 2 +-
.../sgemm_Col_TN_B0_MX096_NX096_KX16_src.cpp | 74 +-
...sgemm_Col_TN_B1_MX032_NX032_KX16_BRANCH_src.cpp | 60 +-
.../sgemm_Col_TN_B1_MX032_NX032_KX16_src.cpp | 2 +-
.../sgemm_Col_TN_B1_MX064_NX064_KX16_src.cpp | 2 +-
.../sgemm_Col_TN_B1_MX096_NX096_KX16_src.cpp | 2 +-
src/library/blas/functor/functor_xscal.cc | 2 +-
src/library/blas/generic/solution_seq_make.c | 6 +-
src/library/blas/gens/asum.cpp | 6 +-
src/library/blas/gens/axpy_reg.cpp | 8 +-
src/library/blas/gens/clTemplates/gemm.cl | 8 +-
src/library/blas/gens/clTemplates/her2.cl | 8 +-
src/library/blas/gens/clTemplates/symm.cl | 36 +-
src/library/blas/gens/clTemplates/syr2.cl | 6 +-
src/library/blas/gens/clTemplates/syr2_her2.cl | 8 +-
src/library/blas/gens/clTemplates/trmv.cl | 4 +-
src/library/blas/gens/copy_reg.cpp | 8 +-
src/library/blas/gens/dot.cpp | 8 +-
src/library/blas/gens/gbmv.cpp | 4 +-
src/library/blas/gens/gemv.c | 10 +-
src/library/blas/gens/ger_lds.cpp | 4 +-
src/library/blas/gens/her2_lds.cpp | 4 +-
src/library/blas/gens/her_lds.cpp | 2 +-
src/library/blas/gens/iamax.cpp | 8 +-
src/library/blas/gens/kprintf.cpp | 4 +-
src/library/blas/gens/nrm2.cpp | 6 +-
src/library/blas/gens/rotm_reg.cpp | 8 +-
src/library/blas/gens/scal_reg.cpp | 4 +-
src/library/blas/gens/swap_reg.cpp | 8 +-
src/library/blas/gens/symv.c | 10 +-
src/library/blas/gens/syr2_lds.cpp | 4 +-
src/library/blas/gens/syr_lds.cpp | 2 +-
src/library/blas/gens/trmm.c | 2 +-
src/library/blas/gens/trmv_reg.cpp | 4 +-
src/library/blas/gens/trsv_gemv.cpp | 4 +-
src/library/blas/gens/trsv_trtri.cpp | 2 +-
src/library/blas/include/clblas-internal.h | 2 +-
src/library/blas/include/kprintf.hpp | 2 +-
src/library/blas/ixamax.c | 2 +-
src/library/blas/trtri/TrtriKernelSourceIncludes.h | 6 +-
.../blas/trtri/diag_dtrtri_lower_128_16.cpp | 3 +-
.../blas/trtri/diag_dtrtri_upper_128_16.cpp | 17 +-
.../blas/trtri/diag_dtrtri_upper_192_12.cpp | 11 +-
.../trtri/triple_dgemm_update_128_16_PART1_L.cpp | 9 +-
.../trtri/triple_dgemm_update_128_16_PART2_L.cpp | 1 -
.../blas/trtri/triple_dgemm_update_128_16_R.cpp | 11 +-
.../trtri/triple_dgemm_update_128_32_PART1_L.cpp | 7 +-
.../trtri/triple_dgemm_update_128_32_PART1_R.cpp | 7 +-
.../trtri/triple_dgemm_update_128_32_PART2_L.cpp | 1 -
.../trtri/triple_dgemm_update_128_32_PART2_R.cpp | 1 -
.../trtri/triple_dgemm_update_128_64_PART1_L.cpp | 7 +-
.../trtri/triple_dgemm_update_128_64_PART1_R.cpp | 5 +-
.../trtri/triple_dgemm_update_128_64_PART2_L.cpp | 1 -
.../trtri/triple_dgemm_update_128_64_PART2_R.cpp | 1 -
.../triple_dgemm_update_128_ABOVE64_PART1_L.cpp | 7 +-
.../triple_dgemm_update_128_ABOVE64_PART1_R.cpp | 5 +-
.../triple_dgemm_update_128_ABOVE64_PART2_L.cpp | 1 -
.../triple_dgemm_update_128_ABOVE64_PART2_R.cpp | 1 -
.../triple_dgemm_update_128_ABOVE64_PART3_L.cpp | 1 -
.../triple_dgemm_update_128_ABOVE64_PART3_R.cpp | 1 -
.../blas/trtri/triple_dgemm_update_192_12_R.cpp | 5 +-
.../trtri/triple_dgemm_update_192_24_PART1_R.cpp | 1 -
.../trtri/triple_dgemm_update_192_24_PART2_R.cpp | 1 -
.../trtri/triple_dgemm_update_192_48_PART1_R.cpp | 3 +-
.../trtri/triple_dgemm_update_192_48_PART2_R.cpp | 1 -
.../trtri/triple_dgemm_update_192_96_PART1_R.cpp | 3 +-
.../trtri/triple_dgemm_update_192_96_PART2_R.cpp | 1 -
src/library/blas/xasum.c | 2 +-
src/library/blas/xaxpy.c | 4 +-
src/library/blas/xcopy.c | 4 +-
src/library/blas/xdot.c | 4 +-
src/library/blas/xgbmv.c | 4 +-
src/library/blas/xgemm.cc | 275 ++++--
src/library/blas/xgemv.c | 4 +-
src/library/blas/xger.c | 4 +-
src/library/blas/xhemv.c | 4 +-
src/library/blas/xher.c | 2 +-
src/library/blas/xher2.c | 8 +-
src/library/blas/xhpmv.c | 4 +-
src/library/blas/xnrm2.c | 2 +-
src/library/blas/xrot.c | 4 +-
src/library/blas/xrotm.c | 4 +-
src/library/blas/xscal.c | 2 +-
src/library/blas/xshbmv.c | 4 +-
src/library/blas/xspmv.c | 4 +-
src/library/blas/xswap.c | 4 +-
src/library/blas/xsymv.c | 4 +-
src/library/blas/xsyr.c | 2 +-
src/library/blas/xsyr2.c | 4 +-
src/library/blas/xtbmv.c | 4 +-
src/library/blas/xtbsv.c | 44 +-
src/library/blas/xtrmv.c | 4 +-
src/library/blas/xtrsm.cc | 7 +-
src/library/blas/xtrsv.c | 6 +-
src/library/common/kerngen_core.c | 2 +-
src/library/tools/ktest/config.cpp | 4 +-
src/library/tools/ktest/steps/gemv.cpp | 12 +-
src/library/tools/ktest/steps/symv.cpp | 12 +-
src/samples/CMakeLists.txt | 34 +-
src/samples/{example_strsm.c => example_ctrsm.c} | 38 +-
src/samples/{example_strsm.c => example_strsm.cpp} | 85 +-
src/tests/BlasBase.cpp | 94 +-
src/tests/CMakeLists.txt | 26 +-
src/tests/cmdline.c | 43 +-
src/tests/correctness/blas-lapack.c | 8 +-
src/tests/correctness/blas-lapack.h | 8 +-
src/tests/correctness/corr-asum.cpp | 17 +-
src/tests/correctness/corr-axpy.cpp | 19 +-
src/tests/correctness/corr-copy.cpp | 18 +-
src/tests/correctness/corr-dot.cpp | 17 +-
src/tests/correctness/corr-dotc.cpp | 17 +-
src/tests/correctness/corr-gbmv.cpp | 19 +-
src/tests/correctness/corr-gemm.cpp | 19 +-
src/tests/correctness/corr-gemm2.cpp | 16 +-
src/tests/correctness/corr-gemv.cpp | 18 +-
src/tests/correctness/corr-ger.cpp | 21 +-
src/tests/correctness/corr-gerc.cpp | 21 +-
src/tests/correctness/corr-hbmv.cpp | 19 +-
src/tests/correctness/corr-hemm.cpp | 18 +-
src/tests/correctness/corr-hemv.cpp | 19 +-
src/tests/correctness/corr-her.cpp | 18 +-
src/tests/correctness/corr-her2.cpp | 18 +-
src/tests/correctness/corr-her2k.cpp | 15 +-
src/tests/correctness/corr-herk.cpp | 15 +-
src/tests/correctness/corr-hpmv.cpp | 10 -
src/tests/correctness/corr-hpr.cpp | 10 -
src/tests/correctness/corr-hpr2.cpp | 9 -
src/tests/correctness/corr-iamax.cpp | 16 +-
src/tests/correctness/corr-nrm2.cpp | 16 +-
src/tests/correctness/corr-rot.cpp | 18 +-
src/tests/correctness/corr-rotg.cpp | 34 +-
src/tests/correctness/corr-rotm.cpp | 17 +-
src/tests/correctness/corr-rotmg.cpp | 16 +-
src/tests/correctness/corr-sbmv.cpp | 19 +-
src/tests/correctness/corr-scal.cpp | 16 +-
src/tests/correctness/corr-spmv.cpp | 19 +-
src/tests/correctness/corr-spr.cpp | 12 -
src/tests/correctness/corr-spr2.cpp | 9 -
src/tests/correctness/corr-swap.cpp | 19 +-
src/tests/correctness/corr-symm.cpp | 17 +-
src/tests/correctness/corr-symv.cpp | 19 +-
src/tests/correctness/corr-syr.cpp | 21 +-
src/tests/correctness/corr-syr2.cpp | 18 +-
src/tests/correctness/corr-syr2k.cpp | 17 +-
src/tests/correctness/corr-syrk.cpp | 17 +-
src/tests/correctness/corr-tbmv.cpp | 18 +-
src/tests/correctness/corr-tbsv.cpp | 18 +-
src/tests/correctness/corr-tpmv.cpp | 11 -
src/tests/correctness/corr-tpsv.cpp | 9 -
src/tests/correctness/corr-trmm.cpp | 18 +-
src/tests/correctness/corr-trmv.cpp | 20 +-
src/tests/correctness/corr-trsm.cpp | 22 +-
src/tests/correctness/corr-trsv.cpp | 17 +-
src/tests/correctness/test-correctness.cpp | 102 +--
src/tests/functional/func-error.cpp | 4 +-
src/tests/functional/test-functional.cpp | 100 ++-
src/tests/include/BlasBase.h | 11 +-
src/tests/include/asum.h | 4 -
src/tests/include/axpy.h | 3 -
src/tests/include/cmdline.h | 6 +-
src/tests/include/copy.h | 3 -
src/tests/include/dot.h | 4 -
src/tests/include/dotc.h | 4 -
src/tests/include/gbmv.h | 5 -
src/tests/include/gemm-2.h | 6 -
src/tests/include/gemm.h | 5 -
src/tests/include/gemv.h | 5 -
src/tests/include/ger.h | 8 -
src/tests/include/gerc.h | 6 -
src/tests/include/hbmv.h | 5 -
src/tests/include/hemm.h | 5 -
src/tests/include/hemv.h | 5 -
src/tests/include/her.h | 6 -
src/tests/include/her2.h | 5 -
src/tests/include/her2k.h | 5 -
src/tests/include/herk.h | 5 -
src/tests/include/iamax.h | 3 -
src/tests/include/matrix.h | 118 +--
src/tests/include/nrm2.h | 4 -
src/tests/include/rot.h | 3 -
src/tests/include/rotg.h | 3 -
src/tests/include/rotm.h | 3 -
src/tests/include/rotmg.h | 3 -
src/tests/include/sbmv.h | 5 -
src/tests/include/scal.h | 4 -
src/tests/include/spmv.h | 5 -
src/tests/include/swap.h | 3 -
src/tests/include/symm.h | 5 -
src/tests/include/symv.h | 5 -
src/tests/include/syr.h | 5 -
src/tests/include/syr2.h | 5 -
src/tests/include/syr2k.h | 5 -
src/tests/include/syrk.h | 5 -
src/tests/include/tbmv.h | 5 -
src/tests/include/tbsv.h | 5 -
src/tests/include/trmm.h | 5 -
src/tests/include/trmv.h | 4 -
src/tests/include/trsm.h | 5 -
src/tests/include/trsv.h | 4 -
245 files changed, 2804 insertions(+), 2515 deletions(-)
copy src/samples/{example_strsm.c => example_ctrsm.c} (82%)
copy src/samples/{example_strsm.c => example_strsm.cpp} (66%)
--
Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/debian-science/packages/clblas.git