[clblas] branch debian/sid updated (c531d41 -> 516a02e)
Ghislain Vaillant
ghisvail-guest at moszumanska.debian.org
Tue Aug 4 15:35:42 UTC 2015
This is an automated email from the git hooks/post-receive script.
ghisvail-guest pushed a change to branch debian/sid
in repository clblas.
from c531d41 d/changelog: release to unstable
adds 434b38e enable offline compilation of a subset of GEMM and TRSM on targeted device
adds 2dce4f5 minor bug fix
adds 595c63b fix bug for small matrix when beta is 0
adds 5c3f082 minor bug fix in client code
adds 8dc95f9 Merge pull request #81 from TimmyLiu/develop
adds d00b59a do not build bingen if offline compilation is disabled
adds 38b342a Merge pull request #82 from TimmyLiu/develop
adds 1795886 correctness fix
adds eff87f9 fix travis CI build
adds 0a6d431 Merge pull request #85 from TimmyLiu/develop
adds a55d3ae Merge branch 'develop' of https://github.com/clMathLibraries/clBLAS into develop
adds fda48a7 replacing barrier with memfence in the inner most loop requires an extra barrier at the beginning of the outer loop.
adds 39b324d improve big sgemm column NN perf. improve small sgemm NN perf.
adds f9e0160 Merge pull request #87 from TimmyLiu/develop
adds 413819f bump develop version to 2.5
adds fdcf987 Merge pull request #88 from TimmyLiu/develop
adds 8ef0a43 some static kernel code clean up
adds a280c96 improve sgemm column major TN small matrix perf. some type/bug fixes
adds 5137231 Merge pull request #90 from TimmyLiu/develop
adds 93b5b69 fix a very silly bug in compuing s/dtrsm flops.
adds 8b41d5e Merge pull request #91 from TimmyLiu/develop
adds c084b47 Ben : fixing bonaire path for sgemm using CL2.0 path
adds 2ad3664 fixing a typo
adds aa972ec chanching the heuristic to detect the small matrices
adds d4163f4 Merge pull request #93 from BenjaminCoquelle/develop
adds 7302f86 some typo fixes
adds 573b487 Merge pull request #95 from TimmyLiu/develop
adds 1972170 Fix install location of samples
adds 9edf929 Merge pull request #75 from marbre/samples
adds d8419d8 Install scripts/perf to share/clBLAS on non WIN32 systems
adds f8af95c Merge pull request #74 from marbre/develop
adds 2f845e2 fix cmake bug introduced by pull request #75
adds 17b22e8 Merge pull request #96 from TimmyLiu/develop
adds 46389ac added test for OSX detection to turn off CORR_TEST_WITH_ACML, refactored CMakeLists.txt in BUILD_TEST block
adds f5d5adc Merge pull request #99 from lzamparo/cmake_fix
adds 6d1e3c4 stop checking opencl major number in some routines
adds f4af838 better handle sgemm NT where M and N are mod32 and not mod64. M and N are within range from 1184 to 3872
adds 4447bfe Merge pull request #100 from TimmyLiu/develop
adds 701210c fix undefined reference to symbol 'pthread_key_delete@@GLIBC_2.2.5'
adds 1136350 Merge pull request #102 from lunochod/develop
adds 60092c2 delete appendix in license file
adds 2621814 Merge pull request #106 from TimmyLiu/develop
adds b83750a Install cmake configuration to lib/cmake/clBLAS
adds 77b3245 Merge pull request #105 from marbre/develop
adds 6623809 adding zgemm kernel for hawaii
adds 8580cdb fixed including gcn_zgemm.h
adds 6f476b8 Merge pull request #107 from guacamoleo/develop
adds bd13b7b enables apiCallCount for zgemm within client
adds 03ae187 fixed zgemm offset bug; removed profiling from client
adds f9a2250 Merge pull request #111 from guacamoleo/develop
adds f7c6536 add codepath for dtrsm when M and N are mod192
adds 828aff1 Merge pull request #112 from TimmyLiu/develop
adds 262a1e1 add x86_64/sdk suffix as search location for libOpenCL.so when AMDAPPSDKROOT is used
adds 2137cae Merge pull request #113 from lunochod/develop
adds 5b922a7 python scripts should call clBLAS-client instead of client
adds f3471bf Merge pull request #116 from TimmyLiu/develop
adds 6311c6b adding performance data
adds e058f67 fixed graph script
adds 5005205 Merge pull request #118 from guacamoleo/develop
adds 3f032e7 merge develop branch to master branch. Bump master branch version number to 2.6
adds 9731ea2 Merge pull request #119 from TimmyLiu/master
new 8be809d Merge tag 'upstream/v2.6' into debian/experimental
new 4a6859c d/changelog: bump dversion, switch to unreleased
new 9be369f d/p: refresh patches
new b42faf3 d/rules: update cmake build options
new a71e799 d/p: add patch fixing missing pthread linkage
new bcc1339 d/p: break doxygen patch down to more specific patches
new 516a02e d/changelog: release to unstable
The 7 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails. The revisions
listed as "adds" were already present in the repository and have only
been added to this reference.
Summary of changes:
.gitignore | 3 +
.travis.yml | 22 +-
LICENSE | 25 -
README.md | 65 +-
debian/changelog | 11 +
debian/patches/debian-enable-multiarch.patch | 6 +-
debian/patches/disable-multilib-cflags.patch | 6 +-
debian/patches/fix-docs-output.patch | 18 +
debian/patches/fix-doxygen-settings.patch | 34 -
debian/patches/fix-pthread-linkage.patch | 18 +
debian/patches/reproducible-build.patch | 17 +
debian/patches/series | 5 +-
debian/patches/use-system-mathjax.patch | 16 +
debian/rules | 3 +-
doc/README-BinaryCacheOnDisk.txt | 69 +
doc/README-FunctorConcepts.txt | 100 +
doc/README-HowToIntroduceFunctors.txt | 402 ++
doc/README-TransformASolverIntoAFunctor.txt | 382 ++
doc/performance/clBLAS_2.6.0/S9150/README.txt | 35 +
doc/performance/clBLAS_2.6.0/S9150/dgemm_32.csv | 181 +
doc/performance/clBLAS_2.6.0/S9150/dgemm_96.csv | 61 +
doc/performance/clBLAS_2.6.0/S9150/dtrsm_192.csv | 31 +
.../clBLAS_2.6.0/S9150/generate_graphs.sh | 92 +
doc/performance/clBLAS_2.6.0/S9150/peak_dp.csv | 181 +
doc/performance/clBLAS_2.6.0/S9150/peak_sp.csv | 181 +
doc/performance/clBLAS_2.6.0/S9150/sgemm_32.csv | 181 +
doc/performance/clBLAS_2.6.0/S9150/zgemm_32.csv | 181 +
doc/performance/clBLAS_2.6.0/S9150/zgemm_64.csv | 91 +
doc/performance/cuBLAS_7.0/Tesla_K40/README.txt | 35 +
doc/performance/cuBLAS_7.0/Tesla_K40/dgemm.csv | 181 +
doc/performance/cuBLAS_7.0/Tesla_K40/dtrsm.csv | 31 +
doc/performance/cuBLAS_7.0/Tesla_K40/peak_dp.csv | 181 +
doc/performance/cuBLAS_7.0/Tesla_K40/peak_sp.csv | 181 +
doc/performance/cuBLAS_7.0/Tesla_K40/sgemm.csv | 181 +
doc/performance/cuBLAS_7.0/Tesla_K40/zgemm.csv | 181 +
src/CMakeLists.txt | 83 +-
src/FindOpenCL.cmake | 3 +-
src/clBLAS.def | 28 +
src/clBLAS.h | 622 ++
src/client/clfunc_common.hpp | 1 +
src/client/clfunc_xgemm.hpp | 53 +-
src/client/clfunc_xtrsm.hpp | 14 +-
src/client/client.cpp | 21 +-
src/flags_public.txt | 4 +
src/include/binary_lookup.h | 273 +
src/include/devinfo.h | 2 +
src/include/md5sum.h | 50 +
src/include/rwlock.h | 117 +
src/library/CMakeLists.txt | 282 +-
src/library/bingen.cmake | 144 +
src/library/blas/fill.cc | 272 +
src/library/blas/functor/bonaire.cc | 90 +
src/library/blas/functor/functor.cc | 117 +
src/library/blas/functor/functor_fill.cc | 156 +
src/library/blas/functor/functor_selector.cc | 344 ++
src/library/blas/functor/functor_xgemm.cc | 323 +
src/library/blas/functor/functor_xscal.cc | 410 ++
src/library/blas/functor/functor_xscal_generic.cc | 439 ++
src/library/blas/functor/functor_xtrsm.cc | 336 ++
src/library/blas/functor/gcn_dgemm.cc | 1035 ++++
src/library/blas/functor/gcn_dgemmCommon.cc | 997 +++
src/library/blas/functor/gcn_dgemmSmallMatrices.cc | 654 ++
src/library/blas/functor/gcn_sgemm.cc | 556 ++
src/library/blas/functor/gcn_sgemmSmallMatrices.cc | 558 ++
src/library/blas/functor/gcn_zgemm.cc | 354 ++
src/library/blas/functor/gpu_dtrsm.cc | 823 +++
src/library/blas/functor/gpu_dtrsm192.cc | 596 ++
src/library/blas/functor/hawaii.cc | 223 +
.../blas/functor/hawaii_dgemmChannelConflict.cc | 159 +
.../blas/functor/hawaii_dgemmSplitKernel.cc | 670 ++
.../blas/functor/hawaii_sgemmBranchKernel.cc | 442 ++
src/library/blas/functor/hawaii_sgemmSplit64_32.cc | 423 ++
.../blas/functor/hawaii_sgemmSplitKernel.cc | 858 +++
src/library/blas/functor/include/BinaryBuild.h | 10 +
src/library/blas/functor/include/atomic_counter.h | 173 +
src/library/blas/functor/include/bonaire.h | 41 +
src/library/blas/functor/include/functor.h | 496 ++
src/library/blas/functor/include/functor_fill.h | 99 +
.../functor/include/functor_hawaii_dgemm_NT_MN48.h | 210 +
.../blas/functor/include/functor_selector.h | 149 +
src/library/blas/functor/include/functor_utils.h | 116 +
src/library/blas/functor/include/functor_xgemm.h | 213 +
src/library/blas/functor/include/functor_xscal.h | 207 +
.../blas/functor/include/functor_xscal_generic.h | 173 +
src/library/blas/functor/include/functor_xtrsm.h | 203 +
src/library/blas/functor/include/gcn_dgemm.h | 59 +
src/library/blas/functor/include/gcn_dgemmCommon.h | 22 +
.../blas/functor/include/gcn_dgemmSmallMatrices.h | 27 +
src/library/blas/functor/include/gcn_sgemm.h | 62 +
.../blas/functor/include/gcn_sgemmSmallMatrices.h | 27 +
src/library/blas/functor/include/gcn_zgemm.h | 62 +
src/library/blas/functor/include/gpu_dtrsm.h | 28 +
src/library/blas/functor/include/gpu_dtrsm192.h | 28 +
src/library/blas/functor/include/hawaii.h | 42 +
.../functor/include/hawaii_dgemmChannelConflict.h | 22 +
.../blas/functor/include/hawaii_dgemmSplitKernel.h | 46 +
.../functor/include/hawaii_sgemmBranchKernel.h | 50 +
.../blas/functor/include/hawaii_sgemmSplit64_32.h | 46 +
.../blas/functor/include/hawaii_sgemmSplitKernel.h | 46 +
src/library/blas/functor/include/tahiti.h | 41 +
src/library/blas/functor/tahiti.cc | 120 +
src/library/blas/generic/binary_lookup.cc | 685 +++
src/library/blas/generic/common.c | 25 +-
src/library/blas/generic/common2.cc | 98 +
src/library/blas/generic/functor_cache.cc | 80 +
src/library/blas/generic/solution_seq_make.c | 4 +-
src/library/blas/gens/blas_kgen.h | 3 -
src/library/blas/gens/blas_subgroup.c | 6 +-
src/library/blas/gens/clTemplates/dgemm_NT_MN48.cl | 347 ++
.../gens/clTemplates/dgemm_gcn_SmallMatrices.cl | 1159 ++++
src/library/blas/gens/clTemplates/dgemm_hawai.cl | 6371 ++++++++++++++++++++
.../clTemplates/dgemm_hawaiiChannelConfilct.cl | 152 +
.../gens/clTemplates/dgemm_hawaiiSplitKernel.cl | 5043 ++++++++++++++++
src/library/blas/gens/clTemplates/dtrsm_gpu.cl | 2004 ++++++
src/library/blas/gens/clTemplates/dtrsm_gpu192.cl | 1031 ++++
src/library/blas/gens/clTemplates/sgemm_gcn.cl | 2083 +++++++
.../gens/clTemplates/sgemm_gcn_SmallMatrices.cl | 1036 ++++
.../gens/clTemplates/sgemm_hawaiiSplit64_32.cl | 530 ++
.../gens/clTemplates/sgemm_hawaiiSplitKernel.cl | 6179 +++++++++++++++++++
src/library/blas/gens/clTemplates/zgemm_gcn.cl | 319 +
src/library/blas/include/clblas-internal.h | 28 +
src/library/blas/init.c | 12 +
src/library/blas/matrix.c | 979 +++
src/library/blas/xgemm.c | 783 ---
src/library/blas/xgemm.cc | 328 +
src/library/blas/xscal.cc | 340 ++
src/library/blas/xtrsm.c | 249 -
src/library/blas/xtrsm.cc | 333 +
src/library/common/devinfo.c | 6 +
src/library/common/md5sum.c | 378 ++
src/library/common/rwlock.c | 172 +
.../tools/{tplgen => bingen}/CMakeLists.txt | 17 +-
src/library/tools/bingen/bingen.cpp | 512 ++
src/library/tools/ktest/CMakeLists.txt | 34 +-
src/library/tools/tplgen/tplgen.cpp | 85 +-
src/library/tools/tune/CMakeLists.txt | 33 +-
src/library/tools/tune/tune.c | 5 +-
src/samples/CMakeLists.txt | 21 +-
src/samples/example_csscal.c | 3 +-
src/scripts/perf/CMakeLists.txt | 6 +-
src/scripts/perf/blasPerformanceTesting.py | 4 +-
src/tests/CMakeLists.txt | 28 +-
src/tests/correctness/test-correctness.cpp | 3 +-
src/tests/performance/test-performance.cpp | 5 +-
144 files changed, 48949 insertions(+), 1308 deletions(-)
create mode 100644 debian/patches/fix-docs-output.patch
delete mode 100644 debian/patches/fix-doxygen-settings.patch
create mode 100644 debian/patches/fix-pthread-linkage.patch
create mode 100644 debian/patches/reproducible-build.patch
create mode 100644 debian/patches/use-system-mathjax.patch
create mode 100644 doc/README-BinaryCacheOnDisk.txt
create mode 100644 doc/README-FunctorConcepts.txt
create mode 100644 doc/README-HowToIntroduceFunctors.txt
create mode 100644 doc/README-TransformASolverIntoAFunctor.txt
create mode 100644 doc/performance/clBLAS_2.6.0/S9150/README.txt
create mode 100644 doc/performance/clBLAS_2.6.0/S9150/dgemm_32.csv
create mode 100644 doc/performance/clBLAS_2.6.0/S9150/dgemm_96.csv
create mode 100644 doc/performance/clBLAS_2.6.0/S9150/dtrsm_192.csv
create mode 100755 doc/performance/clBLAS_2.6.0/S9150/generate_graphs.sh
create mode 100644 doc/performance/clBLAS_2.6.0/S9150/peak_dp.csv
create mode 100644 doc/performance/clBLAS_2.6.0/S9150/peak_sp.csv
create mode 100644 doc/performance/clBLAS_2.6.0/S9150/sgemm_32.csv
create mode 100644 doc/performance/clBLAS_2.6.0/S9150/zgemm_32.csv
create mode 100644 doc/performance/clBLAS_2.6.0/S9150/zgemm_64.csv
create mode 100644 doc/performance/cuBLAS_7.0/Tesla_K40/README.txt
create mode 100644 doc/performance/cuBLAS_7.0/Tesla_K40/dgemm.csv
create mode 100644 doc/performance/cuBLAS_7.0/Tesla_K40/dtrsm.csv
create mode 100644 doc/performance/cuBLAS_7.0/Tesla_K40/peak_dp.csv
create mode 100644 doc/performance/cuBLAS_7.0/Tesla_K40/peak_sp.csv
create mode 100644 doc/performance/cuBLAS_7.0/Tesla_K40/sgemm.csv
create mode 100644 doc/performance/cuBLAS_7.0/Tesla_K40/zgemm.csv
create mode 100644 src/flags_public.txt
create mode 100644 src/include/binary_lookup.h
create mode 100644 src/include/md5sum.h
create mode 100644 src/include/rwlock.h
create mode 100644 src/library/bingen.cmake
create mode 100644 src/library/blas/fill.cc
create mode 100644 src/library/blas/functor/bonaire.cc
create mode 100644 src/library/blas/functor/functor.cc
create mode 100644 src/library/blas/functor/functor_fill.cc
create mode 100644 src/library/blas/functor/functor_selector.cc
create mode 100644 src/library/blas/functor/functor_xgemm.cc
create mode 100644 src/library/blas/functor/functor_xscal.cc
create mode 100644 src/library/blas/functor/functor_xscal_generic.cc
create mode 100644 src/library/blas/functor/functor_xtrsm.cc
create mode 100644 src/library/blas/functor/gcn_dgemm.cc
create mode 100644 src/library/blas/functor/gcn_dgemmCommon.cc
create mode 100644 src/library/blas/functor/gcn_dgemmSmallMatrices.cc
create mode 100644 src/library/blas/functor/gcn_sgemm.cc
create mode 100644 src/library/blas/functor/gcn_sgemmSmallMatrices.cc
create mode 100644 src/library/blas/functor/gcn_zgemm.cc
create mode 100644 src/library/blas/functor/gpu_dtrsm.cc
create mode 100644 src/library/blas/functor/gpu_dtrsm192.cc
create mode 100644 src/library/blas/functor/hawaii.cc
create mode 100644 src/library/blas/functor/hawaii_dgemmChannelConflict.cc
create mode 100644 src/library/blas/functor/hawaii_dgemmSplitKernel.cc
create mode 100644 src/library/blas/functor/hawaii_sgemmBranchKernel.cc
create mode 100644 src/library/blas/functor/hawaii_sgemmSplit64_32.cc
create mode 100644 src/library/blas/functor/hawaii_sgemmSplitKernel.cc
create mode 100644 src/library/blas/functor/include/BinaryBuild.h
create mode 100644 src/library/blas/functor/include/atomic_counter.h
create mode 100644 src/library/blas/functor/include/bonaire.h
create mode 100644 src/library/blas/functor/include/functor.h
create mode 100644 src/library/blas/functor/include/functor_fill.h
create mode 100644 src/library/blas/functor/include/functor_hawaii_dgemm_NT_MN48.h
create mode 100644 src/library/blas/functor/include/functor_selector.h
create mode 100644 src/library/blas/functor/include/functor_utils.h
create mode 100644 src/library/blas/functor/include/functor_xgemm.h
create mode 100644 src/library/blas/functor/include/functor_xscal.h
create mode 100644 src/library/blas/functor/include/functor_xscal_generic.h
create mode 100644 src/library/blas/functor/include/functor_xtrsm.h
create mode 100644 src/library/blas/functor/include/gcn_dgemm.h
create mode 100644 src/library/blas/functor/include/gcn_dgemmCommon.h
create mode 100644 src/library/blas/functor/include/gcn_dgemmSmallMatrices.h
create mode 100644 src/library/blas/functor/include/gcn_sgemm.h
create mode 100644 src/library/blas/functor/include/gcn_sgemmSmallMatrices.h
create mode 100644 src/library/blas/functor/include/gcn_zgemm.h
create mode 100644 src/library/blas/functor/include/gpu_dtrsm.h
create mode 100644 src/library/blas/functor/include/gpu_dtrsm192.h
create mode 100644 src/library/blas/functor/include/hawaii.h
create mode 100644 src/library/blas/functor/include/hawaii_dgemmChannelConflict.h
create mode 100644 src/library/blas/functor/include/hawaii_dgemmSplitKernel.h
create mode 100644 src/library/blas/functor/include/hawaii_sgemmBranchKernel.h
create mode 100644 src/library/blas/functor/include/hawaii_sgemmSplit64_32.h
create mode 100644 src/library/blas/functor/include/hawaii_sgemmSplitKernel.h
create mode 100644 src/library/blas/functor/include/tahiti.h
create mode 100644 src/library/blas/functor/tahiti.cc
create mode 100644 src/library/blas/generic/binary_lookup.cc
create mode 100644 src/library/blas/generic/common2.cc
create mode 100644 src/library/blas/generic/functor_cache.cc
create mode 100644 src/library/blas/gens/clTemplates/dgemm_NT_MN48.cl
create mode 100644 src/library/blas/gens/clTemplates/dgemm_gcn_SmallMatrices.cl
create mode 100644 src/library/blas/gens/clTemplates/dgemm_hawai.cl
create mode 100644 src/library/blas/gens/clTemplates/dgemm_hawaiiChannelConfilct.cl
create mode 100644 src/library/blas/gens/clTemplates/dgemm_hawaiiSplitKernel.cl
create mode 100644 src/library/blas/gens/clTemplates/dtrsm_gpu.cl
create mode 100644 src/library/blas/gens/clTemplates/dtrsm_gpu192.cl
create mode 100644 src/library/blas/gens/clTemplates/sgemm_gcn.cl
create mode 100644 src/library/blas/gens/clTemplates/sgemm_gcn_SmallMatrices.cl
create mode 100644 src/library/blas/gens/clTemplates/sgemm_hawaiiSplit64_32.cl
create mode 100644 src/library/blas/gens/clTemplates/sgemm_hawaiiSplitKernel.cl
create mode 100644 src/library/blas/gens/clTemplates/zgemm_gcn.cl
create mode 100644 src/library/blas/matrix.c
delete mode 100644 src/library/blas/xgemm.c
create mode 100644 src/library/blas/xgemm.cc
create mode 100644 src/library/blas/xscal.cc
delete mode 100644 src/library/blas/xtrsm.c
create mode 100644 src/library/blas/xtrsm.cc
create mode 100644 src/library/common/md5sum.c
create mode 100644 src/library/common/rwlock.c
copy src/library/tools/{tplgen => bingen}/CMakeLists.txt (61%)
create mode 100644 src/library/tools/bingen/bingen.cpp
--
Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/debian-science/packages/clblas.git