[clfft] 05/128: Pre-callback - Readme for client and minor update in client code
Ghislain Vaillant
ghisvail-guest at moszumanska.debian.org
Thu Oct 22 14:54:32 UTC 2015
This is an automated email from the git hooks/post-receive script.
ghisvail-guest pushed a commit to branch master
in repository clfft.
commit a45ed369c66482241b10da165cc57d777ac0ddaf
Author: Pradeep <pradeep.rao at amd.com>
Date: Fri Jul 24 13:46:22 2015 +0530
Pre-callback - Readme for client and minor update in client code
---
src/client-callback/README.md | 117 ++++++++++++++++++++++++++++++++
src/client-callback/callback-client.cpp | 4 +-
2 files changed, 119 insertions(+), 2 deletions(-)
diff --git a/src/client-callback/README.md b/src/client-callback/README.md
new file mode 100644
index 0000000..21b3623
--- /dev/null
+++ b/src/client-callback/README.md
@@ -0,0 +1,117 @@
+clFFT - Callback Client
+=======================
+
+
+The clFFT callback client is a sample application that demonstrates the
+callback feature of clFFT.
+
+The callback feature lets you run custom processing when input data is
+read or when output data is written. There are two types of callback:
+pre-callback and post-callback. A pre-callback invokes a user function
+to preprocess the input data before the FFT is executed. A post-callback
+invokes a user function to post-process the output data after the FFT is
+executed. The intent is to avoid additional kernels and kernel launches
+for the pre/post processing. Instead, the processing logic is written as
+an inline OpenCL function (one each for pre and post) and passed as a
+string to the library, which incorporates it into the generated FFT
+kernel.
+
+The block below shows the help message given by the callback client
+listing all the command line options.
+
+```c
+C:\clFFT\src\build\staging\Debug>clFFT-callback.exe -h
+clFFT client command line options:
+ -h [ --help ] produces this help message
+ -g [ --gpu ] Force selection of OpenCL GPU devices only
+ -c [ --cpu ] Force selection of OpenCL CPU devices only
+ -a [ --all ] Force selection of all OpenCL devices (default)
+ -o [ --outPlace ] Out of place FFT transform (default: in place)
+ --double Double precision transform (default: single)
+ --inv Backward transform (default: forward)
+ -d [ --dumpKernels ] FFT engine will dump generated OpenCL FFT kernels
+ to disk (default: dump off)
+ --noprecall Disable Precallback (default: precallback on)
+ -x [ --lenX ] arg (=1024) Specify the length of the 1st dimension of a test
+ array
+ -y [ --lenY ] arg (=1) Specify the length of the 2nd dimension of a test
+ array
+ -z [ --lenZ ] arg (=1) Specify the length of the 3rd dimension of a test
+ array
+ --isX arg (=1) Specify the input stride of the 1st dimension of
+ a test array
+ --isY arg (=0) Specify the input stride of the 2nd dimension of
+ a test array
+ --isZ arg (=0) Specify the input stride of the 3rd dimension of
+ a test array
+ --iD arg (=0) input distance between subsequent sets of data
+ when batch size > 1
+ --osX arg (=1) Specify the output stride of the 1st dimension of
+ a test array
+ --osY arg (=0) Specify the output stride of the 2nd dimension of
+ a test array
+ --osZ arg (=0) Specify the output stride of the 3rd dimension of
+ a test array
+ --oD arg (=0) output distance between subsequent sets of data
+ when batch size > 1
+ -b [ --batchSize ] arg (=1) If this value is greater than one, arrays will be
+ used
+ -p [ --profile ] arg (=1) Time and report the kernel speed of the FFT
+ (default: profiling off)
+ --inLayout arg (=1) Layout of input data:
+ 1) interleaved
+ 2) planar
+ 3) hermitian interleaved
+ 4) hermitian planar
+ 5) real
+ --outLayout arg (=1) Layout of input data:
+ 1) interleaved
+ 2) planar
+ 3) hermitian interleaved
+ 4) hermitian planar
+ 5) real
+
+```
+The `--noprecall` option disables the pre-callback (default: pre-callback on).
+
+## What's New
+
+The callback client in the develop branch demonstrates the use of pre-callback
+for single-precision complex-to-complex 1D transforms of lengths up to 4096. Output data
+is verified against the FFTW library.
+
+## Examples
+
+Some examples are shown below.
+
+1D Complex-Complex Interleaved transform with pre-callback for length 1024
+```c
+C:\clFFT\src\build\staging\Debug>clFFT-callback.exe -x 1024 --inLayout 1 --outLayout 1
+
+
+ Internal Client Test *****PASS*****
+```
+
+1D Complex-Complex Planar transform with pre-callback for length 1024
+```c
+C:\clFFT\src\build\staging\Debug>clFFT-callback.exe -x 1024 --inLayout 2 --outLayout 2
+
+
+ Internal Client Test *****PASS*****
+```
+
+1D Complex-Complex Interleaved transform with pre-callback for length 1024 and batch size of 2
+```c
+C:\clFFT\src\build\staging\Debug>clFFT-callback.exe -x 1024 --inLayout 1 --outLayout 1 -b 2
+
+
+ Internal Client Test *****PASS*****
+```
+
+1D Complex-Complex Interleaved transform without pre-callback for length 1024
+```c
+C:\clFFT\src\build\staging\Debug>clFFT-callback.exe -x 1024 --inLayout 1 --outLayout 1 --noprecall
+
+
+ Internal Client Test *****PASS*****
+```
\ No newline at end of file
diff --git a/src/client-callback/callback-client.cpp b/src/client-callback/callback-client.cpp
index 5b939a0..a1fd832 100644
--- a/src/client-callback/callback-client.cpp
+++ b/src/client-callback/callback-client.cpp
@@ -16,14 +16,14 @@ namespace po = boost::program_options;
#define SCALAR 100
#define PRECALLBACKTYPE 1
-#define MULVAL float2 mulval(__global void* in, int offset, __global void* userdata)\n \
+#define MULVAL float2 mulval(__global void* in, uint offset, __global void* userdata)\n \
{ \n \
int scalar = *((__global int*)userdata + offset); \n \
float2 ret = *((__global float2*)in + offset) * scalar; \n \
return ret; \n \
}
-#define MULVAL_PLANAR float2 mulval(__global void* inRe, __global void* inIm, int offset, __global void* userdata)\n \
+#define MULVAL_PLANAR float2 mulval(__global void* inRe, __global void* inIm, uint offset, __global void* userdata)\n \
{ \n \
__global USER_DATA *data = ((__global USER_DATA *)userdata + offset); \n \
int scalar = (int)data->scalar1 + (int)data->scalar2 + (int)data->scalar3; \n \
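
For readers who want to check what the interleaved `MULVAL` pre-callback computes, the following is a minimal host-side sketch in plain C++. It is an illustration only, not part of this commit or of the clFFT API: the names `mulval_host` and `apply_precallback` are hypothetical, and `std::vector` stands in for the `__global` buffers that the real callback reads inside the generated FFT kernel. It assumes one integer of user data per input element, as the kernel string's `userdata + offset` indexing suggests.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

// Host-side stand-in for the OpenCL MULVAL callback (interleaved layout).
// Hypothetical helper, not part of clFFT: the real callback runs on the
// device against __global memory.
std::complex<float> mulval_host(const std::vector<std::complex<float>>& in,
                                std::size_t offset,
                                const std::vector<int>& userdata)
{
    // Same arithmetic as the kernel string: scale the complex element at
    // `offset` by the integer stored at the same offset in userdata.
    const int scalar = userdata[offset];
    return in[offset] * static_cast<float>(scalar);
}

// Applies the emulated pre-callback to every element, producing the values
// the FFT would consume as its input. Also hypothetical.
std::vector<std::complex<float>> apply_precallback(
    const std::vector<std::complex<float>>& in,
    const std::vector<int>& userdata)
{
    assert(in.size() == userdata.size());  // one scalar per input element
    std::vector<std::complex<float>> out(in.size());
    for (std::size_t i = 0; i < in.size(); ++i)
        out[i] = mulval_host(in, i, userdata);
    return out;
}
```

Running an ordinary FFT over the result of `apply_precallback` should, up to floating-point rounding, match what the library produces with the pre-callback enabled, which is one way to reason about the client's verification step.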
--
Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/debian-science/packages/clfft.git