[arrayfire] 213/284: Merge remote-tracking branch 'upstream/devel' into fallback-opts
Ghislain Vaillant
ghisvail-guest at moszumanska.debian.org
Sun Feb 7 18:59:34 UTC 2016
This is an automated email from the git hooks/post-receive script.
ghisvail-guest pushed a commit to branch debian/experimental
in repository arrayfire.
commit b39b60ddac37af8d220312d748b678c1bb32119e
Merge: 56f9140 3009e8f
Author: Shehzan Mohammed <shehzan at arrayfire.com>
Date: Tue Jan 12 11:28:54 2016 -0500
Merge remote-tracking branch 'upstream/devel' into fallback-opts
Conflicts:
docs/pages/configuring_arrayfire_environment.md
src/backend/opencl/memory.cpp
src/backend/opencl/platform.cpp
CMakeModules/build_boost_compute.cmake | 9 +-
CMakeModules/build_clFFT.cmake | 2 +-
docs/pages/configuring_arrayfire_environment.md | 45 +-
include/af/array.h | 4 +-
include/af/device.h | 10 +
src/api/c/assign.cpp | 4 +-
src/api/c/device.cpp | 23 +-
src/api/c/flip.cpp | 2 +-
src/api/c/image.cpp | 2 +-
src/api/c/imageio.cpp | 31 +-
src/api/c/imageio2.cpp | 10 +-
src/api/c/index.cpp | 6 +-
src/api/c/stream.cpp | 22 +-
src/backend/MemoryManager.cpp | 257 ++++++++++++
src/backend/MemoryManager.hpp | 99 +++++
src/backend/cpu/Array.hpp | 2 +-
src/backend/cpu/memory.cpp | 270 ++++--------
src/backend/cpu/memory.hpp | 7 +-
src/backend/cpu/platform.cpp | 6 +-
src/backend/cuda/Array.hpp | 2 +-
src/backend/cuda/CMakeLists.txt | 1 -
src/backend/cuda/memory.cpp | 528 +++++++-----------------
src/backend/cuda/memory.hpp | 7 +-
src/backend/cuda/platform.cpp | 28 +-
src/{api/c => backend}/dispatch.cpp | 0
src/{api/c => backend}/dispatch.hpp | 0
src/backend/opencl/Array.hpp | 2 +-
src/backend/opencl/memory.cpp | 516 ++++++++---------------
src/backend/opencl/memory.hpp | 11 +-
src/backend/opencl/platform.cpp | 274 +++++++++---
30 files changed, 1113 insertions(+), 1067 deletions(-)
diff --cc docs/pages/configuring_arrayfire_environment.md
index 3de8fbe,37327ac..a9ec486
--- a/docs/pages/configuring_arrayfire_environment.md
+++ b/docs/pages/configuring_arrayfire_environment.md
@@@ -18,16 -18,6 +18,16 @@@ This is the path with ArrayFire gets in
present in this directory. You can use this variable to add include paths and
libraries to your projects.
+AF_PRINT_ERRORS {#af_print_errors}
+-------------------------------------------------------------------------------
+
+When AF_PRINT_ERRORS is set to 1, the exceptions thrown are more verbose and
+detailed. This helps in locating the exact failure.
+
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- AF_PRINT_ERRORS=1 ./myprogram_opencl
++AF_PRINT_ERRORS=1 ./myprogram
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
AF_CUDA_DEFAULT_DEVICE {#af_cuda_default_device}
-------------------------------------------------------------------------------
@@@ -54,24 -44,37 +54,55 @@@ AF_OPENCL_DEFAULT_DEVICE=1 ./myprogram_
Note: af::setDevice call in the source code will take precedence over this
variable.
+ AF_OPENCL_DEFAULT_DEVICE_TYPE {#af_opencl_default_device_type}
+ -------------------------------------------------------------------------------
+
+ Use this variable to set the default OpenCL device type. Valid values for this
+ variable are: CPU, GPU, ACC (Accelerators).
+
+ When set, the first device of the specified type is chosen as default device.
+
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ AF_OPENCL_DEFAULT_DEVICE_TYPE=CPU ./myprogram_opencl
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
-Note: `AF_OPENCL_DEFAULT_DEVICE` and af::setDevice takes precedence over this variable.
++Note: `AF_OPENCL_DEFAULT_DEVICE` and af::setDevice take precedence over this variable.
+
+ AF_OPENCL_DEVICE_TYPE {#af_opencl_device_type}
+ -------------------------------------------------------------------------------
+
+ Use this variable to only choose OpenCL devices of specified type. Valid values for this
+ variable are:
+
+ - ALL: All OpenCL devices. (Default behavior).
+ - CPU: CPU devices only.
+ - GPU: GPU devices only.
+ - ACC: Accelerator devices only.
+
+ When set, the remaining OpenCL device types are ignored by the OpenCL backend.
+
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ AF_OPENCL_DEVICE_TYPE=CPU ./myprogram_opencl
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+AF_OPENCL_CPU_OFFLOAD {#af_opencl_cpu_offload}
+-------------------------------------------------------------------------------
+
+When this variable is set to 1, and the selected OpenCL device has unified
+memory with the host (i.e. `CL_DEVICE_HOST_UNIFIED_MEMORY` is true for the device),
+then certain functions are offloaded to run on the CPU using mapped buffers.
+
+This takes advantage of fast libraries such as MKL while spending no time
+copying memory from device to host. The device memory is mapped to a host
+pointer which can be used in the offloaded functions.
+
+AF_OPENCL_SHOW_BUILD_INFO {#af_opencl_show_build_info}
+-------------------------------------------------------------------------------
+
+This variable is useful when debugging OpenCL kernel compilation failures. When
+this variable is set to 1, and an error occurs during an OpenCL kernel
+compilation, then the log and kernel are printed to screen.
+
AF_DISABLE_GRAPHICS {#af_disable_graphics}
-------------------------------------------------------------------------------
@@@ -84,18 -88,24 +115,30 @@@ without displays. When graphics calls a
print warning about window creation failing. To suppress those calls, set this
variable.
-AF_PRINT_ERRORS {#af_print_errors}
+AF_SYNCHRONOUS_CALLS {#af_synchronous_calls}
-------------------------------------------------------------------------------
-When AF_PRINT_ERRORS is set to 1, the exceptions thrown are more verbose and
-detailed. This helps in locating the exact failure.
+When this environment variable is set to 1, ArrayFire will execute all
+functions synchronously.
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-AF_PRINT_ERRORS=1 ./myprogram_opencl
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+AF_SHOW_LOAD_PATH {#af_show_load_path}
+-------------------------------------------------------------------------------
+
+When using the Unified backend, if this variable is set to 1, it will show the
+path where the ArrayFire backend libraries are loaded from.
+
+If the libraries are loaded from system paths, such as PATH or LD_LIBRARY_PATH
+etc, then it will print "system path". If the libraries are loaded from other
+paths, then those paths are shown in full.
+
-AF_MEM_DEBUG (#af_mem_debug)
++AF_MEM_DEBUG {#af_mem_debug}
+ -------------------------------------------------------------------------------
+
+ When AF_MEM_DEBUG is set to 1 (or anything not equal to 0), the caching mechanism in the memory manager is disabled.
+ The device buffers are allocated using native functions as needed and freed when going out of scope.
+
+ When the environment variable is not set, it is treated to be non zero.
+
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ AF_MEM_DEBUG=1 ./myprogram
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
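For illustration, the values documented above for AF_OPENCL_DEVICE_TYPE could be mapped to a device filter roughly as follows. This is a minimal sketch and not the code from platform.cpp: the `DeviceTypeFilter` enum and both function names are hypothetical, and falling back to ALL for unset or unrecognized values is an assumption.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical filter type; ArrayFire's own representation may differ.
enum class DeviceTypeFilter { All, Cpu, Gpu, Acc };

// Map a documented value (ALL, CPU, GPU, ACC) to a filter.
// Assumption: a null or unrecognized value means "no filtering".
DeviceTypeFilter parseDeviceType(const char *value)
{
    if (value == nullptr) return DeviceTypeFilter::All;
    const std::string v(value);
    if (v == "CPU") return DeviceTypeFilter::Cpu;
    if (v == "GPU") return DeviceTypeFilter::Gpu;
    if (v == "ACC") return DeviceTypeFilter::Acc;
    return DeviceTypeFilter::All;
}

// Read the variable from the process environment.
DeviceTypeFilter deviceTypeFromEnv()
{
    return parseDeviceType(std::getenv("AF_OPENCL_DEVICE_TYPE"));
}
```

A quick check such as `assert(parseDeviceType("GPU") == DeviceTypeFilter::Gpu);` exercises the mapping without touching the environment.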
diff --cc src/backend/opencl/memory.cpp
index cf3f4cc,8a48a48..2427581
--- a/src/backend/opencl/memory.cpp
+++ b/src/backend/opencl/memory.cpp
@@@ -14,396 -14,234 +14,234 @@@
#include <iomanip>
#include <string>
#include <types.hpp>
-#include "err_opencl.hpp"
+#include <err_opencl.hpp>
- namespace opencl
- {
- static size_t memory_resolution = 1024; //1KB
-
- void setMemStepSize(size_t step_bytes)
- {
- memory_resolution = step_bytes;
- }
+ #include <MemoryManager.hpp>
- size_t getMemStepSize(void)
- {
- return memory_resolution;
- }
-
- // Manager Class
- // Dummy used to call garbage collection at the end of the program
- class Manager
- {
- public:
- static bool initialized;
- Manager()
- {
- initialized = true;
- }
-
- ~Manager()
- {
- for(int i = 0; i < (int)getDeviceCount(); i++) {
- setDevice(i);
- garbageCollect();
- pinnedGarbageCollect();
- }
- }
- };
-
- bool Manager::initialized = false;
-
- static void managerInit()
- {
- if(Manager::initialized == false)
- static Manager pm = Manager();
- }
-
- typedef struct
- {
- bool mngr_lock;
- bool user_lock;
- size_t bytes;
- } mem_info;
+ #ifndef AF_MEM_DEBUG
+ #define AF_MEM_DEBUG 0
+ #endif
- static size_t used_bytes[DeviceManager::MAX_DEVICES] = {0};
- static size_t used_buffers[DeviceManager::MAX_DEVICES] = {0};
- static size_t total_bytes[DeviceManager::MAX_DEVICES] = {0};
+ #ifndef AF_OPENCL_MEM_DEBUG
+ #define AF_OPENCL_MEM_DEBUG 0
+ #endif
- typedef std::map<cl::Buffer *, mem_info> mem_t;
- typedef mem_t::iterator mem_iter;
- mem_t memory_maps[DeviceManager::MAX_DEVICES];
-
- static void destroy(cl::Buffer *ptr)
- {
- delete ptr;
- }
+ namespace opencl
+ {
- void garbageCollect()
+ class MemoryManager : public common::MemoryManager
+ {
+ int getActiveDeviceId();
+ public:
+ MemoryManager();
+ void *nativeAlloc(const size_t bytes);
+ void nativeFree(void *ptr);
+ ~MemoryManager()
{
- int n = getActiveDeviceId();
- for(mem_iter iter = memory_maps[n].begin();
- iter != memory_maps[n].end(); ++iter) {
-
- if (!(iter->second).mngr_lock) {
-
- if (!(iter->second).user_lock) {
- destroy(iter->first);
- total_bytes[n] -= iter->second.bytes;
- }
- }
- }
-
- mem_iter memory_curr = memory_maps[n].begin();
- mem_iter memory_end = memory_maps[n].end();
-
- while(memory_curr != memory_end) {
- if (memory_curr->second.mngr_lock || memory_curr->second.user_lock) {
- ++memory_curr;
- } else {
- memory_maps[n].erase(memory_curr++);
- }
+ common::lock_guard_t lock(this->memory_mutex);
+ for (int n = 0; n < getDeviceCount(); n++) {
+ opencl::setDevice(n);
+ this->garbageCollect();
}
}
+ };
- void printMemInfo(const char *msg, const int device)
- {
- std::cout << msg << std::endl;
- std::cout << "Memory Map for Device: " << device << std::endl;
-
- static const std::string head("| POINTER | SIZE | AF LOCK | USER LOCK |");
- static const std::string line(head.size(), '-');
- std::cout << line << std::endl << head << std::endl << line << std::endl;
-
- for(mem_iter iter = memory_maps[device].begin();
- iter != memory_maps[device].end(); ++iter) {
-
- std::string status_mngr("Unknown");
- std::string status_user("Unknown");
-
- if(iter->second.mngr_lock) status_mngr = "Yes";
- else status_mngr = " No";
+ class MemoryManagerPinned : public common::MemoryManager
+ {
+ std::vector<
+ std::map<void *, cl::Buffer>
+ > pinned_maps;
+ int getActiveDeviceId();
- if(iter->second.user_lock) status_user = "Yes";
- else status_user = " No";
+ public:
- std::string unit = "KB";
- double size = (double)(iter->second.bytes) / 1024;
- if(size >= 1024) {
- size = size / 1024;
- unit = "MB";
- }
+ MemoryManagerPinned();
- std::cout << "| " << std::right << std::setw(14) << iter->first << " "
- << " | " << std::setw(7) << std::setprecision(4) << size << " " << unit
- << " | " << std::setw(9) << status_mngr
- << " | " << std::setw(9) << status_user
- << " |" << std::endl;
- }
-
- std::cout << line << std::endl;
- }
+ void *nativeAlloc(const size_t bytes);
+ void nativeFree(void *ptr);
- cl::Buffer *bufferAlloc(const size_t &bytes)
+ ~MemoryManagerPinned()
{
- int n = getActiveDeviceId();
- cl::Buffer *ptr = NULL;
- size_t alloc_bytes = divup(bytes, memory_resolution) * memory_resolution;
-
- if (bytes > 0) {
-
- // FIXME: Add better checks for garbage collection
- // Perhaps look at total memory available as a metric
- if (memory_maps[n].size() >= MAX_BUFFERS || used_bytes[n] >= MAX_BYTES) {
- garbageCollect();
+ common::lock_guard_t lock(this->memory_mutex);
+ for (int n = 0; n < getDeviceCount(); n++) {
+ opencl::setDevice(n);
+ this->garbageCollect();
+ auto pinned_curr_iter = pinned_maps[n].begin();
+ auto pinned_end_iter = pinned_maps[n].end();
+ while (pinned_curr_iter != pinned_end_iter) {
+ pinned_maps[n].erase(pinned_curr_iter++);
}
-
- for(mem_iter iter = memory_maps[n].begin();
- iter != memory_maps[n].end(); ++iter) {
-
- mem_info info = iter->second;
-
- if (!info.mngr_lock &&
- !info.user_lock &&
- info.bytes == alloc_bytes) {
-
- iter->second.mngr_lock = true;
- used_bytes[n] += alloc_bytes;
- used_buffers[n]++;
- return iter->first;
- }
- }
-
- try {
- ptr = new cl::Buffer(getContext(), CL_MEM_READ_WRITE, alloc_bytes);
- } catch(...) {
- garbageCollect();
- ptr = new cl::Buffer(getContext(), CL_MEM_READ_WRITE, alloc_bytes);
- }
-
- mem_info info = {true, false, alloc_bytes};
- memory_maps[n][ptr] = info;
- used_bytes[n] += alloc_bytes;
- used_buffers[n]++;
- total_bytes[n] += alloc_bytes;
}
- return ptr;
}
+ };
- void bufferFree(cl::Buffer *ptr)
- {
- bufferFreeLocked(ptr, false);
- }
-
- void bufferFreeLocked(cl::Buffer *ptr, bool freeLocked)
- {
- int n = getActiveDeviceId();
- mem_iter iter = memory_maps[n].find(ptr);
-
- if (iter != memory_maps[n].end()) {
-
- iter->second.mngr_lock = false;
- if ((iter->second).user_lock && !freeLocked) return;
+ int MemoryManager::getActiveDeviceId()
+ {
+ return opencl::getActiveDeviceId();
+ }
- iter->second.user_lock = false;
+ MemoryManager::MemoryManager() :
+ common::MemoryManager(getDeviceCount(), MAX_BUFFERS, MAX_BYTES, AF_MEM_DEBUG || AF_OPENCL_MEM_DEBUG)
+ {}
- used_bytes[n] -= iter->second.bytes;
- used_buffers[n]--;
- } else {
- destroy(ptr); // Free it because we are not sure what the size is
- }
+ void *MemoryManager::nativeAlloc(const size_t bytes)
+ {
+ try {
+ return (void *)(new cl::Buffer(getContext(), CL_MEM_READ_WRITE, bytes));
+ } catch(cl::Error err) {
+ CL_TO_AF_ERROR(err);
}
+ }
- void bufferPop(cl::Buffer *ptr)
- {
- int n = getActiveDeviceId();
- mem_iter iter = memory_maps[n].find(ptr);
-
- if (iter != memory_maps[n].end()) {
- iter->second.user_lock = true;
- } else {
-
- mem_info info = { true,
- true,
- 100 }; //This number is not relevant
-
- memory_maps[n][ptr] = info;
- }
+ void MemoryManager::nativeFree(void *ptr)
+ {
+ try {
+ delete (cl::Buffer *)ptr;
+ } catch(cl::Error err) {
+ CL_TO_AF_ERROR(err);
}
+ }
- void bufferPush(cl::Buffer *ptr)
- {
- int n = getActiveDeviceId();
- mem_iter iter = memory_maps[n].find(ptr);
-
- if (iter != memory_maps[n].end()) {
- iter->second.user_lock = false;
- }
- }
+ static MemoryManager &getMemoryManager()
+ {
+ static MemoryManager instance;
+ return instance;
+ }
- void deviceMemoryInfo(size_t *alloc_bytes, size_t *alloc_buffers,
- size_t *lock_bytes, size_t *lock_buffers)
- {
- int n = getActiveDeviceId();
- if (alloc_bytes ) *alloc_bytes = total_bytes[n];
- if (alloc_buffers ) *alloc_buffers = memory_maps[n].size();
- if (lock_bytes ) *lock_bytes = used_bytes[n];
- if (lock_buffers ) *lock_buffers = used_buffers[n];
- }
+ int MemoryManagerPinned::getActiveDeviceId()
+ {
+ return opencl::getActiveDeviceId();
+ }
- template<typename T>
- T *memAlloc(const size_t &elements)
- {
- managerInit();
- return (T *)bufferAlloc(elements * sizeof(T));
- }
+ MemoryManagerPinned::MemoryManagerPinned() :
+ common::MemoryManager(getDeviceCount(), MAX_BUFFERS, MAX_BYTES, AF_MEM_DEBUG || AF_OPENCL_MEM_DEBUG),
+ pinned_maps(getDeviceCount())
+ {}
- template<typename T>
- void memFree(T *ptr)
- {
- return bufferFreeLocked((cl::Buffer *)ptr, false);
+ void *MemoryManagerPinned::nativeAlloc(const size_t bytes)
+ {
+ void *ptr = NULL;
+ try {
+ cl::Buffer buf= cl::Buffer(getContext(), CL_MEM_ALLOC_HOST_PTR, bytes);
+ ptr = getQueue().enqueueMapBuffer(buf, true, CL_MAP_READ | CL_MAP_WRITE, 0, bytes);
+ pinned_maps[opencl::getActiveDeviceId()][ptr] = buf;
+ } catch(cl::Error err) {
+ CL_TO_AF_ERROR(err);
}
+ return ptr;
+ }
- template<typename T>
- void memFreeLocked(T *ptr, bool freeLocked)
- {
- return bufferFreeLocked((cl::Buffer *)ptr, freeLocked);
- }
+ void MemoryManagerPinned::nativeFree(void *ptr)
+ {
+ try {
+ int n = opencl::getActiveDeviceId();
+ auto iter = pinned_maps[n].find(ptr);
- template<typename T>
- void memPop(const T *ptr)
- {
- return bufferPop((cl::Buffer *)ptr);
- }
+ if (iter != pinned_maps[n].end()) {
+ getQueue().enqueueUnmapMemObject(pinned_maps[n][ptr], ptr);
+ pinned_maps[n].erase(iter);
+ }
- template<typename T>
- void memPush(const T *ptr)
- {
- return bufferPush((cl::Buffer *)ptr);
+ } catch(cl::Error err) {
+ CL_TO_AF_ERROR(err);
}
+ }
- // pinned memory manager
- typedef struct {
- cl::Buffer *buf;
- mem_info info;
- } pinned_info;
+ static MemoryManagerPinned &getMemoryManagerPinned()
+ {
+ static MemoryManagerPinned instance;
+ return instance;
+ }
- typedef std::map<void*, pinned_info> pinned_t;
- typedef pinned_t::iterator pinned_iter;
- pinned_t pinned_maps[DeviceManager::MAX_DEVICES];
- static size_t pinned_used_bytes = 0;
+ void setMemStepSize(size_t step_bytes)
+ {
+ getMemoryManager().setMemStepSize(step_bytes);
+ }
- static void pinnedDestroy(cl::Buffer *buf, void *ptr)
- {
- getQueue().enqueueUnmapMemObject(*buf, (void *)ptr);
- destroy(buf);
- }
+ size_t getMemStepSize(void)
+ {
+ return getMemoryManager().getMemStepSize();
+ }
- void pinnedGarbageCollect()
- {
- int n = getActiveDeviceId();
- for(auto &iter : pinned_maps[n]) {
- if (!(iter.second).info.mngr_lock) {
- pinnedDestroy(iter.second.buf, iter.first);
- }
- }
- pinned_iter memory_curr = pinned_maps[n].begin();
- pinned_iter memory_end = pinned_maps[n].end();
+ void garbageCollect()
+ {
+ getMemoryManager().garbageCollect();
+ }
- while(memory_curr != memory_end) {
- if (memory_curr->second.info.mngr_lock) {
- ++memory_curr;
- } else {
- memory_curr = pinned_maps[n].erase(memory_curr);
- }
- }
+ void printMemInfo(const char *msg, const int device)
+ {
+ getMemoryManager().printInfo(msg, device);
+ }
- }
+ template<typename T>
+ T* memAlloc(const size_t &elements)
+ {
+ return (T *)getMemoryManager().alloc(elements * sizeof(T));
+ }
- void *pinnedBufferAlloc(const size_t &bytes)
- {
- void *ptr = NULL;
- int n = getActiveDeviceId();
- // Allocate the higher megabyte. Overhead of creating pinned memory is
- // more so we want more resuable memory.
- size_t alloc_bytes = divup(bytes, 1048576) * 1048576;
-
- if (bytes > 0) {
- cl::Buffer *buf = NULL;
-
- // FIXME: Add better checks for garbage collection
- // Perhaps look at total memory available as a metric
- if (pinned_maps[n].size() >= MAX_BUFFERS || pinned_used_bytes >= MAX_BYTES) {
- pinnedGarbageCollect();
- }
+ cl::Buffer *bufferAlloc(const size_t &bytes)
+ {
+ return (cl::Buffer *)getMemoryManager().alloc(bytes);
+ }
- for(pinned_iter iter = pinned_maps[n].begin();
- iter != pinned_maps[n].end(); ++iter) {
+ template<typename T>
+ void memFree(T *ptr)
+ {
+ return getMemoryManager().unlock((void *)ptr, false);
+ }
- mem_info info = iter->second.info;
- if (!info.mngr_lock && info.bytes == alloc_bytes) {
- iter->second.info.mngr_lock = true;
- pinned_used_bytes += alloc_bytes;
- return iter->first;
- }
- }
+ void bufferFree(cl::Buffer *buf)
+ {
+ return getMemoryManager().unlock((void *)buf, false);
+ }
- try {
- buf = new cl::Buffer(getContext(), CL_MEM_ALLOC_HOST_PTR, alloc_bytes);
+ template<typename T>
+ void memFreeLocked(T *ptr, bool user_unlock)
+ {
+ getMemoryManager().unlock((void *)ptr, user_unlock);
+ }
- ptr = getQueue().enqueueMapBuffer(*buf, true, CL_MAP_READ|CL_MAP_WRITE,
- 0, alloc_bytes);
- } catch(...) {
- pinnedGarbageCollect();
- buf = new cl::Buffer(getContext(), CL_MEM_ALLOC_HOST_PTR, alloc_bytes);
+ template<typename T>
+ void memLock(const T *ptr)
+ {
+ getMemoryManager().userLock((void *)ptr);
+ }
- ptr = getQueue().enqueueMapBuffer(*buf, true, CL_MAP_READ|CL_MAP_WRITE,
- 0, alloc_bytes);
- }
- mem_info info = {true, false, alloc_bytes};
- pinned_info pt = {buf, info};
- pinned_maps[n][ptr] = pt;
- pinned_used_bytes += alloc_bytes;
- }
- return ptr;
- }
+ template<typename T>
+ void memUnlock(const T *ptr)
+ {
+ getMemoryManager().userUnlock((void *)ptr);
+ }
- void pinnedBufferFree(void *ptr)
- {
- int n = getActiveDeviceId();
- pinned_iter iter = pinned_maps[n].find(ptr);
- if (iter != pinned_maps[n].end()) {
- iter->second.info.mngr_lock = false;
- pinned_used_bytes -= iter->second.info.bytes;
- } else {
- pinnedDestroy(iter->second.buf, ptr); // Free it because we are not sure what the size is
- pinned_maps[n].erase(iter);
- }
- }
+ void deviceMemoryInfo(size_t *alloc_bytes, size_t *alloc_buffers,
+ size_t *lock_bytes, size_t *lock_buffers)
+ {
+ getMemoryManager().bufferInfo(alloc_bytes, alloc_buffers,
+ lock_bytes, lock_buffers);
+ }
- template<typename T>
- T* pinnedAlloc(const size_t &elements)
- {
- managerInit();
- return (T *)pinnedBufferAlloc(elements * sizeof(T));
- }
+ template<typename T>
+ T* pinnedAlloc(const size_t &elements)
+ {
+ return (T *)getMemoryManagerPinned().alloc(elements * sizeof(T));
+ }
- template<typename T>
- void pinnedFree(T* ptr)
- {
- return pinnedBufferFree((void *) ptr);
- }
+ template<typename T>
+ void pinnedFree(T* ptr)
+ {
+ return getMemoryManagerPinned().unlock((void *)ptr, false);
+ }
- #define INSTANTIATE(T) \
- template T* memAlloc(const size_t &elements); \
- template void memFree(T* ptr); \
- template void memFreeLocked(T* ptr, bool freeLocked); \
- template void memPop(const T* ptr); \
- template void memPush(const T* ptr); \
- template T* pinnedAlloc(const size_t &elements); \
- template void pinnedFree(T* ptr); \
+ #define INSTANTIATE(T) \
+ template T* memAlloc(const size_t &elements); \
+ template void memFree(T* ptr); \
+ template void memFreeLocked(T* ptr, bool user_unlock); \
+ template void memLock(const T* ptr); \
+ template void memUnlock(const T* ptr); \
+ template T* pinnedAlloc(const size_t &elements); \
+ template void pinnedFree(T* ptr); \
INSTANTIATE(float)
INSTANTIATE(cfloat)
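The refactoring above moves the buffer cache into a shared base class and leaves only `nativeAlloc` and `nativeFree` to each backend. The pattern can be sketched as below, under loud assumptions: `CachingManager` and `HostManager` are hypothetical names, the sketch is single-device and not thread-safe, it has no user locks, and it uses `malloc`/`free` in place of `cl::Buffer`. It is not the contents of MemoryManager.cpp, which this message does not show.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <map>

// Simplified stand-in for a shared caching base class (hypothetical).
class CachingManager
{
    struct Info { bool locked; std::size_t bytes; };
    std::map<void *, Info> buffers;  // every buffer this manager owns
    std::size_t step;                // allocation granularity in bytes

  public:
    explicit CachingManager(std::size_t step_bytes = 1024) : step(step_bytes) {}
    virtual ~CachingManager() {}

    // Backend hooks: the only parts a backend has to implement.
    virtual void *nativeAlloc(std::size_t bytes) = 0;
    virtual void nativeFree(void *ptr) = 0;

    void *alloc(std::size_t bytes)
    {
        if (bytes == 0) return nullptr;
        // Round up to a multiple of step so similar sizes share buffers.
        const std::size_t rounded = ((bytes + step - 1) / step) * step;
        // Reuse an unlocked buffer of the same rounded size if one exists.
        for (auto &entry : buffers) {
            if (!entry.second.locked && entry.second.bytes == rounded) {
                entry.second.locked = true;
                return entry.first;
            }
        }
        void *ptr = nativeAlloc(rounded);
        assert(ptr != nullptr);
        buffers[ptr] = Info{true, rounded};
        return ptr;
    }

    // Return a buffer to the cache; it stays allocated until collected.
    void unlock(void *ptr)
    {
        auto iter = buffers.find(ptr);
        if (iter != buffers.end()) iter->second.locked = false;
    }

    // Release every buffer that is not currently locked.
    void garbageCollect()
    {
        for (auto iter = buffers.begin(); iter != buffers.end();) {
            if (iter->second.locked) {
                ++iter;
            } else {
                nativeFree(iter->first);
                iter = buffers.erase(iter);
            }
        }
    }

    std::size_t cachedBuffers() const { return buffers.size(); }
};

// Host-memory backend used only to make the sketch runnable.
class HostManager : public CachingManager
{
  public:
    void *nativeAlloc(std::size_t bytes) override { return std::malloc(bytes); }
    void nativeFree(void *ptr) override { std::free(ptr); }
    // Collect here, while the derived nativeFree is still callable.
    ~HostManager() override { garbageCollect(); }
};
```

As in the diff, garbage collection runs from the derived destructor rather than the base one, because the virtual `nativeFree` no longer dispatches to the backend once the derived object has been destroyed. Buffers still locked at that point are not released by this sketch.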
diff --cc src/backend/opencl/platform.cpp
index 005f2c1,822fdfc..ef9f8f6
--- a/src/backend/opencl/platform.cpp
+++ b/src/backend/opencl/platform.cpp
@@@ -228,42 -368,37 +368,40 @@@ std::string getInfo(
<< " (OpenCL, " << get_system() << ", build " << AF_REVISION << ")" << std::endl;
unsigned nDevices = 0;
- for (auto context : DeviceManager::getInstance().mContexts) {
- vector<Device> devices = context->getInfo<CL_CONTEXT_DEVICES>();
+ for(auto &device: DeviceManager::getInstance().mDevices) {
+ const Platform platform(device->getInfo<CL_DEVICE_PLATFORM>());
+
+ string dstr = device->getInfo<CL_DEVICE_NAME>();
- for(auto &device:devices) {
- const Platform platform(device.getInfo<CL_DEVICE_PLATFORM>());
+ // Remove null termination character from the strings
+ dstr.pop_back();
- string platStr = platform.getInfo<CL_PLATFORM_NAME>();
- string dstr = device.getInfo<CL_DEVICE_NAME>();
+ bool show_braces = ((unsigned)getActiveDeviceId() == nDevices);
- // Remove null termination character from the strings
- platStr.pop_back();
- dstr.pop_back();
+ string id =
+ (show_braces ? string("[") : "-") +
+ std::to_string(nDevices) +
+ (show_braces ? string("]") : "-");
- bool show_braces = ((unsigned)getActiveDeviceId() == nDevices);
- string id = (show_braces ? string("[") : "-") + std::to_string(nDevices) +
- (show_braces ? string("]") : "-");
- info << id << " " << platformMap(platStr) << ": " << ltrim(dstr) << " ";
+ info << id << " " << getPlatformName(*device) << ": " << ltrim(dstr);
#ifndef NDEBUG
- string devVersion = device.getInfo<CL_DEVICE_VERSION>();
- string driVersion = device.getInfo<CL_DRIVER_VERSION>();
- devVersion.pop_back();
- driVersion.pop_back();
- info << devVersion;
- info << " Device driver " << driVersion;
- info << " FP64 Support("
- << (device.getInfo<CL_DEVICE_PREFERRED_VECTOR_WIDTH_DOUBLE>()>0 ? "True" : "False")
- << ")";
+ info << " -- ";
+ string devVersion = device->getInfo<CL_DEVICE_VERSION>();
+ string driVersion = device->getInfo<CL_DRIVER_VERSION>();
+ devVersion.pop_back();
+ driVersion.pop_back();
+ info << devVersion;
+ info << " -- Device driver " << driVersion;
+ info << " -- FP64 Support: "
+ << (device->getInfo<CL_DEVICE_PREFERRED_VECTOR_WIDTH_DOUBLE>()>0 ? "True" : "False")
+ << "";
++ info << "Unified Memory("
++ << (isHostUnifiedMemory(*device) ? "True" : "False")
++ << ")";
#endif
- // TODO Move this inside debug
- info << "Unified Memory("
- << (isHostUnifiedMemory(device) ? "True" : "False")
- << ")";
- info << std::endl;
+ info << std::endl;
- nDevices++;
- }
+ nDevices++;
}
return info.str();
}
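The device label built in `getInfo` above marks the active device with brackets and every other device with dashes. A standalone sketch of just that formatting step follows; the helper name `deviceLabel` is hypothetical and does not appear in platform.cpp.

```cpp
#include <cassert>
#include <string>

// Format a device index the way the loop above does:
// "[n]" for the active device, "-n-" for all others.
std::string deviceLabel(unsigned index, unsigned active)
{
    const bool show_braces = (index == active);
    return (show_braces ? std::string("[") : std::string("-")) +
           std::to_string(index) +
           (show_braces ? std::string("]") : std::string("-"));
}
```

For example, `assert(deviceLabel(0, 0) == "[0]");` and `assert(deviceLabel(1, 0) == "-1-");` both hold.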
--
Alioth's /usr/local/bin/git-commit-notice on /srv/git.debian.org/git/debian-science/packages/arrayfire.git