Hello,
I have encountered what I believe is a bug in AMD's OpenCL implementation of clCreateBuffer when it is called with the CL_MEM_COPY_HOST_PTR flag.
Here is a minimal program that makes the AMD OpenCL runtime crash:
#include <CL/cl.hpp>
#include <ppl.h>

cl::CommandQueue queue = cl::CommandQueue::getDefault();

void run() {
    cl_int value = 0;
    cl::Buffer buff(CL_MEM_COPY_HOST_PTR, sizeof(cl_int), &value);
    queue.enqueueReadBuffer(buff, CL_TRUE, 0, sizeof(cl_int), &value);
}

int main() {
    for (int i = 0; i < 10000; ++i) {
        concurrency::parallel_invoke(&run, &run, &run, &run, &run, &run, &run, &run, &run, &run);
        queue.finish();
    }
}
When compiled and run, the program crashes with varying errors, such as access violations or attempts to double-free memory.
Instead of PPL, one can use TBB's parallel_invoke with the same effect.
I have a Radeon R9 290 + Opteron 6234 machine running 64-bit Windows 8.1 and Microsoft Visual Studio 2013 Update 3 (the just-released Update 4 behaves the same, though I think the compiler version is irrelevant). I currently use the 14.11.1 beta driver, but I have tried a range of other drivers, including (if I remember correctly) 14.4 WHQL, 14.9 WHQL, 14.9.1 beta and 14.9.2 beta.
Removing the CL_MEM_COPY_HOST_PTR flag (and using enqueueFillBuffer or enqueueWriteBuffer to initialize the buffer) makes the problem go away. Changing the main loop to call the 'run' function sequentially also makes the crashes disappear.
I have tested the same code with Intel's runtime on a Core i7-3667U laptop, where it executes correctly on the HD 4000.
The workaround is actually easy: use clEnqueueFillBuffer or clEnqueueWriteBuffer instead. The error, however, is extremely hard to track down. In the original code base where I encountered this problem, there are no compiler warnings (with /W3), and no AMD, Intel or Microsoft static or dynamic analysis tool reported anything. The crash happens at a random position inside amdocl.dll, and because no PDB file is available, even the stack trace is wrong or useless.
I would like to ask AMD to:
a) Fix the crash.
b) Have CodeXL's analyzer/debugger (or any relevant tool) warn about the use of CL_MEM_COPY_HOST_PTR if it is not considered best practice.