I have a kernel that does something similar to a map/reduce. When run on the GPU, it works correctly. However, when I change the device_type argument of clGetDeviceIDs to CL_DEVICE_TYPE_CPU, the application crashes with a segmentation fault at runtime. No other code is changed.
A Valgrind log with memory tracking is attached. It is also available here: http://pastebin.com/x9yi56Sr
$ uname -a
Linux machine 3.14.2-1-ARCH #1 SMP PREEMPT Sun Apr 27 11:28:44 CEST 2014 x86_64 GNU/Linux
$ pacman -Qi opencl-headers | grep Version
Version : 2:1.1.20110526-1
$ sha1sum /usr/lib/libamdocl64.so
8478fb06f0557be9fcd16cd1728e5a199a0e4fee /usr/lib/libamdocl64.so
$ pacman -Qo /usr/lib/libamdocl64.so
/usr/lib/libamdocl64.so is owned by catalyst-test 14.4-11
$ lspci | grep -i VGA
05:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Turks GL [FirePro V4900]
$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 24
On-line CPU(s) list: 0-23
Thread(s) per core: 2
Core(s) per socket: 6
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 45
Model name: Intel(R) Xeon(R) CPU E5-2630 0 @ 2.30GHz
Stepping: 7
CPU MHz: 1244.875
CPU max MHz: 2800.0000
CPU min MHz: 1200.0000
BogoMIPS: 4591.39
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 15360K
NUMA node0 CPU(s): 0-5,12-17
NUMA node1 CPU(s): 6-11,18-23