Best way to pass 3D vector fields to the kernel
Hello, What's the best / most efficient way to pass three dimensional vector fields to the kernel? My current solution is to pass the data (float or double arrays) as a 1D array in this order: X_000...
AMD OpenCL - compiler segmentation fault
Hello All, I have recently been testing my OpenCL code on an AMD HD7970 GPU, and some of my kernels are causing the compiler to crash with a segmentation fault at clBuildProgram(). I would like to...
OpenGL / OpenCL interop with shared contexts and multithreading
Hi, I am working on a project using OpenCL / OpenGL interoperability and multi-threading. Thread1 is used just for rendering a VBO and Thread2 is used for running an OpenCL kernel which processes geometry...
Any improvements in compute-transfer overlap in APP SDK 2.8?
In APP SDK 2.6 async copy preview was implemented (via setting GPU_ASYNC_MEM_COPY to 2), but I failed to make it work (no clear examples and documentation on the feature; I'm probably doing something...
OpenCL compiler crashes, example with 2-line kernel
I have problems compiling some of my kernels for AMD GPUs; no problems with NVIDIA OpenCL. Platform used: OpenCL 1.2 AMD-APP (1084.4), Juniper HD5700, Win7-64. Minimal kernel extracted from a bigger one...
Error in Bubble Sort Program
I've written a bubble sort program. The createProgram function is giving an error, which I've attached as a PNG image along with the host code. My kernel looks like: __kernel void...
User needs help with OpenCL w/13.1 driver on Windows XP
Greetings: I've been using OpenCL for running BOINC (think SETI@home) on my PC for years. I've got a Radeon HD5750 in a Windows XP system, the sole video card. Recently, I was required to upgrade...
access violation aticaldd.dll
Hello all, I would like to ask for help with an access violation while running an application. Building the OpenCL code in function: cl_int status = clBuildProgram(program, devicesCount, devices, options,...
Passing structs as kernel arguments crashed using APP SDK 2.8
Hi, My kernel has many arguments, which are grouped into two structs for convenience. Passing a single struct as an argument to the kernel is OK, but passing two of them caused a crash using APP SDK 2.8. It works...
ioctl permissions on Linux
I'm running OpenCL AMD-APP-SDK-v2.8-lnx64.tgz with FirePro_9.003.3-Linux-x32x64-151130.zip on 4 FirePro 3D V5800. Device 0 : Juniper Device ID is 0x1bb83f0 Device 1 : Juniper Device ID is 0x1c88580 Device...
Multiple contexts parallel allocating or writing to memory of a single device
Hello, I have a program which uses OpenMP to schedule work in parallel to one OpenCL device, i.e. a GPU. This is done right now by using multiple contexts, which have their own unique queues and...
clAmdBlasDgemv -- InvalidLeadDimA
Hello, I'm having some trouble using the BLAS API from the clAmdBlas library. I have a simple matrix-vector multiplication that I want to process using the following call: rc =...
How to do parallel reduction correctly?
Hi, I am trying to write some kernels which use local memory and parallel reduction, and I have run into some problems. My kernel tries to evaluate the min value and the max value of an array....
Problem with memory in OpenCL 1.2
Hello, everyone! I have a big problem with memory on output from the kernel. I use OpenCL 1.2 for parallel programming on the CPU. As input I have an OpenCL buffer with ENNInput structures. Every ENNInput...
fglrx fails to load on false-positive switchable graphics TOSHIBA laptop
Hi! I decided to write to this forum because I have an issue that is driving me crazy. Only two months ago I bought a new TOSHIBA laptop, a Satellite L850-1PD PSKG8E-U05500UIT...
read_image performance
Hi there, Why is the read_imageui API always translated into 2 calls of sample_id()_sampler() with a conditional branch? For example, this simple code: __private int2 coords; __read_only...
arbitrary size matrix multiplication
Does anyone know a fast arbitrary-size matrix multiplication algorithm/code for the GPU? The matrix multiplication from the SDK seems to work only when the input matrix size is a multiple of 16. For example, if...
Small temporary arrays in OpenCL
Hi, Does OpenCL take advantage of the following techniques when using small local arrays? - On VLIW -> indexed_temp_arrays (x0[n]) (aka R55[A0.x] indirect register addressing in ISA) - On GCN ->...
No access to second GPGPU without monitor (need a dummy plug)??
In my first post I asked about mixed-vendor OpenCL on Linux, not just here but elsewhere. The resolute silence suggested that what I was trying to do was too cutting edge. Maybe someone knows how to do...