
Bug in Catalyst drivers version 12.10 and later on Radeon HD 7000 series cards

A subtle bug appears in the latest Catalyst drivers when an OpenCL kernel accesses __local memory with a constant index.

Conditions: Within each work-group, the bug only appears for work-items where get_local_id(0) == 0 and get_local_id(1) is 0 or 14, using a (16, 16, 1) work-group size.

Example:

__local float s_srcPatch[10][10];
__local float s_dstPatch[20][16];
float sum;

sum =       (evenFlag * 0.0625f) * s_srcPatch[0][1 + ((tidx - 2) / 2)];
sum = sum + ( oddFlag * 0.25f  ) * s_srcPatch[0][1 + ((tidx - 1) / 2)];
sum = sum + (evenFlag * 0.375f ) * s_srcPatch[0][1 + ((tidx    ) / 2)];
sum = sum + ( oddFlag * 0.25f  ) * s_srcPatch[0][1 + ((tidx + 1) / 2)];
sum = sum + (evenFlag * 0.0625f) * s_srcPatch[0][1 + ((tidx + 2) / 2)]; // Operation 5

s_dstPatch = sum;

In the case above, Operation 5 is ignored, so the final sum is incorrect.

__local float s_srcPatch[10][10];
__local float s_dstPatch[20][16];
float sum;

sum =       (evenFlag * 0.0625f) * s_srcPatch[get_local_size(1) - 16][1 + ((tidx - 2) / 2)];
sum = sum + ( oddFlag * 0.25f  ) * s_srcPatch[get_local_size(1) - 16][1 + ((tidx - 1) / 2)];
sum = sum + (evenFlag * 0.375f ) * s_srcPatch[get_local_size(1) - 16][1 + ((tidx    ) / 2)];
sum = sum + ( oddFlag * 0.25f  ) * s_srcPatch[get_local_size(1) - 16][1 + ((tidx + 1) / 2)];
sum = sum + (evenFlag * 0.0625f) * s_srcPatch[get_local_size(1) - 16][1 + ((tidx + 2) / 2)]; // get_local_size(1) == 16, so the row index is 0, the same as in the code above

s_dstPatch = sum;

If I use the expression "get_local_size(1) - 16" instead of the constant index 0, everything works and I get the correct sum.
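
For anyone trying to reproduce this, a host-side reference can make the failure easier to spot. The following is a minimal sketch in plain C, not the original test code. The names reference_sum, drop_op5, PATCH_W and GROUP_W are made up for illustration, tidx is assumed to be a signed int in the range 0..15, and evenFlag/oddFlag are assumed to be 1.0f for even/odd tidx respectively (the kernel code that sets them is not shown above).

```c
#include <assert.h>

#define PATCH_W 10   /* width of one s_srcPatch row, as declared in the kernel */
#define GROUP_W 16   /* assumed work-group width, taken from the (16, 16, 1) size */

/* Host-side reference for the five-term sum over one source row.
 * Assumption: evenFlag is 1 for even tidx and 0 otherwise, oddFlag the
 * opposite; the kernel code that computes the flags is not shown.
 * A non-zero drop_op5 leaves out the fifth term, mimicking the reported bug. */
static float reference_sum(const float row[PATCH_W], int tidx, int drop_op5)
{
    assert(tidx >= 0 && tidx < GROUP_W);

    float evenFlag = (tidx % 2 == 0) ? 1.0f : 0.0f;
    float oddFlag  = 1.0f - evenFlag;

    float sum;
    sum =       (evenFlag * 0.0625f) * row[1 + ((tidx - 2) / 2)];
    sum = sum + ( oddFlag * 0.25f  ) * row[1 + ((tidx - 1) / 2)];
    sum = sum + (evenFlag * 0.375f ) * row[1 + ((tidx    ) / 2)];
    sum = sum + ( oddFlag * 0.25f  ) * row[1 + ((tidx + 1) / 2)];
    if (!drop_op5) {
        /* Operation 5 */
        sum = sum + (evenFlag * 0.0625f) * row[1 + ((tidx + 2) / 2)];
    }
    return sum;
}
```

Comparing the value read back from the device against reference_sum(row, tidx, 0) and reference_sum(row, tidx, 1) indicates whether a mismatch corresponds to the missing fifth term. With a row filled with 1.0f and an even tidx, the two reference values are 0.5 and 0.4375.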

