5.4.4. Restrictions in the Implementation of Intel® FPGA SDK for OpenCL™ Channels Extension
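Multiple Channel Call Site
A kernel can read from the same channel multiple times, as in the following example. However, multiple kernels cannot read from the same channel. Similarly, a kernel can write to the same channel multiple times, but multiple kernels cannot write to the same channel.
__kernel void k1() {
    read_channel_intel(channel1);
    read_channel_intel(channel1);
    read_channel_intel(channel1);
}
The offline compiler cannot compile the following code, in which two kernels write to the same channel, and issues an error:
__kernel void k1() {
    write_channel_intel(channel1, 1);
}
__kernel void k2() {
    write_channel_intel(channel1, 2);
}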
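If two kernels each need to send data, one way to avoid having more than one kernel write to the same channel is to give each producer its own channel and let a single kernel read both. The following is a minimal sketch of that arrangement, not a prescribed pattern: channel2 and the consumer kernel are hypothetical names introduced for illustration. The guarded macros at the top are host-side stand-ins (an assumption of this sketch) so that the file also compiles as plain C for a quick check; they model each channel as a single int slot and do not model blocking or FIFO depth.

```c
#ifdef __OPENCL_VERSION__
#pragma OPENCL EXTENSION cl_intel_channels : enable
#else
// Assumption: host-side stand-ins so the sketch also compiles as plain C.
// Each channel is modeled as one int slot; blocking is not modeled.
#define __kernel
#define __global
#define channel
#define read_channel_intel(c) (c)
#define write_channel_intel(c, v) ((void)((c) = (v)))
#endif

channel int channel1;  // written only by k1
channel int channel2;  // hypothetical second channel, written only by k2

__kernel void k1(void) {
    write_channel_intel(channel1, 1);
}

__kernel void k2(void) {
    write_channel_intel(channel2, 2);
}

// Hypothetical consumer: the only kernel that reads either channel.
__kernel void consumer(__global int *dst) {
    dst[0] = read_channel_intel(channel1);
    dst[1] = read_channel_intel(channel2);
}
```

With this layout, each channel has exactly one writing kernel and one reading kernel.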
Feedback and Feed-forward Channels
Performance of a kernel that has multiple accesses (reads or writes) to the same channel might be poor.
Static Indexing
The Intel® FPGA SDK for OpenCL™ channels extension supports indexing into arrays of channel IDs, but dynamic indexing (an index that is not known at compile time) leads to inefficient hardware.
Consider the following example:
channel int ch[WORKGROUP_SIZE];
__kernel void consumer()
{
    int gid = get_global_id(0);
    int value = read_channel_intel(ch[gid]);
    //statements
}
Compilation of this example generates the following warning message:
Compiler Warning: Dynamic access into channel array ch was expanded into predicated static accesses on every channel of the array.
If the access is dynamic and you know that only a subset of the channels in the array can be accessed, you can generate slightly more efficient hardware with a switch statement:
channel int ch[WORKGROUP_SIZE];
__kernel void consumer() {
    int gid = get_global_id(0);
    int value;
    switch(gid)
    {
        case 0: value = read_channel_intel(ch[0]); break;
        case 2: value = read_channel_intel(ch[2]); break;
        case 3: value = read_channel_intel(ch[3]); break;
        // cases for any other channels that can be accessed
        case WORKGROUP_SIZE-1: value = read_channel_intel(ch[WORKGROUP_SIZE-1]); break;
    }
    //statements
}
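For comparison, the following sketch uses only compile-time constant indices, so every channel call names exactly one channel of the array. The expectation, which this sketch does not verify against the offline compiler, is that such accesses are treated as static and do not trigger the dynamic-access warning. NUM_CHANNELS, producer_static, and consumer_static are hypothetical names. The guarded macros are host-side stand-ins (an assumption of this sketch) so that the file also compiles as plain C; they model each channel as a single int slot and do not model blocking.

```c
#ifdef __OPENCL_VERSION__
#pragma OPENCL EXTENSION cl_intel_channels : enable
#else
// Assumption: host-side stand-ins so the sketch also compiles as plain C.
// Each channel is modeled as one int slot; blocking is not modeled.
#define __kernel
#define __global
#define channel
#define read_channel_intel(c) (c)
#define write_channel_intel(c, v) ((void)((c) = (v)))
#endif

#define NUM_CHANNELS 4  // hypothetical array size

channel int ch[NUM_CHANNELS];

// Hypothetical producer: writes one value to each channel by constant index.
__kernel void producer_static(__global const int *src) {
    write_channel_intel(ch[0], src[0]);
    write_channel_intel(ch[1], src[1]);
    write_channel_intel(ch[2], src[2]);
    write_channel_intel(ch[3], src[3]);
}

// Hypothetical consumer: every index is a literal, not a run-time value.
__kernel void consumer_static(__global int *dst) {
    dst[0] = read_channel_intel(ch[0]);
    dst[1] = read_channel_intel(ch[1]);
    dst[2] = read_channel_intel(ch[2]);
    dst[3] = read_channel_intel(ch[3]);
}
```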
Kernel Vectorization Support
You cannot vectorize kernels that use channels; that is, do not include the num_simd_work_items kernel attribute in your kernel code. Vectorizing a kernel that uses channels creates multiple channel accesses inside the same kernel and requires arbitration, which negates the advantages of vectorization. As a result, the SDK's channel extension does not support kernel vectorization.
Instruction-Level Parallelism on read_channel_intel and write_channel_intel Calls
If no data dependencies exist between read_channel_intel and write_channel_intel calls, the offline compiler attempts to execute these instructions in parallel. As a result, the offline compiler might execute these read_channel_intel and write_channel_intel calls in an order that does not follow the sequence expressed in the OpenCL kernel code.
Consider the following code sequence:
in_data1 = read_channel_intel(channel1);
in_data2 = read_channel_intel(channel2);
in_data3 = read_channel_intel(channel3);
Because there are no data dependencies between the read_channel_intel calls, the offline compiler can execute them in any order.
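Where a particular order is required, a memory fence between the channel calls is one way to express it. The sketch below assumes the mem_fence(CLK_CHANNEL_MEM_FENCE) call that the SDK documentation describes elsewhere for ordering channel calls; confirm its availability in your SDK version. The ordered_reads kernel and its dst argument are hypothetical names. The guarded macros are host-side stand-ins (an assumption of this sketch) so that the file also compiles as plain C; in that build the fence is a no-op and each channel is a single int slot.

```c
#ifdef __OPENCL_VERSION__
#pragma OPENCL EXTENSION cl_intel_channels : enable
#else
// Assumption: host-side stand-ins so the sketch also compiles as plain C.
// Channels are single int slots and the fence does nothing on the host.
#define __kernel
#define __global
#define channel
#define CLK_CHANNEL_MEM_FENCE 0
#define mem_fence(flags) ((void)(flags))
#define read_channel_intel(c) (c)
#endif

channel int channel1;
channel int channel2;
channel int channel3;

// Hypothetical kernel: each fence separates the channel call before it
// from the channel call after it, so the reads keep their written order.
__kernel void ordered_reads(__global int *dst) {
    int in_data1 = read_channel_intel(channel1);
    mem_fence(CLK_CHANNEL_MEM_FENCE);
    int in_data2 = read_channel_intel(channel2);
    mem_fence(CLK_CHANNEL_MEM_FENCE);
    int in_data3 = read_channel_intel(channel3);

    dst[0] = in_data1;
    dst[1] = in_data2;
    dst[2] = in_data3;
}
```

The fences constrain scheduling only, so use them only where the order of channel calls actually matters to the design.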