Intel® FPGA SDK for OpenCL™ Standard Edition: Programming Guide

ID 683342
Date 4/22/2019
Public
Document Table of Contents

5.5.5.7.1. Defining Memory Consistency Across Kernels When Using Pipes

According to the OpenCL™ Specification version 2.0, memory behavior is undefined unless a kernel completes execution. A kernel must finish executing before other kernels can visualize any changes in memory behavior. However, kernels that use pipes can share data through common global memory buffers and synchronized memory accesses. To ensure that data written to a pipe is visible to the read pipe after execution passes a memory fence, define memory consistency across kernels with respect to memory fences.
To create a control flow dependency between the pipe synchronization calls and the memory operations, add the CLK_GLOBAL_MEM_FENCE flag to the mem_fence call.
For example:
__kernel void producer (__global const uint * restrict src,
                        const uint iterations,
                        write_only pipe uint __attribute__((blocking)) c0,
                        write_only pipe uint __attribute__((blocking)) c1)
{
    for (int i = 0; i < iterations; i++)
    {
        write_pipe(c0, &src[2*i]);
        mem_fence(CLK_CHANNEL_MEM_FENCE | CLK_GLOBAL_MEM_FENCE);
        write_pipe(c1, &src[2*i+1]);
    }
}

In this kernel, the mem_fence function ensures that the write operation to c0 and memory access to src[2*i] occur before the write operation to c1 and memory access to src[2*i+1]. This allows data written to c0 to be visible to the read pipe before data is written to c1.