4.1. Transferring Data Via Intel® FPGA SDK for OpenCL™ Channels or OpenCL Pipes
Sometimes, FPGA-to-global memory bandwidth constrains the data transfer efficiency between kernels. The theoretical maximum FPGA-to-global memory bandwidth varies depending on the number of global memory banks available in the targeted Custom Platform and board. To determine the theoretical maximum bandwidth for your board, refer to your board vendor's documentation.
In practice, a kernel does not achieve 100% utilization of the maximum global memory bandwidth available. The level of utilization depends on the access pattern of the algorithm.
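If global memory bandwidth is a performance constraint for your OpenCL kernel, first try to break down the algorithm into multiple smaller kernels. Second, as shown in the figure below, eliminate some of the global memory accesses by implementing the SDK's channels or OpenCL pipes for data transfer between kernels. Data that passes through a channel or pipe moves directly from one kernel to the next, so the intermediate results no longer have to be written to global memory and read back.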
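To make the saving concrete, the following is a minimal sketch in plain host-side C (not kernel code) that counts global memory traffic for a hypothetical two-kernel pipeline. It assumes a producer kernel and a consumer kernel that each touch every element exactly once. The function names, and the peak bandwidth and utilization parameters, are illustrative assumptions and are not part of the SDK.

```c
#include <stddef.h>

/* Bytes moved through global memory by a hypothetical two-kernel pipeline
 * (producer -> consumer) that processes n_items elements of item_bytes each.
 * Assumption: each kernel reads or writes every element exactly once. */

/* Intermediate results are staged in a global memory buffer:
 * producer read + producer write + consumer read + consumer write. */
size_t traffic_via_global_memory(size_t n_items, size_t item_bytes)
{
    return 4 * n_items * item_bytes;
}

/* Intermediate results travel through a channel or pipe instead:
 * only the producer read and the consumer write touch global memory. */
size_t traffic_via_channel(size_t n_items, size_t item_bytes)
{
    return 2 * n_items * item_bytes;
}

/* Lower bound on transfer time in seconds, given the board's theoretical
 * peak bandwidth (bytes per second) and the fraction of that peak the
 * access pattern actually achieves (0 < utilization <= 1). */
double min_transfer_seconds(size_t bytes, double peak_bytes_per_s,
                            double utilization)
{
    return (double)bytes / (peak_bytes_per_s * utilization);
}
```

Under these assumptions, a pipeline over 1,048,576 four-byte elements moves 16 MiB through global memory when the intermediate buffer lives there, and 8 MiB when the two kernels are connected by a channel or pipe. Passing either byte count to min_transfer_seconds, together with the peak bandwidth from your board vendor's documentation and an estimated utilization, gives a rough lower bound on the time spent on global memory transfers.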
For more information about the usage of channels, refer to the Implementing Intel® FPGA SDK for OpenCL™ Channels Extension section of the Intel® FPGA SDK for OpenCL™ Programming Guide.
For more information about the usage of pipes, refer to the Implementing OpenCL Pipes section of the Intel® FPGA SDK for OpenCL™ Programming Guide.