Intel® FPGA SDK for OpenCL™ Pro Edition: Programming Guide

ID 683846
Date 3/28/2022

6.2.1. Partitioning Buffers Across Multiple Interfaces of the Same Memory Type

Before you partition the memory across multiple interfaces of the same memory type, you must first disable burst-interleaving during OpenCL™ kernel compilation. Then, in the host application, you must specify the memory bank to which you allocate the OpenCL buffer.
Tip: For oneAPI SYCL-specific instructions, refer to the Global Memory Accesses Optimization topic in the Intel® oneAPI DPC++ FPGA Optimization Guide.

By default, the Intel® FPGA SDK for OpenCL™ Offline Compiler configures each global memory type in a burst-interleaved fashion. Usually, the burst-interleaving configuration leads to the best load balancing between the memory banks. However, there might be situations where it is more efficient to partition the memory into non-interleaved regions.

The figure below illustrates the differences between burst-interleaved and non-interleaved memory partitions.
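
The two layouts can also be sketched in code. The following is a simplified model, not the SDK's actual address decoder: it assumes two banks, and the burst size and bank capacity (BURST_BYTES, BANK_BYTES) are hypothetical values chosen only for illustration. The real values depend on the board. The function names are likewise hypothetical.

```c
#include <stdint.h>

/* Illustrative values only; real burst size and bank capacity are board-specific. */
#define NUM_BANKS   2u
#define BURST_BYTES 1024u                 /* hypothetical burst size */
#define BANK_BYTES  ((uint64_t)1 << 30)   /* hypothetical capacity: 1 GiB per bank */

/* Burst-interleaved: consecutive bursts alternate between the banks.
 * Returns a 0-based bank index (0 is the first bank, 1 is the second). */
unsigned bank_interleaved(uint64_t addr)
{
    return (unsigned)((addr / BURST_BYTES) % NUM_BANKS);
}

/* Non-interleaved: each bank owns one contiguous address region,
 * with the first bank at the lowest addresses. */
unsigned bank_partitioned(uint64_t addr)
{
    return (unsigned)(addr / BANK_BYTES);
}
```

In this model, a buffer that spans several bursts is spread over both banks when interleaving is on, but stays within one bank when the memory is partitioned, provided the buffer does not cross the bank boundary.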

To manually partition some or all of the available global memory types, perform the following tasks:

  1. Compile your OpenCL kernel using the -no-interleaving=<global_memory_type> flag to configure the memory bank(s) of the specified memory type as separate addresses. For more information about the use of the -no-interleaving=<global_memory_type> flag, refer to the Disabling Burst-Interleaving of Global Memory (-no-interleaving=<global_memory_type>) section.
  2. Create an OpenCL buffer in your host application, and allocate the buffer to one of the banks using the CL_CHANNEL flags.
    • Specify CL_CHANNEL_1_INTELFPGA to allocate the buffer to the lowest available memory region.
    • Specify CL_CHANNEL_2_INTELFPGA to allocate the buffer to the second bank (if available).
    Attention: Allocate each buffer to a single memory bank only. If the second bank is not available at runtime, the memory is allocated to the first bank. If no global memory is available, the clCreateBuffer call fails with the error code CL_MEM_OBJECT_ALLOCATION_FAILURE.