Creating Heterogeneous Memory Systems in Intel® FPGA SDK for OpenCL Custom Platforms

ID 683654
Date 12/13/2016
Public

1.4. Modifying the Boardtest Program and the Host Code for Your Heterogeneous Memory Solution

Use the boardtest.cl kernel that comes with the Intel® FPGA SDK for OpenCL™ Custom Platform Toolkit to test the functionality and performance of your Custom Platform.

The boardtest program is an OpenCL kernel that allows you to test host-to-device bandwidth, memory bandwidth, and general functionality of your Custom Platform.

  1. Browse to the <path to SDK installation>/board/custom_platform_toolkit/tests/boardtest directory.
  2. Open the boardtest.cl file in a text editor and assign a buffer location to each global memory argument.
    For example:
    __kernel void 
    mem_stream (__global __attribute__((buffer_location("DDR"))) uint *src,
                __global __attribute__((buffer_location("QDR"))) uint *dst,
                uint arg, uint arg2)

    Here, uint *src is assigned to DDR memory, and uint *dst is assigned to QDR memory. The board_spec.xml file specifies the characteristics of both memory systems.

  3. To leverage your heterogeneous memory solution in your OpenCL system, modify your host code by adding the CL_MEM_HETEROGENEOUS_INTELFPGA flag to your clCreateBuffer call.
    For example:
    ddatain = clCreateBuffer(context, 
                      CL_MEM_READ_WRITE | memflags | CL_MEM_HETEROGENEOUS_INTELFPGA,
                      sizeof(unsigned) * vectorSize,
                      NULL,
                      &status);

    Intel® strongly recommends that you set the buffer location as a kernel argument before writing the buffer. With a single global memory, you can write a buffer either before or after assigning it to a kernel argument. In a heterogeneous memory system, the host should set the buffer location before writing the buffer. In other words, the host calls the clSetKernelArg function before calling the clEnqueueWriteBuffer function.

    In your host code, invoke the clCreateBuffer, clSetKernelArg, and clEnqueueWriteBuffer calls in the following order:

    ddatain = clCreateBuffer(context,
                      CL_MEM_READ_WRITE | memflags | CL_MEM_HETEROGENEOUS_INTELFPGA,
                      sizeof(unsigned) * vectorSize, NULL, &status); 
    
    … 
    
    status = clSetKernelArg(kernel[k], 0, sizeof(cl_mem), (void*)&ddatain); 
    
    … 
    
    status = clEnqueueWriteBuffer(queue, ddatain, CL_FALSE, 0,
                       sizeof(unsigned) * vectorSize, hdatain, 0, NULL, NULL);

    The ALTERAOCLSDKROOT/board/custom_platform_toolkit/tests/boardtest/host/memspeed.cpp file calls these functions in a similar order.

  4. After you modify the boardtest.cl file and the host code, compile the host and kernel code and verify their functionality.
    When compiling your kernel code, you must disable burst-interleaving of all memory systems by including the --no-interleaving <global_memory_type> option in the aoc command.
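
The call order from step 3 is easy to break when host code grows. One way to catch a violation during development is a small host-side guard that records which buffers have been bound to a kernel argument and flags a write to any buffer that has not. The sketch below is only an illustration: bind_tracker, tracker_init, tracker_mark_bound, tracker_may_write, and MAX_TRACKED are hypothetical names and are not part of the SDK or of memspeed.cpp. Buffer handles are stored as plain pointers, so the block compiles without the OpenCL headers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical debugging aid, NOT part of the SDK. It remembers which
 * buffer handles have already been passed to clSetKernelArg. */
#define MAX_TRACKED 64  /* assumed upper bound on tracked buffers */

typedef struct {
    const void *bound[MAX_TRACKED];  /* handles already bound to an argument */
    size_t count;                    /* number of valid entries in bound[] */
} bind_tracker;

/* Start with no buffers recorded. */
static void tracker_init(bind_tracker *t) { t->count = 0; }

/* Return true if buf has been recorded as bound. */
static bool tracker_is_bound(const bind_tracker *t, const void *buf)
{
    for (size_t i = 0; i < t->count; ++i) {
        if (t->bound[i] == buf) {
            return true;
        }
    }
    return false;
}

/* Record buf as bound. Intended to be called right after a successful
 * clSetKernelArg on that buffer. Returns false if the table is full. */
static bool tracker_mark_bound(bind_tracker *t, const void *buf)
{
    if (tracker_is_bound(t, buf)) {
        return true;
    }
    if (t->count == MAX_TRACKED) {
        return false;
    }
    t->bound[t->count++] = buf;
    return true;
}

/* Check buf before writing to it. Intended to be called right before
 * clEnqueueWriteBuffer. Returns false and prints a warning if the buffer
 * has not been bound to a kernel argument yet. */
static bool tracker_may_write(const bind_tracker *t, const void *buf)
{
    if (!tracker_is_bound(t, buf)) {
        fprintf(stderr, "warning: buffer written before clSetKernelArg\n");
        return false;
    }
    return true;
}
```

In host code that follows step 3, you might call tracker_mark_bound(&tracker, ddatain) after clSetKernelArg returns CL_SUCCESS, and check tracker_may_write(&tracker, ddatain) before clEnqueueWriteBuffer. A cl_mem handle is a pointer type, so it converts to const void * without a cast.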