Intel® FPGA SDK for OpenCL™ Pro Edition: Programming Guide

ID 683846
Date 3/28/2022
Public

6.6.4.1. Programming Multiple FPGA Devices

If you install multiple FPGA devices in your system, you can direct the host runtime to program a specific FPGA device by modifying your host code.
Important: Linking your host application to the FPGA Client Driver (FCD) allows you to target multiple FPGA devices from different Custom Platforms. However, this feature has limited support for Custom Platforms that are compatible with SDK versions prior to 16.1.

You can present up to 128 FPGA devices to your system in any of the following configurations:

  • Multiple FPGA accelerator boards, each consisting of a single FPGA.
  • Multiple FPGAs on a single accelerator board that connects to the host system via a PCIe® switch.
  • Combinations of the above.

The host runtime can load kernels onto each of the FPGA devices, which can then operate in parallel.

Probing the OpenCL FPGA Devices

The host must identify the number of OpenCL™ FPGA devices installed into the system.
  1. To query a list of FPGA devices installed in your machine, invoke the aocl diagnose command.
  2. To direct the host to identify the number of OpenCL FPGA devices, add the following lines of code to your host application:
    //Get the platform
    ciErrNum = clGetPlatformIDs(1, &cpPlatform, NULL);
    
    //Get the devices
    ciErrNum = clGetDeviceIDs(cpPlatform,
                              CL_DEVICE_TYPE_ALL,
                              0,
                              NULL,
                              &ciDeviceCount);
    cdDevices = (cl_device_id * )malloc(ciDeviceCount * sizeof(cl_device_id));
    ciErrNum = clGetDeviceIDs(cpPlatform, 
                              CL_DEVICE_TYPE_ALL,
                              ciDeviceCount,
                              cdDevices,
                              NULL);
    
For example, on a system with two OpenCL FPGA devices, ciDeviceCount has a value of 2, and cdDevices contains a list of two device IDs (cl_device_id).

Querying Device Information

You can direct the host to query information on your OpenCL™ FPGA devices.
To direct the host to output a list of OpenCL FPGA devices installed into your system, add the following lines of code to your host application:
char buf[1024];
for (unsigned i = 0; i < ciDeviceCount; i++)
{
    clGetDeviceInfo(cdDevices[i], CL_DEVICE_NAME, sizeof(buf), buf, NULL);
    printf("Device %u: '%s'\n", i, buf);
}
When you query the device information, the host lists your FPGA devices in the following manner:

Device <N>: <board_name>: <name_of_FPGA_board>

Where:

  • <N> is the device number.
  • <board_name> is the board designation you use to target your FPGA device when you invoke the aoc command.
  • <name_of_FPGA_board> is the advertised name of the FPGA board.

For example, if you have two identical FPGA boards on your system, the host generates an output that resembles the following:

Device 0: board_1: Stratix V FPGA Board
Device 1: board_1: Stratix V FPGA Board
Note: The clGetDeviceInfo function returns the board type (for example, board_1) that the Intel® FPGA SDK for OpenCL™ Offline Compiler lists on-screen when you invoke the aoc -list-boards command. If your accelerator board contains more than one FPGA, each device is treated as a "board" and is given a unique name.
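If the host needs the board designation separately from the advertised board name, the returned string can be split. The following is a minimal sketch, assuming the name follows the <board_name>: <name_of_FPGA_board> format described above and that the first ": " is the separator. The split_device_name helper is hypothetical and not part of the SDK, and the exact string that a given Custom Platform returns may differ.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not an SDK function).
 * Splits a device-name string of the assumed form
 * "<board_name>: <name_of_FPGA_board>" at the first ": " separator.
 * Returns 0 on success, or -1 if the separator is missing or an
 * output buffer is too small to hold its part. */
int split_device_name(const char *name,
                      char *board, size_t board_size,
                      char *desc, size_t desc_size)
{
    const char *sep = strstr(name, ": ");
    if (sep == NULL)
        return -1;

    size_t board_len = (size_t)(sep - name);
    size_t desc_len = strlen(sep + 2);
    if (board_len >= board_size || desc_len >= desc_size)
        return -1;

    memcpy(board, name, board_len);
    board[board_len] = '\0';
    memcpy(desc, sep + 2, desc_len + 1);  /* also copies the terminating NUL */
    return 0;
}
```

A host application could pass the buffer that clGetDeviceInfo fills for CL_DEVICE_NAME as the name argument, then compare the board part against the designation it wants to target.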

Loading Kernels for Multiple FPGA Devices

If your system contains multiple FPGA devices, you can create specific cl_program objects for each FPGA and load them into the OpenCL™ runtime.

The following host code defines a createMultiDeviceProgram helper function that calls clCreateProgramWithBinary to program multiple FPGA devices:

#include <stdio.h>
#include <stdlib.h>
#include <CL/opencl.h>

cl_program createMultiDeviceProgram(cl_context context,
                                    const cl_device_id *device_list,
                                    cl_uint num_devices,
                                    const char *aocx_name);

// Utility function for loading file into Binary String
//
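unsigned char* load_file(const char* filename, size_t *size_ret)
{
   FILE *fp = fopen(filename,"rb");
   if (!fp) return NULL;               // caller checks for NULL
   fseek(fp,0,SEEK_END);
   long len = ftell(fp);
   rewind(fp);
   unsigned char *result = (len > 0) ? (unsigned char*)malloc((size_t)len) : NULL;
   if (result && fread(result,(size_t)len,1,fp) != 1)
   {
      free(result);                    // short read: report failure
      result = NULL;
   }
   fclose(fp);
   if (result) *size_ret = (size_t)len;
   return result;
}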

//Create a Program that is compiled for the devices in the "device_list"
//
cl_program createMultiDeviceProgram(cl_context context, 
                                    const cl_device_id *device_list, 
                                    cl_uint num_devices,
                                    const char *aocx_name)
{
    printf("creating multi device program %s for %u devices\n",
           aocx_name, num_devices);
    const unsigned char **binaries =
       (const unsigned char**)malloc(num_devices*sizeof(unsigned char*));
    size_t *lengths=(size_t*)malloc(num_devices*sizeof(size_t));
    cl_int err;
    
    for(cl_uint i=0; i<num_devices; i++)
    {
       binaries[i] = load_file(aocx_name,&lengths[i]);
       if (!binaries[i])
       {
          printf("couldn't load %s\n", aocx_name);
          exit(-1);
       }
    }

    cl_program p = clCreateProgramWithBinary(context, 
                                             num_devices, 
                                             device_list,
                                             lengths,
                                             binaries,
                                             NULL,
                                             &err);
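    free(lengths);
    // Release each loaded binary as well as the pointer array
    for(cl_uint i=0; i<num_devices; i++)
       free((void*)binaries[i]);
    free((void*)binaries);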
    
    if (err != CL_SUCCESS)
    {
       printf("Program Create Error\n");
    }  
    return p;
}


// main program 

int main(void)
{
   // Normal OpenCL setup: create context and populate device_list,
   // num_devices, and options

   cl_program program = createMultiDeviceProgram(context,
                                                 device_list,
                                                 num_devices,
                                                 "program.aocx");
   clBuildProgram(program,num_devices,device_list,options,NULL,NULL);
   return 0;
}