Intel® FPGA SDK for OpenCL™ Standard Edition: Programming Guide

ID 683342
Date 4/22/2019
Public

12.3.1.2. Omit Communication Hardware between the Host and the Kernel

The autorun kernel attribute instructs the Intel® FPGA SDK for OpenCL™ Offline Compiler to omit logic that is used for communication between the host and the kernel. A kernel that uses the autorun attribute starts executing automatically before any kernel that the host launches explicitly. In addition, this kernel restarts automatically as soon as it finishes its execution.

The autorun kernel attribute notifies the offline compiler that the kernel runs on its own and will not be enqueued by any host.

To leverage the autorun attribute, a kernel must meet all of the following criteria:

  1. Does not use I/O channels
    Note: Kernel-to-kernel channels are supported.
  2. Does not have any arguments
  3. Has either the max_global_work_dim(0) attribute or the reqd_work_group_size(X,Y,Z) attribute
    Note: The parameters of the reqd_work_group_size(X,Y,Z) attribute must be divisors of 2^32.
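
The divisor requirement in the note can be checked with simple arithmetic, reading the bound as 2^32 (two to the power 32). Because 2^32 has no prime factor other than 2, its divisors are exactly the powers of two from 1 through 2^32. The following is a minimal sketch in plain host-side C. The function names divides_two_pow_32 and valid_autorun_work_group_size are hypothetical helpers for illustration only and are not part of the SDK.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of the SDK: returns true when dim divides
 * 2^32 evenly. The only divisors of 2^32 are the powers of two from
 * 1 (2^0) through 2^32, so the test reduces to a range check plus a
 * power-of-two check. */
bool divides_two_pow_32(uint64_t dim) {
    const uint64_t two_pow_32 = UINT64_C(1) << 32;
    if (dim == 0 || dim > two_pow_32) {
        return false;
    }
    /* A power of two has exactly one bit set, so clearing its lowest
     * set bit leaves zero. */
    return (dim & (dim - 1)) == 0;
}

/* Hypothetical helper: checks all three reqd_work_group_size parameters. */
bool valid_autorun_work_group_size(uint64_t x, uint64_t y, uint64_t z) {
    return divides_two_pow_32(x) && divides_two_pow_32(y) &&
           divides_two_pow_32(z);
}
```

For example, valid_autorun_work_group_size(8, 4, 1) returns true, while valid_autorun_work_group_size(8, 3, 1) returns false because 3 is not a power of two. The sketch checks only the arithmetic condition; the offline compiler remains the authority on which attribute values it accepts.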

Because kernels with the autorun attribute have no arguments and start executing without an explicit launch from the host, the offline compiler does not need to generate the logic for communication between the host and the kernel. Omitting this logic reduces logic utilization and allows the offline compiler to apply additional performance optimizations.

A typical use case for the autorun attribute is a kernel that reads data from one or more kernel-to-kernel channels, processes the data, and then writes the results to one or more channels. For the following kernel, the offline compiler generates hardware as illustrated in Figure 20 (Single Work-Item Kernel with No Interface Hardware).

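// Enable the channels extension before declaring any channels.
#pragma OPENCL EXTENSION cl_intel_channels : enable

channel int chan_in;
channel int chan_out;

__attribute__((max_global_work_dim(0)))
__attribute__((autorun))
__kernel void plusOne () {
    // Loop forever: block until a value arrives on chan_in,
    // then forward that value plus one to chan_out.
    while (1) {
        int data_in = read_channel_intel(chan_in);
        write_channel_intel(chan_out, data_in + 1);
    }
}

Figure 20. Single Work-Item Kernel with No Interface Hardware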
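
To reason about the values that flow through plusOne without compiling for hardware, the dataflow can be mimicked in plain C. The following is a rough software model only, not SDK code. The type fifo_t and the functions fifo_push, fifo_pop, plus_one_step, and run_pipeline are hypothetical names introduced for illustration. The model assumes a bounded FIFO stands in for each channel and that a producer and a consumer (not shown in the original example) feed and drain the autorun kernel.

```c
#include <stddef.h>

#define FIFO_CAPACITY 16

/* Hypothetical software stand-in for a kernel-to-kernel channel. */
typedef struct {
    int data[FIFO_CAPACITY];
    size_t head;   /* index of the oldest element */
    size_t count;  /* number of stored elements */
} fifo_t;

/* Appends a value; returns 0 on success, -1 if the FIFO is full. */
int fifo_push(fifo_t *f, int value) {
    if (f->count == FIFO_CAPACITY) return -1;
    f->data[(f->head + f->count) % FIFO_CAPACITY] = value;
    f->count++;
    return 0;
}

/* Removes the oldest value; returns 0 on success, -1 if the FIFO is empty. */
int fifo_pop(fifo_t *f, int *value) {
    if (f->count == 0) return -1;
    *value = f->data[f->head];
    f->head = (f->head + 1) % FIFO_CAPACITY;
    f->count--;
    return 0;
}

/* Models one iteration of the plusOne loop body. Returns 0 if a value
 * moved from chan_in to chan_out, or -1 where the hardware would stall
 * (no input available, or no room on the output). */
int plus_one_step(fifo_t *chan_in, fifo_t *chan_out) {
    int data_in;
    if (chan_in->count == 0 || chan_out->count == FIFO_CAPACITY) return -1;
    fifo_pop(chan_in, &data_in);
    fifo_push(chan_out, data_in + 1);
    return 0;
}

/* Interleaves a producer, the plusOne model, and a consumer until n
 * results have been collected in dst. Returns the number of results. */
size_t run_pipeline(const int *src, int *dst, size_t n) {
    fifo_t chan_in = {0};
    fifo_t chan_out = {0};
    size_t produced = 0, consumed = 0;
    while (consumed < n) {
        int v;
        if (produced < n && fifo_push(&chan_in, src[produced]) == 0) produced++;
        plus_one_step(&chan_in, &chan_out);
        if (fifo_pop(&chan_out, &v) == 0) dst[consumed++] = v;
    }
    return consumed;
}
```

Calling run_pipeline with the inputs 1, 2, 3, 41 fills dst with 2, 3, 4, 42. On hardware the three stages run concurrently and the autorun kernel never terminates; the sequential loop here only approximates that behavior for checking expected values.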