Developer Guide

FPGA Optimization Guide for Intel® oneAPI Toolkits

ID 767853
Date 7/13/2023
Public

Omit Hardware that Generates and Dispatches Kernel IDs

The [[intel::max_global_work_dim(0)]] kernel attribute instructs the Intel® oneAPI DPC++/C++ Compiler to omit logic that generates and dispatches global, local, and group IDs into the compiled kernel.

Semantically, the [[intel::max_global_work_dim(0)]] kernel attribute specifies that the global work dimension of the kernel is zero. Setting this kernel attribute means that the kernel does not use any global, local, or group IDs. The presence of this attribute in the kernel code serves as a guarantee to the compiler that the kernel is a single work-item kernel.

When compiling the following kernel, the compiler generates interface hardware as illustrated in Figure 1:

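// Submit the kernel as a single task; one work-item performs all iterations.
cgh.single_task<class kernelComputeAsTask>(
  [=]()
    // A global work dimension of zero lets the compiler drop ID generation and dispatch logic.
    [[intel::max_global_work_dim(0)]] {
      for (unsigned i = 0; i < SIZE; i++) {
        accessorRes[i] = accessorIdx[i] * 2;
      }
    });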

NOTE:

A kernel that carries the [[intel::max_global_work_dim(0)]] attribute must be submitted as a task with single_task, not with parallel_for.

Figure 1. Compiler-generated Interface Hardware for a Kernel with the [[intel::max_global_work_dim(0)]] Attribute

If your current kernel implementation has multiple work-items but does not use global, local, or group IDs, you can still apply the [[intel::max_global_work_dim(0)]] kernel attribute by modifying the kernel code as follows:

  1. Wrap the kernel body in a for loop that iterates as many times as the number of work-items.
  2. Use cgh.single_task<kernelName> to invoke the device code.
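
The two steps above can be sketched as follows. This is an illustrative example, not code from this guide. It assumes a multiple work-item kernel whose work-items only read a value from one pipe, double it, and write it to another pipe, so no global, local, or group IDs are involved. The names kNumWorkItems, work_item_body, single_task_body, submit_before, submit_after, HostPipe, and run_host_model are hypothetical, as is the preprocessor guard. The guard and the host-only pipe stand-in exist only so that the sketch also builds without a SYCL toolchain. The SYCL portion assumes the Intel FPGA pipe extension header is available.

```cpp
#include <cassert>
#include <queue>
#include <vector>

// Hypothetical guard: compile the SYCL part only when a SYCL compiler defines
// SYCL_LANGUAGE_VERSION and the Intel FPGA extension header can be found.
#if defined(SYCL_LANGUAGE_VERSION) && __has_include(<sycl/ext/intel/fpga_extensions.hpp>)
#include <sycl/sycl.hpp>
#include <sycl/ext/intel/fpga_extensions.hpp>
#define SKETCH_HAS_SYCL 1
#endif

constexpr int kNumWorkItems = 8;  // assumed work-item count of the original kernel

// Original per-work-item body: it uses no global, local, or group IDs.
template <typename In, typename Out>
void work_item_body() {
  Out::write(In::read() * 2);
}

// Step 1: wrap the body in a loop with one iteration per former work-item.
template <typename In, typename Out>
void single_task_body() {
  for (int i = 0; i < kNumWorkItems; i++) {
    work_item_body<In, Out>();
  }
}

#ifdef SKETCH_HAS_SYCL
using InPipe = sycl::ext::intel::pipe<class InPipeId, int, kNumWorkItems>;
using OutPipe = sycl::ext::intel::pipe<class OutPipeId, int, kNumWorkItems>;
class kernelAsNDRange;
class kernelAsTask;

// Before: multiple work-items, each running the body once.
void submit_before(sycl::queue& q) {
  q.submit([&](sycl::handler& cgh) {
    cgh.parallel_for<kernelAsNDRange>(
        sycl::range<1>(kNumWorkItems),
        [=](sycl::id<1>) { work_item_body<InPipe, OutPipe>(); });
  });
}

// After (step 2): a single task that carries the attribute and runs the loop.
void submit_after(sycl::queue& q) {
  q.submit([&](sycl::handler& cgh) {
    cgh.single_task<kernelAsTask>(
        [=]() [[intel::max_global_work_dim(0)]] {
          single_task_body<InPipe, OutPipe>();
        });
  });
}
#endif

// Host-only stand-in for a pipe (an assumption for illustration, not an Intel API).
template <int Tag>
struct HostPipe {
  static std::queue<int>& fifo() {
    static std::queue<int> q;
    return q;
  }
  static int read() {
    assert(!fifo().empty());
    int v = fifo().front();
    fifo().pop();
    return v;
  }
  static void write(int v) { fifo().push(v); }
};

// Runs either form sequentially on the host and returns what reached the output pipe.
std::vector<int> run_host_model(bool as_single_task) {
  using In = HostPipe<0>;
  using Out = HostPipe<1>;
  for (int i = 0; i < kNumWorkItems; i++) In::write(i);
  if (as_single_task) {
    single_task_body<In, Out>();
  } else {
    // Models the multiple work-item form: the body runs once per work-item.
    for (int wi = 0; wi < kNumWorkItems; wi++) work_item_body<In, Out>();
  }
  std::vector<int> result;
  for (int i = 0; i < kNumWorkItems; i++) result.push_back(Out::read());
  return result;
}
```

On the host model, run_host_model(false) and run_host_model(true) produce the same sequence, which is the property the conversion relies on: because the body never consults an ID, running it once per work-item and running it once per loop iteration are interchangeable. Only the single_task form can carry the [[intel::max_global_work_dim(0)]] attribute.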