Strategies for Inferring the Accumulator

Developer Guide

Intel oneAPI FPGA Handbook

ID 785441

Date 2/07/2024

To take advantage of the single-cycle floating-point accumulator feature, you can modify the accumulator description in your kernel code to improve efficiency or to work around programming restrictions.

Describe an Accumulator Using Multiple Loops

Consider a case where you want to describe an accumulator using multiple loops, with some of the loops being unrolled:


float acc = 0.0f;
for (int i = 0; i < k; i++) {
  // Fully unroll the 16 multiply-adds of each outer iteration.
  #pragma unroll
  for (int j = 0; j < 16; j++)
    acc += (x[i+j]*y[i+j]);
}

Because fast math is enabled by default, the Intel® oneAPI DPC++/C++ Compiler automatically rearranges the operations in a way that exposes the accumulation.
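
As a rough illustration of why this rearrangement depends on fast math: floating-point addition is not associative, so regrouping a chain of additions can change the rounded result. The sketch below is plain host C++, not FPGA kernel code, and the helper names `sum_left_first` and `sum_right_first` are hypothetical.

```cpp
// Hypothetical helpers for illustration; plain host C++, not kernel code.
// Both functions add the same three values, but group them differently.
float sum_left_first(float a, float b, float c) {
  return (a + b) + c;  // a and b are combined first
}

float sum_right_first(float a, float b, float c) {
  return a + (b + c);  // b and c are combined first
}
```

Assuming IEEE 754 single precision and a host compiler running without fast-math options, `sum_left_first(1e8f, -1e8f, 1.0f)` gives 1.0f, while `sum_right_first(1e8f, -1e8f, 1.0f)` gives 0.0f, because the 1.0f is rounded away when it is added to -1e8f. A compiler that must preserve exact results therefore cannot regroup additions on its own.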

Modify a Multi-Loop Accumulator Description

If you want an accumulator to be inferred even when using -fp-model=precise, rewrite your code so that it exposes the accumulation explicitly.

Rewrite the preceding multi-loop example so that each outer iteration computes a partial sum that is then added to the accumulator:


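float acc = 0.0f;
for (int i = 0; i < k; i++) {
  // Partial sum for this outer iteration; restarts at zero each time.
  float my_dot = 0.0f;
  #pragma unroll
  for (int j = 0; j < 16; j++)
    my_dot += (x[i+j]*y[i+j]);
  // Single accumulation per outer iteration.
  acc += my_dot;
}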
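
To compare the two descriptions on the host, a minimal sketch follows. It is plain C++ rather than kernel code, and the function names `dot_single_chain` and `dot_partial_sums` are hypothetical. Because the inner loop reads `x[i+j]` and `y[i+j]`, both inputs are assumed to hold at least k + 15 elements.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Original description: every product is added directly to acc.
float dot_single_chain(const std::vector<float>& x,
                       const std::vector<float>& y, std::size_t k) {
  assert(k == 0 || (x.size() >= k + 15 && y.size() >= k + 15));
  float acc = 0.0f;
  for (std::size_t i = 0; i < k; i++)
    for (std::size_t j = 0; j < 16; j++)
      acc += x[i + j] * y[i + j];
  return acc;
}

// Rewritten description: a per-iteration partial sum feeds one
// accumulation into acc per outer iteration.
float dot_partial_sums(const std::vector<float>& x,
                       const std::vector<float>& y, std::size_t k) {
  assert(k == 0 || (x.size() >= k + 15 && y.size() >= k + 15));
  float acc = 0.0f;
  for (std::size_t i = 0; i < k; i++) {
    float my_dot = 0.0f;
    for (std::size_t j = 0; j < 16; j++)
      my_dot += x[i + j] * y[i + j];
    acc += my_dot;
  }
  return acc;
}
```

When every intermediate sum is exactly representable, the two functions return the same value. In general, the results may differ in the last bits because the additions are grouped differently, which is likely why the regrouping has to be written by hand under -fp-model=precise.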

Modify an Accumulator Description Containing a Variable or Non-Zero Initial Value

Consider a situation where the accumulator starts from a variable or non-zero initial value, such as an offset read from an array:


float acc = array[0];  // variable, possibly non-zero initial value
for (int i = 0; i < k; i++) {
  acc += x[i];
}

Because the accumulator hardware does not support variable or non-zero initial values in a description, you must rewrite the description so that the accumulator starts at zero:


float acc = 0.0f;  // the accumulator starts at zero
for (int i = 0; i < k; i++) {
  acc += x[i];
}
acc += array[0];  // apply the initial value after the loop

Rewriting the description in this manner enables the kernel to use an accumulator in the loop. The initial value array[0] is then added once, after the loop completes.
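
The effect of moving the initial value can be checked on the host with a small sketch. It is plain C++ rather than kernel code, and the function names `sum_seeded` and `sum_then_offset` are hypothetical stand-ins for the two descriptions.

```cpp
#include <cstddef>
#include <vector>

// Original description: the accumulator starts at a variable value.
float sum_seeded(const std::vector<float>& x, float initial) {
  float acc = initial;
  for (std::size_t i = 0; i < x.size(); i++)
    acc += x[i];
  return acc;
}

// Rewritten description: the accumulator starts at zero and the
// initial value is added once, after the loop.
float sum_then_offset(const std::vector<float>& x, float initial) {
  float acc = 0.0f;
  for (std::size_t i = 0; i < x.size(); i++)
    acc += x[i];
  acc += initial;
  return acc;
}
```

For inputs whose intermediate sums are exactly representable, both functions return the same value. With arbitrary data, the results may differ slightly because the initial value is added in a different position in the chain of floating-point additions.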

Parent topic: Single-Cycle Floating-Point Accumulator for Single Work-Item Kernels
