Single-Cycle Floating-Point Accumulator for Single Work-Item Kernels
Single work-item kernels that perform accumulation in a loop can leverage the compiler's single-cycle floating-point accumulator feature. The compiler searches for these kernel instances and attempts to map an accumulation that executes in a loop into the accumulator structure.
The compiler supports an accumulator that adds or subtracts a value. To leverage this feature, describe the accumulation in a way that allows the compiler to infer the accumulator. The accumulation must be part of a loop, must have an initial value of 0, and cannot be conditional.
The accumulator is available only on Intel® Arria® 10 devices.
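The constraints above (accumulation inside a loop, initial value of 0, no condition around the accumulation) can be illustrated with a minimal sketch. The kernel names `accumulate` and `accumulate_positive`, their arguments, and the qualifier stubs are hypothetical and not taken from the guide. The second kernel shows one possible way to keep the accumulation unconditional by selecting the addend instead of guarding the `+=`. Whether the compiler actually infers the accumulator for either form is an assumption to confirm in the compilation report.

```c
/* Assumption: these stubs only let the sketch build as plain C on a host.
   An OpenCL compiler defines __OPENCL_VERSION__ and skips them. */
#ifndef __OPENCL_VERSION__
#define __kernel
#define __global
#endif

/* Hypothetical single work-item kernel: sums n floats.
   The accumulation is in a loop, starts at 0, and is unconditional. */
__kernel void accumulate(__global const float *restrict a,
                         __global float *restrict result,
                         int n)
{
    float sum = 0.0f;               /* initial value of 0 */
    for (int i = 0; i < n; ++i) {
        sum += a[i];                /* unconditional add on every iteration */
    }
    *result = sum;
}

/* Hypothetical variant: sums only the positive elements.
   Writing "if (a[i] > 0.0f) sum += a[i];" would make the accumulation
   conditional. Here the condition selects the addend, and the add itself
   runs on every iteration. */
__kernel void accumulate_positive(__global const float *restrict a,
                                  __global float *restrict result,
                                  int n)
{
    float sum = 0.0f;               /* initial value of 0 */
    for (int i = 0; i < n; ++i) {
        float addend = (a[i] > 0.0f) ? a[i] : 0.0f;
        sum += addend;              /* still an unconditional accumulation */
    }
    *result = sum;
}
```

Both kernels keep the accumulator as a single local variable that is updated exactly once per iteration, which is the shape the requirements describe.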