Intel® FPGA SDK for OpenCL™ Standard Edition: Programming Guide

ID 683342
Date 4/22/2019
Public

5.2.1. Unrolling a Loop

Loop unrolling replicates a loop body multiple times and reduces the trip count of the loop accordingly. Unroll loops to reduce or eliminate loop control overhead on the FPGA. In cases where there are no loop-carried dependencies and the offline compiler can perform loop iterations in parallel, unrolling loops can also reduce latency and overhead on the FPGA.

The Intel® FPGA SDK for OpenCL™ Offline Compiler might unroll simple loops even if they are not annotated by a pragma.
To direct the offline compiler to unroll a loop, or explicitly not to unroll it, insert an unroll kernel pragma in the kernel code immediately before that loop.
Attention:
  • Provide an unroll factor whenever possible. To specify an unroll factor N, insert the #pragma unroll <N> directive before a loop in your kernel code.
    The offline compiler attempts to unroll the loop at most <N> times.
    Consider the code fragment below. An unroll factor of 2 directs the offline compiler to replicate the loop body twice, which reduces the trip count from 4 to 2.
    #pragma unroll 2
    for(size_t k = 0; k < 4; k++)
    {
       mac += data_in[(gid * 4) + k] * coeff[k];
    }
  • To unroll a loop fully, omit the unroll factor and insert the #pragma unroll directive before a loop in your kernel code.
    The offline compiler attempts to unroll the loop fully if it can determine the trip count. The offline compiler issues a warning if it cannot carry out the unroll request.
  • To prevent a loop from unrolling, specify an unroll factor of 1 (that is, #pragma unroll 1).
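The effect of the #pragma unroll 2 example above can be sketched by hand in plain C. This is only an illustration of the transformation, not the hardware that the offline compiler actually generates. The function names mac_rolled and mac_unrolled_by_2 are hypothetical, and the float element type is an assumption, because the original fragment does not declare its variables.

```c
#include <stddef.h>

/* Rolled form: four iterations, one multiply-accumulate per iteration.
 * Function name and float type are assumptions made for illustration. */
float mac_rolled(const float *data_in, const float *coeff, size_t gid)
{
    float mac = 0.0f;
    for (size_t k = 0; k < 4; k++) {
        mac += data_in[(gid * 4) + k] * coeff[k];
    }
    return mac;
}

/* Hand-unrolled by a factor of 2: the loop body appears twice and the
 * loop runs two iterations instead of four. This mirrors what an
 * unroll factor of 2 requests for the loop above. */
float mac_unrolled_by_2(const float *data_in, const float *coeff, size_t gid)
{
    float mac = 0.0f;
    for (size_t k = 0; k < 4; k += 2) {
        mac += data_in[(gid * 4) + k] * coeff[k];
        mac += data_in[(gid * 4) + k + 1] * coeff[k + 1];
    }
    return mac;
}
```

Both functions accumulate the same terms in the same order, so they return the same result. The unrolled form trades fewer loop-control steps for a larger loop body.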