Intel® High Level Synthesis Compiler Standard Edition: Best Practices Guide

ID 683259
Date 12/18/2019
Public
Document Table of Contents
Give Feedback

4.4. Minimize Loop-Carried Dependencies

Loop-carried dependencies occur when the code in a loop iteration depends on the output of previous loop iterations. Loop-carried dependencies in your component increase loop initiation interval (II), which reduces the performance of your component.

The loop structure below has a loop-carried dependency because each loop iteration reads data written by the previous iteration. As a result, each read operation cannot proceed until the write operation from the previous iteration completes. The presence of loop-carried dependencies reduces of pipeline parallelism that the Intel® HLS Compiler Standard Edition can achieve, which reduces component performance.

for(int i = 1; i < N; i++)
{
    A[i] = A[i - 1] + i;
}

The Intel® HLS Compiler performs a static memory dependency analysis on loops to determine the extent of parallelism that it can achieve. If the Intel® HLS Compiler cannot determine that there are no loop-carried dependencies, it assumes that loop-dependencies exist. The ability of the compiler to test for loop-carried dependencies is impeded by unknown variables at compilation time or if array accesses in your code involve complex addressing.

To avoid unnecessary loop-carried dependencies and help the compiler to better analyze your loops, follow these guidelines:

Avoid Pointer Arithmetic

Compiler output is suboptimal when your component accesses arrays by dereferencing pointer values derived from arithmetic operations. For example, avoid accessing an array as follows:

for(int i = 0; i < N; i++)
{
    int t = *(A++);
    *A = t;
}

Introduce Simple Array Indexes

Some types of complex array indexes cannot be analyzed effectively, which might lead to suboptimal compiler output. Avoid the following constructs as much as possible:
  • Nonconstants in array indexes.

    For example, A[K + i], where i is the loop index variable and K is an unknown variable.

  • Multiple index variables in the same subscript location.

    For example, A[i + 2 × j], where i and j are loop index variables for a double nested loop.

    The array index A[i][j] can be analyzed effectively because the index variables are in different subscripts.

  • Nonlinear indexing.

    For example, A[i & C], where i is a loop index variable and C is a nonconstant variable.

Use Loops with Constant Bounds Whenever Possible

The compiler can perform range analysis effectively when loops have constant bounds.

Ignore Loop-Carried Dependencies

If there are no implicit memory dependencies across loop iterations, you can use the ivdep pragma to tell the Intel® HLS Compiler Standard Edition to ignore the memory dependency.

For details about how to use the ivdep pragma, see Loop-Carried Dependencies (ivdep Pragma) in the Intel® High Level Synthesis Compiler Standard Edition Reference Manual.