Intel® High Level Synthesis Compiler Pro Edition: Best Practices Guide

ID 683152
Date 10/04/2021
Public

5.6. Convert Nested Loops into a Single Loop

To maximize performance, combine nested loops into a single loop whenever possible. The control flow of each loop adds overhead, both in the logic required and in FPGA hardware footprint. Combining nested loops into a single loop reduces this overhead, which can improve the performance of your component.

The following code examples illustrate the conversion of a nested loop into a single loop:

Nested loop:

for (i = 0; i < N; i++)
{
    //statements
    for (j = 0; j < M; j++)
    {
        //statements
    }
    //statements
}

Converted single loop:

for (i = 0; i < N*M; i++)
{
    //statements
}
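
The conversion above leaves open how the statements before and after the inner loop carry over. The following is a minimal sketch of one way to do it by hand, not a method taken from this guide. The names N, M, row_sums_nested, row_sums_single, and forms_match are hypothetical and chosen for illustration. The single-loop version carries explicit i and j counters and guards the former pre-loop and post-loop statements so that they still run once per outer iteration. Recovering i and j as k / M and k % M would also work, but division and modulo are typically costly in hardware, so the sketch avoids them.

```cpp
#include <array>
#include <cassert>

constexpr int N = 4;  // hypothetical outer trip count
constexpr int M = 3;  // hypothetical inner trip count
static_assert(M > 0, "the single-loop form below assumes a non-empty inner loop");

using RowSums = std::array<int, N>;

// Nested form: statements before, inside, and after the inner loop.
RowSums row_sums_nested() {
  RowSums sums{};
  for (int i = 0; i < N; i++) {
    int acc = 0;                      // runs before the inner loop
    for (int j = 0; j < M; j++) {
      acc += i * M + j;               // inner-loop body
    }
    sums[i] = acc;                    // runs after the inner loop
  }
  return sums;
}

// Single-loop form: one loop of N*M iterations. The counters i and j are
// carried explicitly, and the former pre/post statements are guarded so
// that they still execute once per value of i.
RowSums row_sums_single() {
  RowSums sums{};
  int i = 0, j = 0, acc = 0;
  for (int k = 0; k < N * M; k++) {
    if (j == 0) acc = 0;              // former pre-inner-loop statement
    acc += i * M + j;                 // former inner-loop body
    if (j == M - 1) sums[i] = acc;    // former post-inner-loop statement
    if (++j == M) {                   // advance (i, j) in row-major order
      j = 0;
      i++;
    }
  }
  assert(i == N && j == 0);           // every row was visited exactly once
  return sums;
}

// Returns true when both forms produce identical results.
bool forms_match() { return row_sums_nested() == row_sums_single(); }
```

Calling forms_match() from a testbench is one way to confirm that a hand-converted loop still computes what the nested version did before you synthesize it.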

You can also specify the loop_coalesce pragma to coalesce nested loops into a single loop without affecting the loop functionality. The following simple example shows how the compiler coalesces two loops into a single loop when you specify the loop_coalesce pragma.

Consider a simple nested loop written as follows:
#pragma loop_coalesce
for (int i = 0; i < N; i++)
    for (int j = 0; j < M; j++)
        sum[i][j] += i+j;
The compiler coalesces the two loops together so that they run as if they were a single loop written as follows:
int i = 0;
int j = 0;
while (i < N) {
    sum[i][j] += i+j;
    j++;

    if (j == M) {
        j = 0;
        i++;
    }
}
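
To check the equivalence described above on a host CPU, the two forms can be run side by side. This is a minimal sketch under stated assumptions: the sizes N and M, the Matrix alias, and the function names are hypothetical, and the pragma appears only in a comment because a standard C++ compiler does not act on it. The while loop is the hand-written illustration from this section, not the hardware the compiler actually generates.

```cpp
#include <array>
#include <cassert>

constexpr int N = 3;  // hypothetical outer trip count
constexpr int M = 4;  // hypothetical inner trip count
// The hand-written while loop executes its body at least once whenever
// N > 0, so it only matches the nested loops when M is positive.
static_assert(M > 0, "hand-written coalesced form assumes M > 0");

using Matrix = std::array<std::array<int, M>, N>;

// Nested form. In an HLS component, "#pragma loop_coalesce" would be
// placed immediately before the outer loop.
Matrix accumulate_nested(Matrix sum) {
  for (int i = 0; i < N; i++)
    for (int j = 0; j < M; j++)
      sum[i][j] += i + j;
  return sum;
}

// Hand-written single-loop form that mirrors the while loop above.
Matrix accumulate_coalesced(Matrix sum) {
  int i = 0;
  int j = 0;
  while (i < N) {
    sum[i][j] += i + j;
    j++;
    if (j == M) {  // end of a row: wrap j and advance i
      j = 0;
      i++;
    }
  }
  return sum;
}

// Runs both forms on the same non-trivial input and compares the results.
bool coalesced_matches_nested() {
  Matrix seed{};
  for (int i = 0; i < N; i++)
    for (int j = 0; j < M; j++)
      seed[i][j] = 100 * i + j;  // arbitrary starting values
  return accumulate_nested(seed) == accumulate_coalesced(seed);
}
```

The static_assert records a limitation of the hand-written form only: if M were zero, the nested loops would do nothing, while the while loop would still enter its body.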

For more information about the loop_coalesce pragma, see "Loop Coalescing (loop_coalesce Pragma)" in the Intel® High Level Synthesis Compiler Pro Edition Reference Manual.

You can also review the following tutorial: <quartus_installdir>/hls/examples/tutorials/best_practices/loop_coalesce