Intel® High Level Synthesis Compiler Standard Edition: Best Practices Guide

ID 683259
Date 12/18/2019
Public

4. Loop Best Practices

The Intel® High Level Synthesis Compiler pipelines your loops to enhance throughput. Review these best practices to learn techniques for optimizing your loops and improving the performance of your component.

The reports generated by the Intel® HLS Compiler Standard Edition let you know if there are any dependencies that prevent it from optimizing your loops. Try to eliminate these dependencies in your code for optimal component performance. You can also provide additional guidance to the compiler by using the available loop pragmas.

As a start, try the following techniques:
  • Manually fuse adjacent loop bodies when the instructions in those loop bodies can be performed in parallel. These fused loops can be pipelined instead of being executed sequentially. Pipelining reduces the latency of your component and can reduce the FPGA area your component uses.
  • Use the #pragma loop_coalesce directive to have the compiler attempt to collapse nested loops. Coalescing loops reduces the latency of your component and can reduce the FPGA area overhead needed for nested loops.
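The two techniques above can be sketched as follows. This is a minimal illustration, not code from the compiler documentation: the function names, array sizes, and arithmetic are hypothetical. The functions are written as plain C++ so that they also compile with a standard compiler, which ignores the unrecognized loop_coalesce pragma (possibly with a warning). In an HLS design you would typically mark the top-level function as a component.

```cpp
#include <cstddef>

// Hypothetical sizes chosen only for illustration.
constexpr std::size_t N = 8;
constexpr std::size_t ROWS = 4;
constexpr std::size_t COLS = 4;

// Before: two adjacent loops whose bodies do not depend on each other.
// Written this way, the second loop starts only after the first finishes.
void scale_and_offset_separate(const int in[N], int scaled[N], int shifted[N]) {
  for (std::size_t i = 0; i < N; ++i) {
    scaled[i] = in[i] * 3;
  }
  for (std::size_t i = 0; i < N; ++i) {
    shifted[i] = in[i] + 7;
  }
}

// After: the loops are fused by hand. Both statements now sit in one loop
// body, so a single pipelined loop can perform them in parallel.
void scale_and_offset_fused(const int in[N], int scaled[N], int shifted[N]) {
  for (std::size_t i = 0; i < N; ++i) {
    scaled[i] = in[i] * 3;
    shifted[i] = in[i] + 7;
  }
}

// Nested loops with a coalescing request. The pragma is placed directly
// before the outer loop and asks the compiler to attempt to collapse the
// nest into a single loop. The computed result is the same either way.
int sum_matrix(const int m[ROWS][COLS]) {
  int total = 0;
#pragma loop_coalesce
  for (std::size_t r = 0; r < ROWS; ++r) {
    for (std::size_t c = 0; c < COLS; ++c) {
      total += m[r][c];
    }
  }
  return total;
}
```

Fusing is only safe when the second loop does not read values that the first loop produces in a later iteration. Check the loop analysis in the compiler reports to confirm whether the fused or coalesced form was pipelined as you expected.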

Tutorials Demonstrating Loop Best Practices

The Intel® HLS Compiler Standard Edition comes with a number of tutorials that give you working examples to review and run. These examples demonstrate good coding practices and important concepts.

You can find these tutorials in the following location on your Intel® Quartus® Prime system:
<quartus_installdir>/hls/examples/tutorials

Review the following tutorials to learn about loop best practices that might apply to your design:

Tutorial: best_practices/loop_memory_dependency
Description: Demonstrates breaking loop-carried dependencies using the ivdep pragma.

Tutorial: best_practices/resource_sharing_filter
Description: Demonstrates the following versions of a 32-tap finite impulse response (FIR) filter design:
  • optimized-for-throughput variant
  • optimized-for-area variant
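
As a rough idea of what breaking a loop-carried memory dependency with the ivdep pragma looks like, consider the sketch below. It is not the code from the loop_memory_dependency tutorial. The function name, its parameters, and the precondition are assumptions made for illustration. The function is plain C++, so a standard compiler ignores the pragma (possibly with a warning).

```cpp
#include <cstddef>

// Hypothetical example: increment table entries selected by an index list.
//
// Assumed precondition: every value in idx[0..n-1] is distinct and within
// the bounds of table. The compiler cannot prove this from the code, so it
// must assume that two iterations might touch the same table entry, which
// is a loop-carried memory dependency.
void scatter_increment(int *table, const std::size_t *idx, std::size_t n) {
  // The pragma asserts that no iteration depends on memory written by an
  // earlier iteration. That assertion is only valid because of the
  // precondition above. If indices could repeat, using it would be wrong.
#pragma ivdep
  for (std::size_t i = 0; i < n; ++i) {
    table[idx[i]] += 1;
  }
}
```

For example, with table = {10, 20, 30, 40} and idx = {3, 0, 2, 1}, each entry is visited exactly once and the table becomes {11, 21, 31, 41}. The pragma shifts responsibility for correctness to you, so apply it only when you can guarantee that the dependency does not exist.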