4. Loop Best Practices
The reports generated by the Intel® HLS Compiler Standard Edition identify any dependencies that prevent the compiler from optimizing your loops. Try to eliminate these dependencies in your code for optimal component performance. You can also give the compiler additional guidance by using the available loop pragmas.
- Manually fuse adjacent loop bodies when the instructions in those loop bodies can be performed in parallel. These fused loops can be pipelined instead of being executed sequentially. Pipelining reduces the latency of your component and can reduce the FPGA area your component uses.
- Use the #pragma loop_coalesce directive to have the compiler attempt to collapse nested loops. Coalescing loops reduces the latency of your component and can reduce the FPGA area overhead needed for nested loops.
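The two practices above can be sketched as follows. This is a minimal illustration rather than code from the compiler's tutorials: the function names, array sizes, and arithmetic are hypothetical, and the `component` keyword and the `HLS/hls.h` header are left out so that the sketch also builds with an ordinary C++ compiler, which ignores the unrecognized pragma (possibly with a warning).

```cpp
#include <cstddef>

constexpr std::size_t N = 64;    // hypothetical array length
constexpr std::size_t ROWS = 8;  // hypothetical matrix dimensions
constexpr std::size_t COLS = 8;

// Before fusion: two adjacent loops whose bodies do not depend on
// each other. Written this way, the loops execute one after the other.
void scale_and_offset_unfused(const int *in, int *scaled, int *shifted) {
  for (std::size_t i = 0; i < N; ++i) {
    scaled[i] = in[i] * 3;
  }
  for (std::size_t i = 0; i < N; ++i) {
    shifted[i] = in[i] + 7;
  }
}

// After manual fusion: both bodies share a single loop, so the
// independent operations can be scheduled in the same pipelined loop.
void scale_and_offset_fused(const int *in, int *scaled, int *shifted) {
  for (std::size_t i = 0; i < N; ++i) {
    scaled[i] = in[i] * 3;
    shifted[i] = in[i] + 7;
  }
}

// Nested loops marked for coalescing. The pragma is placed immediately
// before the outermost loop and asks the compiler to attempt to collapse
// the nest into a single loop.
int sum_matrix(const int m[ROWS][COLS]) {
  int total = 0;
#pragma loop_coalesce
  for (std::size_t r = 0; r < ROWS; ++r) {
    for (std::size_t c = 0; c < COLS; ++c) {
      total += m[r][c];
    }
  }
  return total;
}
```

The fused and unfused versions produce identical results; only the loop structure seen by the compiler differs. Whether fusing or coalescing actually improves latency or area for a given design is best confirmed in the compiler's reports.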
Tutorials Demonstrating Loop Best Practices
The Intel® HLS Compiler Standard Edition comes with a number of tutorials that provide working examples for you to review and run. These examples illustrate good coding practices and demonstrate important concepts.
|You can find these tutorials in the following location on your Intel® Quartus® Prime system:
|Demonstrates breaking loop-carried dependencies using the ivdep pragma.
|Demonstrates the following versions of a 32-tap finite impulse response (FIR) filter design: