Developer Guide
Intel® oneAPI DPC++/C++ Compiler Handbook for FPGAs
Loops
The Intel® oneAPI DPC++/C++ Compiler attempts to maximize the occupancy of a loop's datapath within a task kernel by executing the loop's iterations in a pipeline-parallel manner. The following sections provide guidelines and describe techniques for writing loops in task kernels so that the compiler can extract as much pipeline parallelism from them as possible.
- Refactor the Loop-Carried Data Dependency
- Relax Loop-Carried Dependency
- Transfer Loop-Carried Dependency to Local Memory
- Minimize the Memory Dependencies for Loop Pipelining
- Unroll Loops
- Fuse Loops to Reduce Overhead and Improve Performance
- Optimize Loops With Loop Speculation
- Remove Loop Bottlenecks
- Improve fMAX/II with Shannonization
- Optimize Inner Loop Throughput
- Improve Loop Performance by Caching Data in On-Chip Memory
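To make the idea of a loop-carried dependency concrete, the following is a minimal sketch in plain host C++ (no SYCL kernel or FPGA-specific attributes). It is an illustration under stated assumptions, not the handbook's own example: the function names `sum_serial` and `sum_relaxed` and the constant `kCopies` are hypothetical. The first loop carries a dependency on `sum` from every iteration to the next. The second spreads the accumulation over several partial sums, so each partial sum is only updated every `kCopies` iterations, which lengthens the dependence distance that a pipelining compiler has to respect.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Baseline: each iteration reads the value of `sum` written by the
// previous iteration, so the dependence distance is one iteration.
float sum_serial(const std::vector<float>& data) {
  float sum = 0.0f;
  for (std::size_t i = 0; i < data.size(); ++i) {
    sum += data[i];
  }
  return sum;
}

// Hypothetical number of partial sums; chosen only for illustration.
constexpr std::size_t kCopies = 8;

// Relaxed: iteration i updates partial[i % kCopies], so any given
// partial sum is read and written only once every kCopies iterations.
// Note: this reassociates the additions, so floating-point rounding
// can differ from sum_serial in general.
float sum_relaxed(const std::vector<float>& data) {
  std::array<float, kCopies> partial{};  // value-initialized to 0.0f
  for (std::size_t i = 0; i < data.size(); ++i) {
    partial[i % kCopies] += data[i];
  }
  // Final reduction over the small, fixed-size array of partial sums.
  float sum = 0.0f;
  for (float p : partial) {
    sum += p;
  }
  return sum;
}
```

Both functions return the same result whenever the additions are exact (for example, small integer-valued inputs). Whether a restructuring like this actually improves the pipelining of a given kernel depends on the target device and the compiler's reports, so treat it as a starting point rather than a guaranteed optimization.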