Pipelining Loops in Non-task Kernels (-Xsauto-pipeline)
To direct the Intel® oneAPI DPC++/C++ Compiler to compile your design and pipeline loops in non-task (parallel_for) kernels, include the -Xsauto-pipeline option in your dpcpp command. The host program invokes non-task kernels through the kernel execution function parallel_for, parallel_for_work_item, or parallel_for_work_group.
Example
dpcpp -fintelfpga -Xshardware -Xsauto-pipeline <source_file>.cpp
With the -Xsauto-pipeline option, the compiler attempts to pipeline the loops in your design, but the pipelining is not guaranteed. If you do not include the -Xsauto-pipeline option, the compiler does not pipeline the loops in parallel_for kernels. However, it executes different work items in parallel.
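As a rough mental model of what pipelining a parallel_for kernel means, the host-only C++ sketch below contrasts two ways of running the same per-index work: once per work item, and as a single loop over the index range. It uses no SYCL and is not the compiler's actual transformation. The names kernel_body, run_as_work_items, and run_as_pipelined_loop are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical kernel body: the work one work item performs for index i.
inline void kernel_body(std::size_t i, const std::vector<int>& in,
                        std::vector<int>& out) {
    out[i] = in[i] * 2 + 1;
}

// parallel_for view: the body is invoked once per work item, and the
// order of work items is not specified. Reverse order is used here only
// to stress that the result must not depend on ordering.
inline void run_as_work_items(const std::vector<int>& in,
                              std::vector<int>& out) {
    for (std::size_t n = in.size(); n > 0; --n) {
        kernel_body(n - 1, in, out);
    }
}

// Pipelined view: a single task containing one loop over the index
// range. On hardware, successive iterations of such a loop can overlap.
inline void run_as_pipelined_loop(const std::vector<int>& in,
                                  std::vector<int>& out) {
    for (std::size_t i = 0; i < in.size(); ++i) {
        kernel_body(i, in, out);
    }
}
```

Because each index writes only its own output element, both functions produce the same result for the same input. That independence between work items is what makes the two forms interchangeable in this sketch.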
The -Xsauto-pipeline option might improve or degrade performance depending on the memory access pattern in your design.
- If auto-pipelining is successful, the Loop Analysis report displays the messages "Auto-pipelined parallel_for" and "parallel_for rewritten as a pipelined single_task" (Details pane). The compiler-generated loops appear marked as "Compiler generated auto-pipeline loop" in the report.
- If the compiler chooses not to auto-pipeline the loops, the Loop Analysis report displays a message for the kernel. The reason for not auto-pipelining a loop can be one of the following:
  - A barrier in the function is not at the top-level function scope.
  - The kernel uses local or private memory.
  - The kernel uses volatile or atomic memory, or channels.
If you do not want the compiler to pipeline some infrequently used loops while allowing other loops to be auto-pipelined, use the [[intel::disable_loop_pipelining]] loop directive on specific loops when using the -Xsauto-pipeline option. This loop directive disables pipelining for the loop it annotates.
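The placement of the directive can be sketched as follows. The attribute goes directly before the loop statement it applies to. Here the body of a hypothetical parallel_for kernel is written as a plain function (work_item_body, an illustrative name) so that the snippet also compiles on a host compiler. The DISABLE_LOOP_PIPELINING macro is likewise an assumption for portability: it expands to the attribute only when __INTEL_LLVM_COMPILER is defined, which is assumed here to identify the Intel compiler. Adjust the guard for your toolchain.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical portability guard: emit the Intel loop attribute only when
// building with the Intel compiler (assumed to define this macro).
#if defined(__INTEL_LLVM_COMPILER)
#define DISABLE_LOOP_PIPELINING [[intel::disable_loop_pipelining]]
#else
#define DISABLE_LOOP_PIPELINING
#endif

// Hypothetical per-work-item body of a parallel_for kernel.
inline int work_item_body(const int* row, std::size_t n, bool needs_fixup) {
    int sum = 0;

    // Frequently executed loop: left unannotated.
    for (std::size_t k = 0; k < n; ++k) {
        sum += row[k];
    }

    if (needs_fixup) {
        // Infrequently executed loop: opt out of pipelining.
        DISABLE_LOOP_PIPELINING
        for (std::size_t k = 0; k < n; ++k) {
            if (row[k] < 0) {
                sum -= row[k];  // remove the contribution of negative entries
            }
        }
    }
    return sum;
}
```

In a real design the same loops would sit inside the kernel lambda passed to parallel_for, with the attribute written directly rather than through a macro.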