Minimize the Memory Dependencies for Loop Pipelining
- Ensure that the Intel® oneAPI DPC++/C++ Compiler does not assume false dependencies.
- When the static memory dependence analysis fails to prove that a dependency does not exist, the Intel® oneAPI DPC++/C++ Compiler assumes that a dependency exists and modifies the kernel execution to enforce it. The impact of the dependency enforcement is lower if the memory system is stall-free.
- Write-after-read operations with data dependency on a load-store unit can take as few as two clock cycles (initiation interval II=2). Other stall-free scenarios can take up to seven clock cycles.
- The Intel® oneAPI DPC++/C++ Compiler can fully resolve the read-after-write (control dependency) operation.
- Override the static memory dependence analysis by adding the line [[intel::ivdep]] before the loop in your kernel code if you are sure that the loop carries no dependencies. For more information, refer to ivdep Attribute.
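
The override described above can be sketched roughly as follows. This is a minimal illustration, not code from the compiler documentation: the function name shift_scale and the macro LOOP_IVDEP are hypothetical, and the loop is written as a plain C++ function so that it compiles on a host; in a real design it would sit inside a kernel. The sketch assumes the caller guarantees offset >= n. A static analysis generally cannot establish that guarantee, so the compiler may have to assume that a later iteration reads an element written by an earlier one.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical portability shim (an assumption of this sketch): apply the
// attribute only when building with the Intel oneAPI DPC++/C++ Compiler, so
// the same source still compiles cleanly with other compilers.
#if defined(__INTEL_LLVM_COMPILER)
#define LOOP_IVDEP [[intel::ivdep]]
#else
#define LOOP_IVDEP
#endif

// Hypothetical kernel body: reads data[0 .. n) and writes
// data[offset .. offset + n). If 0 < offset < n, iteration i + offset would
// read the element that iteration i wrote, which is a real loop-carried
// dependency. With offset >= n the two ranges never overlap.
void shift_scale(float* data, std::size_t n, std::size_t offset) {
  // Precondition that makes the override safe. The attribute is a promise
  // from the programmer; the compiler does not check it.
  assert(offset >= n);

  LOOP_IVDEP
  for (std::size_t i = 0; i < n; ++i) {
    data[i + offset] = data[i] * 2.0f;
  }
}
```

If the precondition does not hold, leave the attribute off: asserting independence for a loop that really carries a memory dependency can produce incorrect results.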