- Disable Burst-Interleaving of Global Memory (<span class='codeph'>-Xsno-interleaving=<global_memory_type></span>)
- Disable Hardware Kernel Invocation Queue (<span class='codeph'>-Xsno-hardware-kernel-invocation-queue</span>)
- Modify the Handshaking Protocol Between Clusters (<span class='codeph'>-Xshyper-optimized-handshaking</span>)
- Fuse Adjacent Loops With Unequal Trip Counts (<span class='codeph'>-Xsenable-unequal-tc-fusion</span>)
- Control Semantics of Floating-Point Operations (<span class='codeph'>-fp-model=<var><value></var></span>)
- Modify the Rounding Mode of Floating-point Operations (<span class='codeph'>-Xsrounding=<rounding_type></span>)
- Global Control of Exit FIFO Latency of Stall-free Clusters (<span class='codeph'>-Xssfc-exit-fifo-type=<var><value></var></span>)
- Enable the Read-Only Cache for Read-Only Accessors (<span class='codeph'>-Xsread-only-cache-size=<var><N></var></span>)
- Omit Hardware to Support the <span class='codeph'>no_global_work_offset</span> Attribute in <span class='codeph'>parallel_for</span> Kernels
Global Memory Accesses Optimization
The Intel® oneAPI DPC++/C++ Compiler uses SDRAM as global memory. By default, the compiler configures global memory in a burst-interleaved configuration, which interleaves the global memory address space across all of the external memory banks.
In most circumstances, the default burst-interleaved configuration leads to the best load balancing between memory banks. However, in some cases, you might want to partition the banks manually as two non-interleaved (and contiguous) memory regions to achieve better load balancing.
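The two mappings described above can be sketched in plain host-side C++ as a simple address-to-bank model. This is an illustrative sketch only, not the compiler's actual implementation: the bank count, burst size, and bank capacity (`kNumBanks`, `kBurstBytes`, `kBankBytes`) are assumed values, and the function names are hypothetical. Real values depend on the target board.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Assumed parameters for illustration only; real values depend on the board.
constexpr std::size_t   kNumBanks   = 2;           // assumed number of external memory banks
constexpr std::uint64_t kBurstBytes = 1024;        // assumed interleaving granularity in bytes
constexpr std::uint64_t kBankBytes  = 1ULL << 30;  // assumed capacity of each bank (1 GiB)

// Burst-interleaved model: consecutive bursts alternate between banks.
constexpr std::uint64_t interleaved_bank(std::uint64_t addr) {
  return (addr / kBurstBytes) % kNumBanks;
}

// Non-interleaved model: each bank owns one contiguous address region.
// Valid only for addr < kNumBanks * kBankBytes.
constexpr std::uint64_t contiguous_bank(std::uint64_t addr) {
  return addr / kBankBytes;
}

using BankFn = std::uint64_t (*)(std::uint64_t);

// Counts how many bursts of the buffer [base, base + bytes) land in each bank
// under the given mapping. Assumes base is aligned to kBurstBytes.
inline std::array<std::uint64_t, kNumBanks> bursts_per_bank(std::uint64_t base,
                                                            std::uint64_t bytes,
                                                            BankFn bank_of) {
  std::array<std::uint64_t, kNumBanks> load{};
  for (std::uint64_t addr = base; addr < base + bytes; addr += kBurstBytes) {
    ++load[bank_of(addr)];
  }
  return load;
}
```

Under these assumptions, a single 8 KiB buffer at address 0 spreads evenly over both banks in the interleaved model, but sits entirely in the first bank in the non-interleaved model. With a non-interleaved layout, load balancing therefore depends on where each buffer is placed, which is why manual partitioning helps only in specific cases.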
The following figure illustrates the difference in memory mapping patterns between burst-interleaved and non-interleaved memory partitions:
- Global Memory Bandwidth Use Calculation
- Manual Partition of Global Memory
- Partitioning Buffers Across Different Memory Types (Heterogeneous Memory)
- Partitioning Buffers Across Memory Channels of the Same Memory Type
- Ignoring Dependencies Between Accessor Arguments
- Contiguous Memory Accesses
- Static Memory Coalescing