Wave Intrinsics

Xᵉ-LP supports wave intrinsics for both 3D and compute workloads. They can be used to write more efficient register-based reductions and to reduce reliance on global or local memory for cross-lane communication. This lets threads within a thread group share information without barriers and enables other cross-lane operations for threads in the same wave. When working with wave intrinsics on Gen11, consider the following:
  • Do not write shaders that assume a specific machine width. On Gen architecture, the wave width varies from shader to shader (SIMD8, SIMD16, or SIMD32) and is chosen by heuristics in the shader compiler. Algorithms that depend on wave size should therefore query it with intrinsics such as WaveGetLaneCount().
  • Wave operations can reduce memory bandwidth by letting a thread read data that other threads already hold in registers, instead of storing results to memory and re-loading them. They are a good fit for optimizing operations such as texture mipmap generation.
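
The two points above can be illustrated with a minimal HLSL compute shader sketch (Shader Model 6.0 or later). It is not taken from this guide: the resource names gInput and gGroupSums, their register slots, the entry point CSMain, and the group size of 64 are all assumptions chosen for illustration, and the dispatch is assumed to cover exactly the length of the input buffer. Each wave sums its values in registers with WaveActiveSum(), and only one lane per wave touches shared memory, so the code does not depend on whether the compiler picked SIMD8, SIMD16, or SIMD32.

```hlsl
// Sketch only: resource names, register slots, and GROUP_SIZE are
// hypothetical and not part of the original guide.
StructuredBuffer<uint>   gInput     : register(t0); // one value per thread
RWStructuredBuffer<uint> gGroupSums : register(u0); // one sum per thread group

#define GROUP_SIZE 64

groupshared uint sGroupSum;

[numthreads(GROUP_SIZE, 1, 1)]
void CSMain(uint3 dtid : SV_DispatchThreadID,
            uint  gi   : SV_GroupIndex,
            uint3 gid  : SV_GroupID)
{
    // Clear the shared accumulator once per group.
    if (gi == 0)
    {
        sGroupSum = 0;
    }
    GroupMemoryBarrierWithGroupSync();

    // Sum across all active lanes of this wave. The exchange happens in
    // registers, and nothing here assumes a particular wave width.
    uint waveSum = WaveActiveSum(gInput[dtid.x]);

    // One shared-memory atomic per wave instead of one per thread.
    if (WaveIsFirstLane())
    {
        InterlockedAdd(sGroupSum, waveSum);
    }
    GroupMemoryBarrierWithGroupSync();

    // A single thread writes the group's total to global memory.
    if (gi == 0)
    {
        gGroupSums[gid.x] = sGroupSum;
    }
}
```

This sketch avoids querying the wave width at all. For an algorithm that does need it, for example to size a per-wave scratch array, the value returned by WaveGetLaneCount() should be used rather than a hard-coded 8, 16, or 32.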

