Intel® oneAPI DPC++/C++ Compiler
Developer Guide and Reference
High-Level Optimization
High-level Optimizations (HLO) exploit the properties of source code constructs (for example, loops and arrays) in applications developed in high-level programming languages. While the default optimization level, option O2, performs some high-level optimizations, specifying the O3 option provides the best chance for performing loop transformations to optimize memory accesses.
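As a rough illustration of a loop transformation that improves memory access, the sketch below applies loop interchange by hand to a row-major matrix. The function names are hypothetical, and the code only shows the before and after shape of the loop nest. Whether the compiler performs this transformation on a given loop depends on its own analysis, which is not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Scales an n x n row-major matrix. The inner loop walks down a column,
// so consecutive accesses are n elements apart in memory.
void scale_column_order(std::vector<double>& a, std::size_t n, double factor) {
    assert(a.size() == n * n);
    for (std::size_t j = 0; j < n; ++j)
        for (std::size_t i = 0; i < n; ++i)
            a[i * n + j] *= factor;
}

// Same computation after interchanging the two loops. The inner loop now
// walks along a row, so consecutive accesses are adjacent in memory.
void scale_row_order(std::vector<double>& a, std::size_t n, double factor) {
    assert(a.size() == n * n);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            a[i * n + j] *= factor;
}
```

Each iteration touches a different element and no iteration depends on another, so the interchange is legal and both functions produce identical results.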
Loop optimizations may result in calls to library routines that can deliver a greater performance gain on Intel® microprocessors than on non-Intel microprocessors. More HLO transformations may be performed for Intel® microprocessors than for non-Intel microprocessors.
Within HLO, loop transformation techniques include:
- Loop Permutation or Interchange 
- Loop Distribution 
- Loop Fusion 
- Loop Unrolling 
- Data Prefetching 
- Scalar Replacement 
- Unroll and Jam 
- Loop Blocking or Tiling 
- Partial-Sum Optimization 
- Predicate Optimization 
- Loop Reversal 
- Profile-Guided Loop Unrolling 
- Loop Peeling 
- Data Transformation: Malloc Combining and Memset Combining, Memory Layout Change 
- Loop Rerolling 
- Memset and Memcpy Recognition 
- Statement Sinking for Creating Perfect Loop Nests
- Multiversioning: Checks include Dependency of Memory References and Trip Counts
- Loop Collapsing
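
Two of the techniques listed above, loop fusion and loop blocking (tiling), can be sketched by hand as follows. This is a minimal illustration with hypothetical function names and an arbitrary default tile size of 32. It shows the general shape of each transformation, not the code the compiler actually generates.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Before fusion: two separate passes over the same index range.
std::vector<float> scale_offset_separate(std::vector<float>& x, float s, float o) {
    std::vector<float> y(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) x[i] *= s;
    for (std::size_t i = 0; i < x.size(); ++i) y[i] = x[i] + o;
    return y;
}

// After fusion: one pass, so x[i] is reused while it is still in a register.
std::vector<float> scale_offset_fused(std::vector<float>& x, float s, float o) {
    std::vector<float> y(x.size());
    for (std::size_t i = 0; i < x.size(); ++i) {
        x[i] *= s;
        y[i] = x[i] + o;
    }
    return y;
}

// Plain transpose of an n x n row-major matrix: writes to t are strided.
std::vector<double> transpose_naive(const std::vector<double>& a, std::size_t n) {
    assert(a.size() == n * n);
    std::vector<double> t(n * n);
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            t[j * n + i] = a[i * n + j];
    return t;
}

// Blocked (tiled) transpose: the iteration space is split into
// tile x tile blocks so each block's reads and writes stay cache-resident.
std::vector<double> transpose_blocked(const std::vector<double>& a, std::size_t n,
                                      std::size_t tile = 32) {
    assert(a.size() == n * n);
    assert(tile > 0);
    std::vector<double> t(n * n);
    for (std::size_t ii = 0; ii < n; ii += tile)
        for (std::size_t jj = 0; jj < n; jj += tile)
            // std::min handles partial tiles at the matrix edges.
            for (std::size_t i = ii; i < std::min(ii + tile, n); ++i)
                for (std::size_t j = jj; j < std::min(jj + tile, n); ++j)
                    t[j * n + i] = a[i * n + j];
    return t;
}
```

In both pairs the transformed version computes the same values as the original. Only the order of memory accesses changes, which is what these loop transformations are meant to improve.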