Tutorial

  • 04/11/2022

Improving Performance with Interprocedural Optimization

The compiler may be able to perform additional optimizations if it can optimize across source file boundaries. These may include, but are not limited to, function inlining. This is enabled with the /Qipo option.
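As a rough illustration of why crossing file boundaries matters, the sketch below shows a caller and a small callee that would normally sit in separate source files. It is shown as one listing for brevity; the file names and the functions `dot_row` and `matvec_sketch` are hypothetical and are not taken from the tutorial's sample. When each file is compiled on its own, the compiler sees only a declaration of `dot_row` while compiling the caller, so it cannot inline the call. With /Qipo it may be able to.

```c
#include <stddef.h>

/* --- would live in one source file (hypothetical multiply.c) --- */
/* Dot product of one matrix row with a vector. */
double dot_row(const double *row, const double *x, size_t n)
{
    double sum = 0.0;
    for (size_t j = 0; j < n; j++)
        sum += row[j] * x[j];
    return sum;
}

/* --- would live in another source file (hypothetical driver.c) --- */
/* Computes b = A * x for a row-major matrix with `rows` rows and `cols` columns. */
void matvec_sketch(const double *a, const double *x, double *b,
                   size_t rows, size_t cols)
{
    for (size_t i = 0; i < rows; i++)
        b[i] = dot_row(a + i * cols, x, cols);  /* candidate for cross-file inlining */
}
```

Once the callee is inlined, the inner loop becomes visible in the caller's context, which is the situation the "inlined into" entries of an optimization report describe.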
Rebuild the program using the /Qipo option to enable interprocedural optimization. Select Optimization [Intel C++] > Interprocedural Optimization > Multi-file(/Qipo).
Note that the vectorization report now appears in ipo_out.optrpt.
LOOP BEGIN at Driver.c(152,9)
   Driver.c(152,9):remark #15542: loop was not vectorized: inner loop was already vectorized

   LOOP BEGIN at Multiply.c(37,5) inlined into Driver.c(150,9)
      Multiply.c(37,5):remark #15542: loop was not vectorized: inner loop was already vectorized

      LOOP BEGIN at Multiply.c(49,9) inlined into Driver.c(150,9)
         Multiply.c(50,13):remark #15388: vectorization support: reference a[0][i][j] has aligned access
         Driver.c(150,9):remark #15388: vectorization support: reference x[j] has aligned access
         Multiply.c(49,9):remark #15305: vectorization support: vector length 2
         Multiply.c(49,9):remark #15399: vectorization support: unroll factor set to 4
         Multiply.c(49,9):remark #15309: vectorization support: normalized vectorization overhead 0.594
         Multiply.c(49,9):remark #15300: LOOP WAS VECTORIZED
         Multiply.c(49,9):remark #15448: unmasked aligned unit stride loads: 2
         Multiply.c(49,9):remark #15475: --- begin vector cost summary ---
         Multiply.c(49,9):remark #15476: scalar cost: 9
         Multiply.c(49,9):remark #15477: vector cost: 4.000
         Multiply.c(49,9):remark #15478: estimated potential speedup: 2.000
         Multiply.c(49,9):remark #15488: --- end vector cost summary ---
      LOOP END

      LOOP BEGIN at Multiply.c(49,9) inlined into Driver.c(150,9)
      Remainder loop for vectorization
         Multiply.c(50,13):remark #15388: vectorization support: reference a[0][i][j] has aligned access
         Driver.c(150,9):remark #15388: vectorization support: reference x[j] has aligned access
         Multiply.c(49,9):remark #15335: remainder loop was not vectorized: vectorization possible but seems inefficient. Use vector always directive or /Qvec-threshold0 to override
         Multiply.c(49,9):remark #15305: vectorization support: vector length 2
         Multiply.c(49,9):remark #15309: vectorization support: normalized vectorization overhead 2.417
      LOOP END
   LOOP END
LOOP END
Your line and column numbers may be different.
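The report lists the inner loop twice: once as the vectorized loop and once as a remainder loop. A vector length of 2 with an unroll factor of 4 suggests that each trip of the main loop covers roughly 8 elements, with any leftover elements handled separately. The scalar sketch below only mimics that shape to make the split visible. It is not the code the compiler generates, and the function name `dot_chunked` and the constants are illustrative assumptions.

```c
#include <stddef.h>

/* Dot product written as a chunked main loop plus a remainder loop. */
double dot_chunked(const double *a, const double *x, size_t n)
{
    /* Assumed from the report: vector length 2, unroll factor 4. */
    enum { VLEN = 2, UNROLL = 4, STEP = VLEN * UNROLL };
    double partial[STEP] = {0.0};
    size_t j = 0;

    /* Main loop: STEP elements per trip, standing in for the
       vectorized and unrolled loop. */
    for (; j + STEP <= n; j += STEP)
        for (size_t k = 0; k < STEP; k++)
            partial[k] += a[j + k] * x[j + k];

    /* Combine the per-lane partial sums. */
    double sum = 0.0;
    for (size_t k = 0; k < STEP; k++)
        sum += partial[k];

    /* Remainder loop: fewer than STEP elements are left,
       so they are processed one at a time. */
    for (; j < n; j++)
        sum += a[j] * x[j];

    return sum;
}
```

For example, with n = 11 the main loop runs once over 8 elements and the remainder loop handles the last 3.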
Now, run the executable and record the execution time.
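If your build does not print a timing of its own, one simple way to record the execution time is to wrap the work in calls to the standard `clock()` function. The sketch below is a minimal example under that assumption. `time_calls` and `sample_work` are hypothetical names, and `sample_work` is only a stand-in for the computation you want to measure.

```c
#include <time.h>

/* Keeps the result observable so the loop is not optimized away. */
static volatile double sink;

/* Hypothetical stand-in workload; replace with the code to be measured. */
void sample_work(void)
{
    double sum = 0.0;
    for (int i = 1; i <= 100000; i++)
        sum += 1.0 / i;
    sink = sum;
}

/* Returns the time in seconds, as reported by clock(),
   taken by `repeats` calls to work(). */
double time_calls(void (*work)(void), int repeats)
{
    clock_t start = clock();
    for (int r = 0; r < repeats; r++)
        work();
    clock_t stop = clock();
    return (double)(stop - start) / CLOCKS_PER_SEC;
}
```

Calling `time_calls(sample_work, 1000)` before and after enabling /Qipo gives two numbers to compare. Use the same repeat count and the same machine for both runs.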

Product and Performance Information

Performance varies by use, configuration and other factors. Learn more at www.Intel.com/PerformanceIndex.