Tutorial

  • 04/11/2022

Improving Performance by Pointer Disambiguation

Two pointers are aliased if both point to the same memory location. Storing to memory through a pointer that might be aliased may prevent some optimizations. For example, it may create a dependency between loop iterations that would make vectorization unsafe. Aliasing is not the only source of potential dependencies. In fact, Multiply.c does have other dependencies. In this case, however, removing the dependency created by aliasing allows the compiler to resolve the other loop dependency.
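To make the loop-carried dependency concrete, here is a minimal sketch. The function add_one is a hypothetical helper invented for illustration and is not part of Multiply.c. When its two pointers overlap, a store in one iteration changes the value that a later iteration reads.

```c
#include <stddef.h>

/* Hypothetical helper (not from Multiply.c): writes src[i] + 1 into dst[i]. */
void add_one(float *dst, const float *src, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        /* If dst and src overlap, this store can modify an element that a
           later iteration loads, so the iterations are not independent. */
        dst[i] = src[i] + 1.0f;
    }
}
```

Calling add_one(buf + 1, buf, 4) on a five-element array of zeros leaves buf holding 0, 1, 2, 3, 4, because each iteration reads the value the previous one just stored. A version that loaded all four inputs before storing would produce 0, 1, 1, 1, 1 instead, which is why the compiler cannot vectorize such a loop unless it can rule out the overlap.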
Sometimes, the compiler can generate both a vectorized and a non-vectorized version of a loop and test for aliasing at runtime to select the appropriate code path. If you know that pointers do not alias and inform the compiler, it can avoid the runtime check and generate a single vectorized code path. In Multiply.c, the compiler generates runtime checks to determine whether or not the pointer b in function matvec(FTYPE a[][COLWIDTH], FTYPE b[], FTYPE x[]) is aliased to either a or x. If Multiply.c is compiled with the NOALIAS macro defined, the restrict qualifier on the argument b informs the compiler that the pointer does not alias any other pointer, and in particular that the array b does not overlap a or x.
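The following sketch shows one way a macro such as NOALIAS could switch the restrict qualifier on and off. It is an illustration under stated assumptions, not the actual contents of Multiply.c: the RESTRICT helper macro, the function name matvec_sketch, the rows parameter, and the default values for FTYPE and COLWIDTH are all invented here.

```c
/* Assumed defaults for this sketch; the real Multiply.c supplies its own. */
#ifndef FTYPE
#define FTYPE double
#endif
#ifndef COLWIDTH
#define COLWIDTH 4
#endif

/* Hypothetical helper macro: expands to the C99 restrict qualifier only
   when NOALIAS is defined at compile time. */
#ifdef NOALIAS
#define RESTRICT restrict
#else
#define RESTRICT
#endif

/* Computes b = a * x for a matrix with `rows` rows and COLWIDTH columns. */
void matvec_sketch(int rows, FTYPE a[][COLWIDTH], FTYPE *RESTRICT b, FTYPE x[])
{
    for (int i = 0; i < rows; i++) {
        b[i] = 0;
        for (int j = 0; j < COLWIDTH; j++) {
            /* With restrict on b, the compiler may assume this store never
               changes a[i][j] or x[j]. */
            b[i] += a[i][j] * x[j];
        }
    }
}
```

The qualifier is a promise made by the programmer, not something the compiler verifies. Calling such a function with a b that does overlap a or x while NOALIAS is defined results in undefined behavior.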
The restrict qualifier requires either the /Qrestrict compiler option for .c or .cpp files, or the /Qstd=c99 compiler option for .c files.
Remove the NOFUNCCALL preprocessor definition to reinsert the call to matvec(). Add the NOALIAS preprocessor definition to the compiler options.
Rebuild your project, run the executable, and record the execution time reported in the output.
Multiply.optrpt shows:

    LOOP BEGIN at Multiply.c(37,5)
       Multiply.c(37,5):remark #15542: loop was not vectorized: inner loop was already vectorized

       LOOP BEGIN at Multiply.c(49,9)
          Peeled loop for vectorization
       LOOP END

       LOOP BEGIN at Multiply.c(49,9)
          Multiply.c(49,9):remark #15300: LOOP WAS VECTORIZED
       LOOP END

       LOOP BEGIN at Multiply.c(49,9)
          Alternate Alignment Vectorized Loop
       LOOP END

       LOOP BEGIN at Multiply.c(49,9)
          Remainder loop for vectorization
       LOOP END
    LOOP END

Your line and column numbers may be different.
Now that the compiler has been told that the arrays do not overlap, it uses idiom-recognition to resolve the loop dependency and proceeds to vectorize the loop.
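One way to picture the remaining dependency is as a sum reduction, assuming the inner loop accumulates into b[i] as a matrix-vector product typically does. Once b is known not to overlap a or x, the repeated update of b[i] behaves like a running total that can be kept in a register. The hand-written sketch below shows that equivalent form. It is not compiler output, and the names matvec_reduced and NCOLS are hypothetical.

```c
#define NCOLS 4   /* assumed column count for this sketch */

/* Hypothetical reduction form of a matrix-vector product: b = a * x. */
void matvec_reduced(int rows, double a[][NCOLS], double *restrict b,
                    const double *x)
{
    for (int i = 0; i < rows; i++) {
        double sum = 0.0;              /* running total for row i */
        for (int j = 0; j < NCOLS; j++) {
            sum += a[i][j] * x[j];     /* no store to b inside the inner loop */
        }
        b[i] = sum;                    /* one store per row */
    }
}
```

In this form the only value carried between inner-loop iterations is the accumulator, which a vectorizer can split into partial sums and combine after the loop.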
