Product Version: Intel® Fortran Compiler 15.0 and later versions
Cause:
A vectorizable loop contains loads from locations that are not contiguous in memory (sometimes known as a “gather”). These may be indexed loads, as in the example below, or loads with non-unit stride. Instead of using a hardware gather instruction, the compiler has emulated one in software by issuing an individual load for each memory location.
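The example later in this article shows only the indexed case. As an illustration of the other case, a load with non-unit stride, here is a minimal sketch. The subroutine name strided and the stride of 2 are hypothetical and are not part of the original example. Whether the compiler emits remark #15328 for this loop is an assumption that depends on the compiler version and target architecture.

```fortran
subroutine strided(n, a, b)
  implicit none
  integer, parameter :: RT=8
  integer, intent(in) :: n
  real(RT), dimension(2*n), intent(in)  :: a
  real(RT), dimension(n),   intent(out) :: b
  integer :: i
  do i=1,n
     ! a is read with stride 2: consecutive iterations load
     ! elements that are not adjacent in memory
     b(i) = 1.0_RT + 0.1_RT*a(2*i)
  enddo
end subroutine strided
```

The stores to b remain contiguous; only the loads from a are non-contiguous.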
The vectorization report is generated using the Intel® Fortran Compiler's optimization and vectorization report options:
Windows* OS: /O2 /Qopt-report:4 /Qopt-report-phase:vec
Linux* OS or OS X*: -O2 -qopt-report=4 -qopt-report-phase=vec
Example:
The example below generates remark #15328 in the optimization report:
subroutine gathr(n, a, b, index)
implicit none
integer, parameter :: RT=8
integer, intent(in) :: n
integer, dimension(n), intent(in) :: index
real(RT), dimension(n), intent(in) :: a
real(RT), dimension(n), intent(out) :: b
integer :: i
do i=1,n
b(i) = 1.0_RT + 0.1_RT*a(index(i))
enddo
end subroutine gathr
$ ifort -c -qopt-report=4 -qopt-report-file=stdout gathr.f90
With Intel® Fortran Compiler version 16.0, the following remark is generated:
[...]
remark #15328: vectorization support: gather was emulated for the variable a: indirect access [ gathr.f90(11,31) ]
[...]
With Intel® Fortran Compiler version 17.0, the following remark is generated:
[...]
remark #15328: vectorization support: irregularly indexed load was emulated for the variable <a(index(i))>, part of index is read from memory [ gathr.f90(11,31) ]
[...]
In both cases, the remark means that the compiler has vectorized the loop by emulating a “gather” instruction in software.
The generated assembly code contains no hardware gather instructions.
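To make the idea of an emulated gather concrete, the following is a conceptual sketch written in Fortran source rather than actual compiler output. The subroutine name gathr_emulated, the temporary array tmp, and the vector length VL=4 are all hypothetical; the real vector length and instruction sequence depend on the target architecture and compiler version.

```fortran
subroutine gathr_emulated(n, a, b, index)
  implicit none
  integer, parameter :: RT=8
  integer, parameter :: VL=4        ! assumed vector length, for illustration only
  integer, intent(in) :: n
  integer, dimension(n), intent(in) :: index
  real(RT), dimension(n), intent(in) :: a
  real(RT), dimension(n), intent(out) :: b
  real(RT), dimension(VL) :: tmp    ! stands in for a vector register
  integer :: i, k, nvec

  nvec = n - mod(n, VL)             ! iterations covered by full vectors
  do i=1,nvec,VL
     ! emulated gather: one scalar load per vector lane
     do k=1,VL
        tmp(k) = a(index(i+k-1))
     enddo
     ! the arithmetic and the store to b are contiguous vector operations
     b(i:i+VL-1) = 1.0_RT + 0.1_RT*tmp
  enddo
  ! remainder iterations handled one at a time
  do i=nvec+1,n
     b(i) = 1.0_RT + 0.1_RT*a(index(i))
  enddo
end subroutine gathr_emulated
```

The sketch computes the same result as gathr; it only spells out that the loads from a are issued individually while the remaining work operates on whole vectors.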
See also:
Requirements for Vectorizable Loops
Vectorization and Optimization Reports