Vectorize a Loop Using the _Simd Keyword
In this section we introduce the _Simd keyword, which provides an alternative to the simd pragma. Just like the simd pragma, the _Simd keyword modifies a serial for loop for vectorization. The syntax is as follows:
_Simd [_Safelen(constant-expression)][_Reduction (reduction-identifier : list)]
The _Simd keyword and any clauses should come after the for keyword, as in this example:
for _Simd (int i=0; i<10; i++){
    // loop body
}
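The following is a minimal sketch of how a clause is attached, not a definitive implementation. It assumes that _Safelen has the same meaning as the safelen clause of the OpenMP* simd construct, which the text above does not state explicitly. Because _Simd needs a compiler that supports the keyword, the keyword form appears only in a comment and the compiled loop uses the pragma counterpart. The function name shift_add is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: each element becomes the element four positions
 * back plus one, so iteration i depends on iteration i-4. */
void shift_add(float *a, int n)
{
    assert(a != NULL && n >= 0);

    /* Keyword form described in this section (shown as a comment only,
     * since it requires a compiler that supports _Simd):
     *
     *   for _Simd _Safelen(4) (int i = 4; i < n; i++) { ... }
     *
     * Pragma counterpart. Assumption: safelen(4) tells the compiler that
     * iterations 4 or more apart are never executed in the same SIMD
     * chunk, which matches the dependence distance of this loop. */
    #pragma omp simd safelen(4)
    for (int i = 4; i < n; i++) {
        a[i] = a[i - 4] + 1.0f;
    }
}
```

A compiler that does not enable OpenMP SIMD support typically ignores the pragma and runs the loop serially, which produces the same values.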
Differences between the simd pragma and the _Simd keyword:
- The private and lastprivate clauses of the simd pragma construct are omitted because C and C++ already have variable-scoping rules that allow a programmer to cleanly declare a private variable within the scope of a loop iteration.
- The linear clause is omitted because the ability to increment multiple variables in the loop's increment expression makes it unnecessary. See the following example:
float add_floats(float *a, float *b, int n){
    int i=0;
    int j=0;
    float sum=0;
    for _Simd _Reduction(+:sum) (i=0; i<n; i++, j+=2){
        a[i] = a[i] + b[j];
        sum += a[i];
    }
    return sum;
}
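For comparison, here is a sketch of how the same computation might be written with the OpenMP* simd pragma, where the linear clause is needed. This is an illustration under assumptions rather than the documented equivalent: the function name add_floats_pragma is hypothetical, and the pragma spelling is the standard OpenMP one. In the pragma form the second variable cannot be advanced in the increment expression, so j is incremented in the loop body and declared linear with a step of 2.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical pragma-based counterpart of add_floats.
 * b must hold at least 2*n - 1 elements because j advances by 2. */
float add_floats_pragma(float *a, const float *b, int n)
{
    assert(a != NULL && b != NULL && n >= 0);

    int j = 0;
    float sum = 0.0f;

    /* linear(j:2) states that j grows by 2 per iteration; the _Simd
     * keyword form expresses the same thing with "i++, j+=2". */
    #pragma omp simd reduction(+:sum) linear(j:2)
    for (int i = 0; i < n; i++) {
        a[i] = a[i] + b[j];
        sum += a[i];
        j += 2;
    }
    return sum;
}
```

With a = {1, 2, 3} and b = {10, 0, 20, 0, 30, 0}, the loop reads b[0], b[2], and b[4], so a becomes {11, 22, 33} and the function returns 66.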
To ensure that your loop is vectorized, keep the following in mind:
- The countable loop for the _Simd keyword has to conform to the for-loop style of an OpenMP* canonical loop form, except that multiple variables may be incremented in the incr-expr (see the OpenMP* specification at www.openmp.org).
- The loop control variable must be a signed integer type.
- The vector values should be signed 8-, 16-, 32-, or 64-bit integers, single- or double-precision floating point numbers, or single- or double-precision complex numbers.
- You cannot use any control constructs to jump into or out of a SIMD loop. That includes the break, return, goto, and throw constructs.
- A SIMD loop may contain another loop (for, while, do-while) in it, but goto out of such inner loops is not supported. You may use break and continue with the inner loop.
- A SIMD loop performs memory references unconditionally. Therefore, all address computations must result in valid memory addresses, even though such locations may not be accessed if the loop is executed sequentially.
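The last point can be sketched as follows. This is one possible way to keep every address valid, not a prescribed technique. The function name gather_sum and the convention that a negative index means "skip this entry" are assumptions made for illustration, and the loop uses the OpenMP* simd pragma as a stand-in because _Simd requires a compiler that supports the keyword. Instead of guarding the load with an if, the index is replaced by a known-good one so that the load is valid on every iteration, and the unwanted value is discarded afterward.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: sums table[idx[i]] for every idx[i] >= 0.
 * Assumption: table has at least one element, so index 0 is always valid. */
float gather_sum(const float *table, const int *idx, int n)
{
    assert(table != NULL && idx != NULL && n >= 0);

    float sum = 0.0f;

    #pragma omp simd reduction(+:sum)
    for (int i = 0; i < n; i++) {
        /* Variables declared in the loop body are private to each
         * iteration, so no private clause is needed. */
        int valid = idx[i] >= 0;
        int safe = valid ? idx[i] : 0;  /* always a valid index */
        float v = table[safe];          /* load is valid on every iteration */
        sum += valid ? v : 0.0f;        /* drop the value for skipped entries */
    }
    return sum;
}
```

A version that wrote "if (idx[i] >= 0) sum += table[idx[i]];" would be correct when run sequentially, but it relies on the load being skipped for negative indices, which is exactly what a SIMD loop does not guarantee.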