User Defined Induction in OpenMP* with Intel® C++ Compiler
Published: 09/11/2018
Last Updated: 10/04/2018
Contents
- Induction Overview
- User Defined Induction (UDI)
- Parallelization Example
- Vectorization Example
Induction Overview
Intel® C++ Compiler 19.0 update 1 supports General Induction, a proposed OpenMP* 5.0 feature as an extension to the existing linear clause. With the linear clause, OpenMP provides a way to specify linear inductive variables with respect to the loop index. However, there are significant limitations with the linear clause: the variables are restricted to be of integral or pointer types, the step has to be of integral type, and the induction operation is limited to addition. The newly proposed induction clause provides a mechanism to express general induction that allows more data types and induction operations, including user-defined types and operations.
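For reference, the existing linear clause that the induction clause extends can be sketched as follows. This is a minimal illustration, not code from the compiler documentation: the function name gather_even is an assumption, and the pragma uses only the standard simd construct with linear(j : 2). When OpenMP is not enabled, the pragma is ignored and the loop runs serially with the same result.

```cpp
#include <cassert>

// Hypothetical helper: copies every second element of src into dst,
// i.e. dst[i] = src[2 * i] for i in [0, n).
void gather_even(const int* src, int* dst, int n) {
    assert(n >= 0);
    int j = 0;  // linear in the loop index: j == 2 * i at the start of iteration i
    #pragma omp simd linear(j : 2)
    for (int i = 0; i < n; i++) {
        dst[i] = src[j];
        j += 2;  // integral variable, integral step, addition only
    }
}
```

Here j satisfies all three restrictions of the linear clause: it is an integer, the step is an integer, and the update is an addition.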
The induction clause with the following syntax is to be used with the OpenMP loop, distribute and simd constructs:
induction(induction-id:list:step)
- induction-id: a built-in operator (+, -, *, /) or a user-defined identifier
- list: the induction variables, which can be of integral, floating-point, or non-POD types
- step: the step expression, which must be supported by the induction operator
Here are some examples of the induction clause:
- induction( + : x, y : 1 ) is equivalent to linear( x, y ) if x, y are integral
- induction( * : x : s ) describes the nonlinear induction x_{i} = x_{i-1} * s
- induction( foo : x : s ) uses a user-defined induction operator “foo”; x and s can be of different non-POD types
Note: The induction clause can also be used in the short form induction(list[:step]), similar to the existing linear clause. The omitted induction-id defaults to the language's built-in + operator, so the variables in list must be of a type supported by the built-in +. If step is omitted, it is assumed to be 1. This makes it easy to port existing code that uses the linear clause.
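What the built-in + and * induction operators assert about a variable can be sketched in plain C++, without the clause itself. The function names below are assumptions made for illustration. The idea is that the value at the start of iteration i is available in closed form, so an iteration does not have to wait for the preceding ones.

```cpp
#include <cassert>
#include <cmath>

// induction( + : x : s ): value of x at the start of iteration i is x_0 + i * s.
float additive_at(float x0, float s, int i) {
    assert(i >= 0);
    return x0 + static_cast<float>(i) * s;
}

// induction( * : x : s ): value of x at the start of iteration i is x_0 * s^i.
float multiplicative_at(float x0, float s, int i) {
    assert(i >= 0);
    return x0 * std::pow(s, static_cast<float>(i));
}

// Reference: run the loop-carried update i times, as a serial loop would.
float stepped(float x0, float s, int i, bool multiply) {
    assert(i >= 0);
    float x = x0;
    for (int k = 0; k < i; k++) {
        x = multiply ? x * s : x + s;
    }
    return x;
}
```

For floating-point types the closed form and the stepped value can differ by rounding, so comparisons between the two should use a tolerance.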
The following example uses induction to evaluate a polynomial, i.e., to compute $\sum_{i=0}^{N} c_i x^i$:
#define N 10
int main()
{
    float c[N + 1] = {0};  // coefficients c_0..c_N (N + 1 values; set these before use)
    float x = 1.23F;       // value of x at which to evaluate the polynomial
    float xi = 1.0F;       // x^i; initial value x^0 == 1
    float value = 0.0F;    // accumulator for the result
    #pragma omp simd reduction(+:value) induction(* : xi : x)
    for (int i = 0; i <= N; i++) {  // N + 1 iterations, one per coefficient
        value += c[i] * xi;
        xi *= x;
    }
    return 0;
}
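One way to picture how the reduction and induction clauses cooperate in the polynomial loop is to emulate the split by hand. The sketch below does not use the induction clause. It divides the iterations into chunks, starts each chunk from the closed form x^lb, and sums the partial results. The names poly_chunked and poly_horner and the chunking scheme are assumptions for illustration, not a description of what the compiler generates.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Evaluates sum_{i=0}^{n} c[i] * x^i, where c holds n + 1 coefficients,
// by processing the iterations in independent chunks of the given size.
float poly_chunked(const std::vector<float>& c, float x, int chunk) {
    assert(chunk > 0);
    const int count = static_cast<int>(c.size());
    float value = 0.0F;
    for (int lb = 0; lb < count; lb += chunk) {
        const int ub = (lb + chunk < count) ? lb + chunk : count;
        float xi = std::pow(x, static_cast<float>(lb));  // closed-form start: x^lb
        float partial = 0.0F;
        for (int i = lb; i < ub; i++) {
            partial += c[i] * xi;
            xi *= x;  // same loop-carried update as in the original loop
        }
        value += partial;  // plays the role of reduction(+ : value)
    }
    return value;
}

// Reference evaluation with Horner's rule.
float poly_horner(const std::vector<float>& c, float x) {
    float value = 0.0F;
    for (int i = static_cast<int>(c.size()) - 1; i >= 0; i--) {
        value = value * x + c[i];
    }
    return value;
}
```

For example, with coefficients {1, 2, 3, 4} and x = 2, both functions evaluate 1 + 2*2 + 3*4 + 4*8.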
User Defined Induction (UDI)
To express induction beyond the built-in operators and data types, a declare induction directive is proposed that is syntactically similar to the declare reduction directive:
#pragma omp declare induction ( induction-id : induction-type : step-type : inductor ) [collector( collector )]
- induction-id : identifier for the operation, to be used in an induction clause
- induction-type : type specifier for the induction variables
- step-type : type specifier for the step expression
- inductor: specifies the inductive operation: x = x + s
- Uses keywords omp_out to represent x and omp_step for s
- C++ Example: omp_out = omp_out + omp_step, where + is overloaded
- C Example: add(&omp_out, omp_step)
- collector: computes the combined step used by the closed form x_i = x_0 + ( s * i )
- Uses keywords omp_step to represent s and omp_index for i
- C++ Example: omp_step = omp_step * omp_index, where * may be overloaded
- C Example: cs(&omp_step, omp_index)
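The relationship between the inductor and the collector can be sketched with ordinary C++ templates. The names value_at and value_by_stepping are hypothetical, and the callables stand in for the expressions written with omp_out, omp_step and omp_index. This reflects one reading of the proposal: the collector folds the index into the step, and a single application of the inductor then yields the value at that iteration.

```cpp
#include <cassert>

// Closed form: x_i = inductor(x_0, collector(s, i)).
template <typename T, typename Step, typename Inductor, typename Collector>
T value_at(T x0, Step s, int i, Inductor inductor, Collector collector) {
    assert(i >= 0);
    Step combined = collector(s, i);  // corresponds to omp_step = omp_step * omp_index
    return inductor(x0, combined);    // corresponds to omp_out = omp_out + omp_step
}

// Reference: apply the inductor i times with the original step.
template <typename T, typename Step, typename Inductor>
T value_by_stepping(T x0, Step s, int i, Inductor inductor) {
    assert(i >= 0);
    T x = x0;
    for (int k = 0; k < i; k++) {
        x = inductor(x, s);
    }
    return x;
}
```

For an induction to be well defined, both functions should agree for every index, which is the property a user-supplied collector is expected to preserve.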
Parallelization Example
The C parallelization example below uses UDI to express an induction on the variable a, which has the struct type A. The inductor is the function add. Because a collector is provided, each thread's initial value of a is computed in closed form by calling add(&a, 5*lb), where lb is the lower bound of that thread's range of the loop index.
typedef struct{ float x; int y; } A;
void add(A *a, int st) { a->x += st; a->y += st; }
#pragma omp declare induction( op : A : int : add(&omp_out, omp_step)) \
collector( omp_step = omp_index * omp_step )
A a = {12.3, 456};
#pragma omp parallel for induction( op : a : 5 )
for(int i=0; i<N; i++) { work(a); add(&a, 5); }
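The effect described for the parallel loop can be emulated serially to check that every iteration sees the same value of a regardless of how the iterations are divided. The sketch below reuses the struct A and the function add from the example. The function observed_y, the contiguous chunking, and the use of the y field as a stand-in for work(a) are assumptions for illustration.

```cpp
#include <cassert>
#include <vector>

typedef struct { float x; int y; } A;
void add(A* a, int st) { a->x += st; a->y += st; }

// Emulates the loop split into num_chunks contiguous chunks, one per "thread".
// Returns the y field that work(a) would observe at each iteration.
std::vector<int> observed_y(A a0, int step, int n, int num_chunks) {
    assert(n >= 0 && num_chunks > 0);
    std::vector<int> seen(n);
    const int chunk = (n + num_chunks - 1) / num_chunks;  // ceiling division
    for (int t = 0; t < num_chunks; t++) {
        const int lb = t * chunk;
        const int ub = (lb + chunk < n) ? lb + chunk : n;
        A a = a0;            // private copy of the induction variable
        add(&a, step * lb);  // closed-form start value, as in add(&a, 5*lb)
        for (int i = lb; i < ub; i++) {
            seen[i] = a.y;   // stands in for work(a)
            add(&a, step);   // inductor applied once per iteration
        }
    }
    return seen;
}
```

Calling observed_y with one chunk reproduces the serial loop, so comparing it against a call with several chunks checks that the closed-form start values are consistent.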
Vectorization Example
The C++ vectorization example below uses UDI to express an induction involving a non-POD variable and step. The + operator in the inductor and the * operator in the collector are overloaded to support classes A and S.
class A; // class of the induction variable
class S; // class of the step expression
#pragma omp declare induction ( op2 : A : S : omp_out = omp_out + omp_step ) \
collector ( omp_step = omp_index * omp_step )
...
A a; S s; // initialized by constructors
...
#pragma omp simd induction( op2 : a : s )
for(int i=0; i<N; i++) { work(a); a=a+s; }
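The example above only declares A and S. To show the kind of operator overloads the inductor and collector rely on, here is a minimal sketch with hypothetical definitions of both classes. The members pos and delta and the choice of int for the index are assumptions; the original classes may look quite different.

```cpp
#include <cassert>

// Hypothetical step class (non-POD: user-provided constructor, private data).
class S {
public:
    explicit S(int d) : delta_(d) {}
    int delta() const { return delta_; }
private:
    int delta_;
};

// Hypothetical class of the induction variable.
class A {
public:
    explicit A(int p) : pos_(p) {}
    int pos() const { return pos_; }
private:
    int pos_;
};

// Overload used by the inductor: omp_out = omp_out + omp_step.
A operator+(const A& a, const S& s) { return A(a.pos() + s.delta()); }

// Overload used by the collector: omp_step = omp_index * omp_step.
S operator*(int index, const S& s) {
    assert(index >= 0);
    return S(index * s.delta());
}
```

With these overloads, a + 4 * s gives the same object state as applying a = a + s four times, which is the equivalence the collector expresses.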