qopt-assume-no-loop-carried-dep, Qopt-assume-no-loop-carried-dep
Lets you set a level of performance
tuning for loops.
This content is specific to C++; it does not apply to DPC++.
Syntax
Linux:
-qopt-assume-no-loop-carried-dep[=n]
Windows:
/Qopt-assume-no-loop-carried-dep[=n]
Arguments
- n
- Is the action for loop-carried dependencies. Possible values are:
- 0
- The compiler does not assume there are no loop-carried dependencies. This is the default if this option is not specified.
- 1
- Tells the compiler to assume there are no loop-carried dependencies for innermost loops. This is the default if the option is used but n is not specified.
- 2
- Tells the compiler to assume there are no loop-carried dependencies for all loop levels.
Default
- [q or Q]opt-assume-no-loop-carried-dep=0
- The compiler does not assume there are no loop-carried dependencies.
Description
This option lets you set a level of performance tuning
for loops by telling the compiler whether it may assume that
loops have no loop-carried dependencies.
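It is useful for C/C++ applications and benchmarks in which
pointers and arguments could be aliased. When aliasing is possible, the
compiler must conservatively allow for a dependence between iterations.
Specifying level 1 or level 2 removes that assumption, so more loops can be
vectorized or benefit from loop transformations.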
This option is applied to all loops in the file. It does
not apply to code outside loops.
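To make the term concrete, the following minimal sketch contrasts a loop that has a genuine loop-carried dependence with a loop that has one only if its pointers overlap. The function names prefix_sum and scale_add are hypothetical and used only for illustration. The sketch shows the source-level difference; it does not show what the compiler generates under this option.

```cpp
#include <cstddef>

// Genuine loop-carried dependence: iteration i reads a[i - 1], which
// iteration i - 1 has just written. The iterations must run in order.
void prefix_sum(float *a, std::size_t n) {
    for (std::size_t i = 1; i < n; ++i) {
        a[i] += a[i - 1];
    }
}

// Possible dependence only: if dst and src never overlap, every
// iteration is independent. The signature alone does not guarantee
// that, so the compiler has to allow for dst[i] feeding a later src[j].
void scale_add(float *dst, const float *src, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        dst[i] += 2.0f * src[i];
    }
}
```

Loops like scale_add are the kind this option is aimed at. Because the option covers every loop in the file, it is only appropriate when the no-dependence assumption holds for all of them at the chosen level.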
IDE Equivalent
None
Alternate Options
None
Examples
The following loop will not be vectorized because of a possible data
dependence. Specifying [q or Q]opt-assume-no-loop-carried-dep=1 tells the
compiler to assume that no data dependence occurs in this loop, which allows
the loop to be vectorized:
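void sub(float *A, float *B, int *M) {
    // B is read through the index array M, so the compiler cannot prove
    // that B[M[i]] never refers to an element of A written earlier.
    for (int i = 0; i < 10000; i++) {
        A[i] += B[M[i]] + 1;
    }
}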
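The assumption in the example above is valid only if callers never pass arrays that overlap in a way that links iterations. The sketch below illustrates this with a hypothetical variant, sub_n, that takes the trip count as a parameter so the arrays can stay small. The helpers run_distinct and run_aliased are also hypothetical. Both run the loop in order with an ordinary build, so they show the results the source code defines, not what the compiler produces under this option.

```cpp
#include <vector>

// Same loop body as sub(), with the trip count passed in (hypothetical variant).
void sub_n(float *A, float *B, int *M, int n) {
    for (int i = 0; i < n; i++) {
        A[i] += B[M[i]] + 1;
    }
}

// A and B are distinct arrays: no iteration depends on another.
std::vector<float> run_distinct() {
    std::vector<float> A(4, 0.0f), B(4, 10.0f);
    std::vector<int> M = {3, 2, 1, 0};
    sub_n(A.data(), B.data(), M.data(), 4);
    return A;  // every element is 0 + 10 + 1 = 11
}

// B aliases A and M[i] points at the previous element: iteration i
// reads the value that iteration i - 1 wrote, a loop-carried dependence.
std::vector<float> run_aliased() {
    std::vector<float> A(4, 0.0f);
    std::vector<int> M = {0, 0, 1, 2};
    sub_n(A.data(), A.data(), M.data(), 4);
    return A;  // in-order execution gives 1, 2, 3, 4
}
```

For calls like the one in run_distinct the assumption holds. For calls like the one in run_aliased it is false, and the in-order result is no longer something the compiler is asked to preserve.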
In the following example, the matrix multiply kernel will not be
optimized because of possible dependences at every level of the loop nest.
Specifying [q or Q]opt-assume-no-loop-carried-dep=2 results in loop
transformations such as blocking, unroll and jam, and vectorization:
void matmul(double *a, double *b, double *c) {
    int i, j, k;
    int n = 1024;
    for (i = 0; i < n; i++) {
        for (j = 0; j < n; j++) {
            for (k = 0; k < n; k++) {
                c[i * n + j] += a[i * n + k] * b[k * n + j];
            }
        }
    }
}
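Before relying on level 2 for a kernel like this one, it can help to confirm in debug builds that the output matrix shares no storage with the inputs. The following sketch is one possible check and is not part of the compiler or its documentation. The names ranges_overlap and matmul_args_independent are hypothetical, and the address comparison assumes a flat address space.

```cpp
#include <cstddef>
#include <cstdint>

// True when the byte ranges [p, p + p_bytes) and [q, q + q_bytes) intersect.
// Addresses are compared as integers, which assumes a flat address space.
bool ranges_overlap(const void *p, std::size_t p_bytes,
                    const void *q, std::size_t q_bytes) {
    const auto a = reinterpret_cast<std::uintptr_t>(p);
    const auto b = reinterpret_cast<std::uintptr_t>(q);
    return a < b + q_bytes && b < a + p_bytes;
}

// True when the n x n output matrix c shares no storage with a or b,
// so writes to c cannot feed later reads of a or b. The inputs a and b
// may overlap each other because the kernel only reads them.
bool matmul_args_independent(const double *a, const double *b,
                             const double *c, std::size_t n) {
    const std::size_t bytes = n * n * sizeof(double);
    return !ranges_overlap(c, bytes, a, bytes) &&
           !ranges_overlap(c, bytes, b, bytes);
}
```

A debug build could, for example, place assert(matmul_args_independent(a, b, c, 1024)) at the top of matmul. The check covers aliasing between the arguments only; it says nothing about other loops in the same file, which the option also affects.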