Intel® oneAPI DPC++/C++ Compiler Developer Guide and Reference

ID 767253
Date 9/08/2022
Public

qopt-assume-no-loop-carried-dep, Qopt-assume-no-loop-carried-dep

Lets you set a level of performance tuning for loops.

Syntax

Linux:

-qopt-assume-no-loop-carried-dep[=n]

Windows:

/Qopt-assume-no-loop-carried-dep[=n]

Arguments

n

Is the action for loop-carried dependencies. Possible values are:

0

The compiler does not assume there are no loop-carried dependencies. This is the default if this option is not specified.

1

Tells the compiler to assume there are no loop-carried dependencies for innermost loops. This is the default if the option is used but n is not specified.

2

Tells the compiler to assume there are no loop-carried dependencies for all loop levels.
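
To make the loop levels concrete, here is a minimal sketch of a two-level loop nest with comments marking which loop each value of n covers. The function name add_rows and its parameters are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical kernel: adds each row of src into the matching row of dst.
   Both matrices are rows x cols, stored row by row. */
void add_rows(float *dst, const float *src, size_t rows, size_t cols) {
    assert(dst != NULL && src != NULL);
    /* Outer loop: covered by the no-dependency assumption only with n=2. */
    for (size_t i = 0; i < rows; i++) {
        /* Innermost loop: covered by the assumption with n=1 or n=2. */
        for (size_t j = 0; j < cols; j++) {
            dst[i * cols + j] += src[i * cols + j];
        }
    }
}
```

With n=0, neither loop is covered, and the compiler has to allow for dst and src overlapping.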

Default

[q or Q]opt-assume-no-loop-carried-dep=0

The compiler does not assume there are no loop-carried dependencies.

Description

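This option lets you set a level of performance tuning for loops by telling the compiler whether to assume that loops have no loop-carried dependencies.

It is useful for C/C++ applications and benchmarks where pointers and arguments could be aliased. When you specify level 1 or level 2, the compiler no longer treats possible aliasing as a dependency between loop iterations, so more loops can be vectorized or benefit from loop transformations.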

This option is applied to all loops in the file. It does not apply to code outside loops.
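
The sketch below shows why the compiler is conservative by default: the same loop is independent across iterations when its pointers are distinct, but carries a real dependency when they overlap. The names add_one and overlapping_call are hypothetical. The option states an assumption, so it should be reserved for files whose loops are never called this way.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical kernel: dst[i] = src[i] + 1 for i in [0, n). */
void add_one(float *dst, const float *src, size_t n) {
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; i++) {
        dst[i] = src[i] + 1.0f;
    }
}

/* Overlapping call: dst is buf + 1 and src is buf, so iteration i reads
   the element that iteration i - 1 just wrote. This is a loop-carried
   dependency that the function signature alone does not reveal. */
void overlapping_call(float *buf, size_t n) {
    assert(buf != NULL && n >= 1);
    add_one(buf + 1, buf, n - 1);
}
```

Executed in source order, overlapping_call on a zero-filled buffer of four elements leaves 0, 1, 2, 3 in the buffer. A vectorized loop that reads several src elements before storing any dst element would not reproduce that result, which is the kind of case the assumption rules out.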

IDE Equivalent

None

Alternate Options

None

Examples

By default, the following loop is not vectorized because the compiler must assume a possible data dependence: B[M[i]] may refer to an element of A that an earlier iteration has written. Specifying [q or Q]opt-assume-no-loop-carried-dep=1 tells the compiler to assume that no data dependence occurs in this loop, which allows the loop to be vectorized:

  void sub(float *A, float *B, int *M) {
    for (int i = 0; i < 10000; i++) {
      A[i] += B[M[i]] + 1;
    }
  }
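
As a rough illustration of a call that satisfies the assumption, the sketch below repeats the loop and drives it with three separate allocations, so no iteration reads a value that another iteration writes. The driver run_sub, its out-parameters, and the constant N are hypothetical additions for this example.

```c
#include <assert.h>
#include <stdlib.h>

enum { N = 10000 };  /* trip count used by the loop in the example above */

/* Loop from the example above. */
void sub(float *A, float *B, int *M) {
    for (int i = 0; i < N; i++) {
        A[i] += B[M[i]] + 1;
    }
}

/* Hypothetical driver: A, B, and M are distinct allocations, so the
   no-loop-carried-dependency assumption holds for this call.
   Returns 0 on success and -1 if an allocation fails. */
int run_sub(float *first, float *last) {
    assert(first != NULL && last != NULL);
    float *A = calloc(N, sizeof *A);   /* A starts at all zeros */
    float *B = malloc(N * sizeof *B);
    int *M = malloc(N * sizeof *M);
    if (A == NULL || B == NULL || M == NULL) {
        free(A);
        free(B);
        free(M);
        return -1;
    }
    for (int i = 0; i < N; i++) {
        B[i] = (float)i;
        M[i] = N - 1 - i;              /* in-range indices, reversed order */
    }
    sub(A, B, M);
    *first = A[0];                     /* 0 + B[N - 1] + 1 */
    *last = A[N - 1];                  /* 0 + B[0] + 1 */
    free(A);
    free(B);
    free(M);
    return 0;
}
```

Assuming the icx driver on Linux and a hypothetical file name sub.c, a compile line along the lines of icx -O2 -qopt-assume-no-loop-carried-dep=1 -c sub.c would apply the assumption to every loop in that file, including the initialization loop in the driver.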

By default, the following matrix multiply kernel is not optimized because the compiler must assume possible dependencies at every level of the loop nest. Specifying [q or Q]opt-assume-no-loop-carried-dep=2 allows loop transformations such as blocking, unroll and jam, and vectorization:

  void matmul(double *a, double *b, double *c) {
    int i, j, k;
    int n = 1024;
    for (i = 0; i < n; i++) {
      for (j = 0; j < n; j++) {
        for (k = 0; k < n; k++) {
          c[i * n + j] += a[i * n + k] * b[k * n + j];
        }
      }
    }
  }
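
To give a sense of what blocking means for a kernel like this, here is a hand-written blocked variant. It is only a sketch of the general technique, not the code the compiler generates. The function name matmul_blocked, the helper min_size, the size parameter n, and the tile size BLOCK are all assumptions made for the illustration.

```c
#include <assert.h>
#include <stddef.h>

enum { BLOCK = 32 };  /* hypothetical tile size, chosen only for illustration */

static size_t min_size(size_t x, size_t y) { return x < y ? x : y; }

/* Hand-blocked n x n matrix multiply: c += a * b, row-major storage.
   Valid only if c does not overlap a or b, which is the same assumption
   the option asks the compiler to make. */
void matmul_blocked(const double *a, const double *b, double *c, size_t n) {
    assert(a != NULL && b != NULL && c != NULL);
    /* Tile loops: walk the matrices in BLOCK x BLOCK pieces. */
    for (size_t ii = 0; ii < n; ii += BLOCK) {
        size_t i_end = min_size(ii + BLOCK, n);
        for (size_t kk = 0; kk < n; kk += BLOCK) {
            size_t k_end = min_size(kk + BLOCK, n);
            for (size_t jj = 0; jj < n; jj += BLOCK) {
                size_t j_end = min_size(jj + BLOCK, n);
                /* Work inside one tile, which is small enough to stay in cache. */
                for (size_t i = ii; i < i_end; i++) {
                    for (size_t k = kk; k < k_end; k++) {
                        /* Hoisting this load is safe only without aliasing. */
                        double a_ik = a[i * n + k];
                        for (size_t j = jj; j < j_end; j++) {
                            c[i * n + j] += a_ik * b[k * n + j];
                        }
                    }
                }
            }
        }
    }
}
```

Reordering the loops and hoisting the load of a[i * n + k] are legal only when writes to c cannot change a or b, which is why such transformations depend on the no-dependency assumption at all loop levels.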