Intel® C++ Compiler Classic Developer Guide and Reference

ID 767249
Date 12/16/2022
Public

parallel/noparallel

Resolves dependencies to facilitate auto-parallelization of the immediately following loop (parallel) or prevents auto-parallelization of the immediately following loop (noparallel).

Syntax

#pragma parallel [clause[ [,]clause]...]

#pragma noparallel

Arguments

clause

Can be any of the following:

always [assert]

Overrides compiler heuristics that estimate whether parallelizing a loop would increase performance. Using this clause on a loop that the compiler finds to be parallelizable tells the compiler to parallelize the loop even if doing so might not improve performance.

If assert is added, the compiler generates an error-level message when its efficiency heuristics indicate that the loop cannot be parallelized.
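As an illustration of the always clause, here is a minimal sketch. The function name scale_in_place and its parameters are hypothetical and not part of the compiler documentation. Compilers that do not recognize the pragma ignore it, and the loop then runs serially with the same result.

```c
/* Hypothetical helper: multiplies the first n elements of v by factor. */
void scale_in_place(double *v, int n, double factor) {
  int i;
  /* always: parallelize this loop if it is parallelizable, even when the
     compiler's cost heuristics estimate that doing so will not pay off.
     Writing "always assert" instead would additionally request an
     error-level message if the loop is not parallelized. */
  #pragma parallel always
  for (i = 0; i < n; i++) {
    v[i] *= factor;
  }
}
```

Each iteration touches only v[i], so the loop has no cross-iteration dependencies for the pragma to override.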

firstprivate ( var [ :expr ] ... )

Provides a superset of the functionality provided by the private clause. Variables that appear in a firstprivate list are subject to private clause semantics. In addition, the initial value of each variable is broadcast to all of its private instances upon entering the parallel loop.

lastprivate ( var [ :expr ] ... )

Provides a superset of the functionality provided by the private clause. Variables that appear in a lastprivate list are subject to private clause semantics. In addition, upon exiting the parallel loop, each variable has the value that results from the sequentially last iteration of the loop.

num_threads (n)

Parallelizes the loop across n threads, where n is an integer.
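A short sketch of the num_threads clause follows. The function add_arrays and the thread count of 4 are assumptions chosen for illustration only.

```c
/* Hypothetical helper: out[i] = a[i] + b[i] for i in [0, n).
   The caller is assumed to pass arrays that do not overlap. */
void add_arrays(const double *a, const double *b, double *out, int n) {
  int i;
  /* Ask the auto-parallelizer to spread the loop across 4 threads. */
  #pragma parallel num_threads(4)
  for (i = 0; i < n; i++) {
    out[i] = a[i] + b[i];
  }
}
```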

private ( var [ :expr ] ... )

Specifies a list of scalar and array variables (var) to privatize. An array or pointer variable can take an optional argument (expr) which is an int32 or int64 expression denoting the number of array elements to privatize.

Like the private clause, the firstprivate and lastprivate clauses each take a list of scalar and array variables (var) to privatize, and an array or pointer variable in either list can take the same optional expr argument giving the number of array elements to privatize.

The same var is not allowed to appear in both the private and the lastprivate clauses for the same loop.

The same var is not allowed to appear in both the private and the firstprivate clauses for the same loop.

When expr is absent, the rules on var are the same as with OpenMP. The rules to be observed are as follows:

  • var must not be part of another variable (as an array or structure element)

  • var must not have a const-qualified type unless it is of class type with a mutable member

  • var must not have an incomplete type or a reference type

  • If var is of class type (or array thereof), then it requires an accessible, unambiguous default constructor for the class type. Furthermore, if this var is in a lastprivate clause, then it also requires an accessible, unambiguous copy assignment operator for the class type.

When expr is present, the same rules apply, but var must be an array or a pointer variable.

  • If var is an array, then only its first expr elements are privatized. Without expr, the entire array is privatized.

  • If var is a pointer, then the first expr elements are privatized (element size given by the pointer’s target type). Without expr, only the pointer variable itself is privatized.

  • Program behavior is undefined if expr evaluates to a non-positive value, or if it exceeds the array size.
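The privatization clauses can be sketched together as follows. This is an illustrative example under stated assumptions, not code from the compiler documentation: the function biased_row_sums, its parameters, and the constants SCRATCH_LEN and USED_LEN are all hypothetical. The scratch array has 8 elements, but only the first 4 are used inside the loop, so the expr argument limits privatization to those 4.

```c
enum { SCRATCH_LEN = 8, USED_LEN = 4 };

/* Hypothetical helper: treats in as a rows x USED_LEN matrix, adds bias to
   every element, stores each row sum in out, and returns the last row sum. */
int biased_row_sums(const int *in, int *out, int rows, int bias) {
  int scratch[SCRATCH_LEN];
  int last = 0;
  int i;

  /* private(scratch:USED_LEN): only the first USED_LEN elements get a
     private copy; USED_LEN is positive and does not exceed the array size.
     firstprivate(bias): every private copy starts with the caller's value.
     lastprivate(last): after the loop, last holds the value assigned in the
     sequentially last iteration (i == rows - 1). */
  #pragma parallel private(scratch:USED_LEN) firstprivate(bias) lastprivate(last)
  for (i = 0; i < rows; i++) {
    int j;        /* declared in the loop body, so local to each iteration */
    int sum = 0;
    for (j = 0; j < USED_LEN; j++) {
      scratch[j] = in[i * USED_LEN + j] + bias;  /* written before it is read */
    }
    for (j = 0; j < USED_LEN; j++) {
      sum += scratch[j];
    }
    out[i] = sum;
    last = sum;
  }
  return last;
}
```

Because scratch is fully written before it is read in each iteration and last is assigned in every iteration, the privatized loop and the serial loop produce the same results.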

Description

The parallel pragma instructs the compiler to ignore assumed (unproven) dependencies that would otherwise prevent it from parallelizing the immediately following loop. Dependencies that the compiler can prove are not ignored.

The noparallel pragma prevents autoparallelization of the immediately following loop.
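A minimal sketch of noparallel, assuming a short loop that is better left serial (for example, because it has very few iterations). The function clear_header is hypothetical.

```c
/* Hypothetical helper: zeroes the first n bytes of hdr. */
void clear_header(unsigned char *hdr, int n) {
  int i;
  /* Keep this loop serial even when auto-parallelization is enabled. */
  #pragma noparallel
  for (i = 0; i < n; i++) {
    hdr[i] = 0;
  }
}
```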

These pragmas take effect only if autoparallelization is enabled by the [Q]parallel compiler option. Using this option enables parallelization for both Intel® microprocessors and non-Intel microprocessors. The resulting executable may gain more performance on Intel® microprocessors than on non-Intel microprocessors. Parallelization can also be affected by certain other options, such as the arch, m, or [Q]x compiler options.

CAUTION:

Use the parallel pragma with care. If a loop has cross-iteration dependencies, annotating it with this pragma can lead to incorrect program behavior.

Only use the parallel pragma if it is known that parallelizing the annotated loop will improve its performance.
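The following sketch shows the kind of loop that should not be annotated. The functions add_one and fill_ramp are hypothetical. The loop looks independent, but one caller passes overlapping pointers, which creates a cross-iteration dependency that the compiler can only assume, not prove.

```c
/* Hypothetical helper: dst[i] = src[i] + 1.0 for i in [0, n). */
void add_one(double *dst, const double *src, int n) {
  int i;
  /* Deliberately no "#pragma parallel" here: the compiler cannot prove that
     dst and src never overlap, and the pragma would tell it to ignore that
     possibility. */
  for (i = 0; i < n; i++) {
    dst[i] = src[i] + 1.0;
  }
}

/* Hypothetical caller: builds a[0], a[0]+1, a[0]+2, ... by overlapping the
   arguments, so iteration i reads the element that iteration i-1 wrote. */
void fill_ramp(double *a, int n) {
  add_one(a + 1, a, n - 1);
}
```

Run serially, fill_ramp produces an increasing sequence. If the loop in add_one were forced to run in parallel, iterations could read elements before earlier iterations had written them.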

Examples

This example shows how to use the parallel pragma:

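void example(double *A, double *B, double *C, double *D) {
  int i;
  /* The compiler cannot prove that A, B, C, and D do not overlap. The pragma
     tells it to ignore that assumed dependency, so the caller must pass
     arrays that do not create cross-iteration dependencies. */
  #pragma parallel
  for (i=0; i<10000; i++) {
    A[i] += B[i] + C[i];
    C[i] += A[i] + D[i];
  }
}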