Intel® C++ Compiler Classic Developer Guide and Reference

ID 767249
Date 7/13/2023
Public

par-schedule, Qpar-schedule

Lets you specify a scheduling algorithm for loop iterations.

Syntax

Linux:

-par-schedule-keyword[=n]

macOS:

-par-schedule-keyword[=n]

Windows:

/Qpar-schedule-keyword[[:]n]

Arguments

keyword

Specifies the scheduling algorithm or tuning method. Possible values are:

auto

Lets the compiler or run-time system determine the scheduling algorithm.

static

Divides iterations into contiguous pieces.

static-balanced

Divides iterations into even-sized chunks.

static-steal

Divides iterations into even-sized chunks, but allows threads to steal parts of chunks from neighboring threads.

dynamic

Assigns chunks of iterations to threads on request, as each thread finishes its previous chunk.

guided

Assigns iterations to threads on request in chunks of decreasing size; n sets the minimum chunk size.

guided-analytical

Divides iterations by using exponential distribution or dynamic distribution.

runtime

Defers the scheduling decision until run time.

n

Is the chunk size, that is, the number of iterations in each chunk. This setting can be specified only for static, dynamic, and guided. For more information, see the descriptions of each keyword below.

Default

static-balanced

Iterations are divided into even-sized chunks and the chunks are assigned to the threads in the team in a round-robin fashion in the order of the thread number.

Description

This option specifies the scheduling algorithm for loop iterations, that is, how iterations are divided among the threads of the team.

This option is useful only when specified together with option [Q]parallel.

This option affects performance tuning and can provide better performance during auto-parallelization. It has no effect when used with option [q or Q]openmp.
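
As an illustration of where the option applies, the following sketch shows a loop with independent iterations, which is the kind of loop auto-parallelization targets. The function name saxpy, the file name, and the icpc build line in the comment are assumptions made for this example; whether the compiler actually parallelizes the loop depends on its own analysis.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical kernel: each iteration touches only element i, so the
// loop is a candidate for auto-parallelization.
// Assumed build line (Linux, classic driver), using the syntax above:
//   icpc -parallel -par-schedule-dynamic=4 saxpy.cpp
void saxpy(double a, const std::vector<double>& x, std::vector<double>& y) {
    assert(x.size() == y.size());
    const std::size_t n = x.size();
    for (std::size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

The scheduling keyword changes only how the iterations of such a loop are distributed among threads, not the result the loop computes.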

The following list describes each form of the option.

[Q]par-schedule-auto

Lets the compiler or run-time system determine the scheduling algorithm. Any possible mapping may occur for iterations to threads in the team.

[Q]par-schedule-static

Divides iterations into contiguous pieces (chunks) of size n. The chunks are assigned to threads in the team in a round-robin fashion in the order of the thread number. Note that the last chunk to be assigned may have a smaller number of iterations.

If no n is specified, the iteration space is divided into chunks that are approximately equal in size, and each thread is assigned at most one chunk.
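
The round-robin assignment for a given chunk size n can be sketched as a small mapping from iteration index to thread number. The helper name static_owner is hypothetical, and the code only models the description above; it is not the run-time library's implementation.

```cpp
#include <cassert>
#include <cstddef>

// Thread that owns iteration i when chunks of n iterations are handed
// out round-robin in thread-number order (a model, not the real runtime).
std::size_t static_owner(std::size_t i, std::size_t n, std::size_t num_threads) {
    assert(n > 0 && num_threads > 0);
    const std::size_t chunk_index = i / n;  // which chunk holds iteration i
    return chunk_index % num_threads;       // chunks cycle through the team
}
```

For example, with 10 iterations, n = 3, and 2 threads, iterations 0 to 2 and 6 to 8 map to thread 0, iterations 3 to 5 map to thread 1, and the short last chunk (iteration 9) also maps to thread 1.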

[Q]par-schedule-static-balanced

Divides iterations into even-sized chunks. The chunks are assigned to the threads in the team in a round-robin fashion in the order of the thread number.
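
A minimal sketch of an even-sized division follows. The helper name balanced_range is hypothetical, and giving the leftover iterations to the lowest-numbered threads is an assumption of this example; the text above does not say how a remainder is distributed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

// Half-open range [begin, end) of iterations for thread t when `total`
// iterations are split as evenly as possible among num_threads threads.
// Assumption: the first (total % num_threads) threads get one extra iteration.
std::pair<std::size_t, std::size_t>
balanced_range(std::size_t total, std::size_t num_threads, std::size_t t) {
    assert(num_threads > 0 && t < num_threads);
    const std::size_t base = total / num_threads;
    const std::size_t extra = total % num_threads;
    const std::size_t begin = t * base + std::min(t, extra);
    const std::size_t length = base + (t < extra ? 1 : 0);
    return {begin, begin + length};
}
```

With 10 iterations and 4 threads, this model yields chunk sizes 3, 3, 2, and 2.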

[Q]par-schedule-static-steal

Divides iterations into even-sized chunks, but when a thread completes its chunk, it can steal parts of chunks assigned to neighboring threads.

Each thread keeps track of L and U, which represent the lower and upper bounds of its chunks, respectively. Iterations are executed starting from the lower bound, and L is updated at the same time to represent the new lower bound.
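
The L and U bookkeeping can be sketched as follows. The type name StealRange, the use of a mutex, the exclusive upper bound, and the policy of stealing the upper half of the remaining iterations are all assumptions of this example; the text above does not specify how much is stolen or how the run-time library synchronizes.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>

// Per-thread range of iterations [L, U). Hypothetical model only.
struct StealRange {
    std::mutex m;
    std::size_t L = 0;  // next iteration the owner will execute
    std::size_t U = 0;  // one past the last iteration still owned

    // Owner path: take the iteration at the lower bound and advance L.
    bool next(std::size_t& i) {
        std::lock_guard<std::mutex> lock(m);
        if (L >= U) return false;
        i = L++;
        return true;
    }

    // Thief path (assumed policy): take the upper half of what remains,
    // returned as [lo, hi), and shrink the owner's range to match.
    bool steal_half(std::size_t& lo, std::size_t& hi) {
        std::lock_guard<std::mutex> lock(m);
        assert(L <= U);
        const std::size_t remaining = U - L;
        if (remaining < 2) return false;
        lo = L + remaining / 2;
        hi = U;
        U = lo;
        return true;
    }
};
```

In this model the owner works upward from L while a thief removes iterations from the U end, so the two never execute the same iteration.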

[Q]par-schedule-dynamic

Assigns iterations to threads in chunks as the threads request them. Each thread executes its chunk of iterations, then requests another chunk, until no chunks remain to be assigned.

Each chunk contains n iterations, except for the last chunk to be assigned, which may have fewer iterations. If no n is specified, the default is 1.
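
The request-a-chunk pattern of dynamic scheduling can be sketched with a shared atomic counter. The class name ChunkDispenser is hypothetical, and a single atomic counter is an assumption of this example rather than a description of the run-time library.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>

// Hands out chunks of n iterations from [0, total) on request.
class ChunkDispenser {
public:
    ChunkDispenser(std::size_t total, std::size_t n)
        : total_(total), n_(n), next_(0) {
        assert(n > 0);
    }

    // Claims the next chunk as [begin, end); returns false when no
    // chunks remain. Safe to call from several threads at once.
    bool next_chunk(std::size_t& begin, std::size_t& end) {
        begin = next_.fetch_add(n_);
        if (begin >= total_) return false;
        end = std::min(begin + n_, total_);  // last chunk may be shorter
        return true;
    }

private:
    std::size_t total_;
    std::size_t n_;
    std::atomic<std::size_t> next_;
};
```

Each worker thread would call next_chunk in a loop and execute the returned range until the call returns false. With 10 iterations and n = 4, the chunks are [0, 4), [4, 8), and [8, 10).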

[Q]par-schedule-guided

Assigns iterations to threads in chunks of decreasing size as the threads request them; n sets the minimum chunk size. Each thread executes its chunk of iterations, then requests another chunk, until no chunks remain to be assigned.

For an n with value 1, the size of each chunk is proportional to the number of unassigned iterations divided by the number of threads, decreasing to 1.

For an n with value k (greater than 1), the size of each chunk is determined in the same way, with the restriction that the chunks do not contain fewer than k iterations (except for the last chunk to be assigned, which may have fewer than k iterations). If no n is specified, the default is 1.
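
The decreasing chunk sizes can be illustrated with a short calculation. The helper name guided_chunks is hypothetical, and using a proportionality factor of exactly 1 with rounding up is an assumption of this example; the text above says only that the size is proportional to the unassigned iterations divided by the number of threads.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Sequence of chunk sizes handed out for `total` iterations.
// Assumption: chunk = ceil(remaining / num_threads), but never below k
// (except for the final chunk, which takes whatever is left).
std::vector<std::size_t> guided_chunks(std::size_t total,
                                       std::size_t num_threads,
                                       std::size_t k = 1) {
    assert(num_threads > 0 && k > 0);
    std::vector<std::size_t> sizes;
    std::size_t remaining = total;
    while (remaining > 0) {
        std::size_t chunk = (remaining + num_threads - 1) / num_threads;
        chunk = std::max(chunk, k);          // enforce the minimum size
        chunk = std::min(chunk, remaining);  // last chunk may be smaller
        sizes.push_back(chunk);
        remaining -= chunk;
    }
    return sizes;
}
```

Under these assumptions, 20 iterations on 4 threads give chunk sizes 5, 4, 3, 2, 2, 1, 1, 1, 1 when k is 1, and 5, 4, 3, 3, 3, 2 when k is 3.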

[Q]par-schedule-guided-analytical

Divides iterations by using either an exponential distribution or a dynamic distribution; which method is used depends on the run-time implementation. Loop bounds are calculated with faster synchronization and chunks are dynamically dispatched at run time by threads in the team.

[Q]par-schedule-runtime

Defers the scheduling decision until run time. The scheduling algorithm and chunk size are then taken from the OMP_SCHEDULE environment variable.
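
To show what a run-time lookup of this kind involves, the following sketch splits a value of the form kind[,chunk] (for example, dynamic,4) into its two parts. The names Schedule, parse_schedule, and schedule_from_env are hypothetical, and the parser ignores modifiers, whitespace, and letter case; it does not reproduce how the run-time library reads the variable.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>

// Parsed form of a "kind[,chunk]" value. chunk == 0 means "not given".
struct Schedule {
    std::string kind;
    unsigned long chunk;
};

// Splits text at the first comma; everything after it is read as the
// chunk size in base 10.
Schedule parse_schedule(const std::string& text) {
    Schedule s{text, 0};
    const std::size_t comma = text.find(',');
    if (comma != std::string::npos) {
        s.kind = text.substr(0, comma);
        s.chunk = std::strtoul(text.c_str() + comma + 1, nullptr, 10);
    }
    return s;
}

// Reads OMP_SCHEDULE from the environment; an unset variable yields an
// empty kind and chunk 0.
Schedule schedule_from_env() {
    const char* value = std::getenv("OMP_SCHEDULE");
    return parse_schedule(value ? value : "");
}
```

For instance, setting OMP_SCHEDULE to guided,8 before starting the program would make schedule_from_env return kind guided with chunk 8 in this model.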

NOTE:

This option may behave differently on Intel® microprocessors than on non-Intel microprocessors.

IDE Equivalent

None

Alternate Options

None