unroll/nounroll
Tells the compiler to unroll or not to unroll a counted loop.
Syntax
#pragma unroll
#pragma unroll(n)
#pragma nounroll
Arguments
- n: The unrolling factor, that is, the number of times to unroll a loop. It must be an integer constant from 0 through 255.
Description
The unroll [n] pragma tells the compiler how many times to unroll a counted loop.
The unroll pragma must precede the for statement for each for loop it affects. If n is specified, the optimizer unrolls the loop n times. If n is omitted or if it is outside the allowed range, the optimizer assigns the number of times to unroll the loop.
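As a minimal sketch of the two unroll forms described above, the following places the pragma directly before each for statement. The function and parameter names are hypothetical and chosen only for illustration. Compilers that do not recognize the pragma typically ignore it, so the results are the same either way.

```c
#include <stddef.h>

/* Explicit factor: request that the loop be unrolled 4 times. */
void add_one_unroll4(const int *src, int *dst, size_t len) {
#pragma unroll(4)
    for (size_t i = 0; i < len; i++) {
        dst[i] = src[i] + 1;
    }
}

/* No factor: the optimizer chooses the number of times to unroll. */
void add_one_unroll_auto(const int *src, int *dst, size_t len) {
#pragma unroll
    for (size_t i = 0; i < len; i++) {
        dst[i] = src[i] + 1;
    }
}
```

Both functions compute the same result; the pragma only influences how the compiler generates code for the loop.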
This pragma is supported only when option O3 is set. The unroll pragma overrides any setting of loop unrolling from the command line.
The pragma can be applied to the innermost loop nest as well as to the outer loop nest. If applied to outer loop nests, the current implementation supports complete outer loop unrolling. The loops inside the loop nest are either not unrolled at all or completely unrolled. The compiler generates correct code by comparing n and the loop count.
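To illustrate why the unrolling factor has to be reconciled with the loop count, here is a hand-written sketch of what unrolling by 4 conceptually looks like. This is an assumption about the general shape of unrolled code, not the compiler's actual output, and the function name is hypothetical. The remainder loop handles trip counts that are not a multiple of 4.

```c
#include <stddef.h>

/* Hand-unrolled equivalent of: for (i = 0; i < len; i++) dst[i] = src[i] + 1; */
void add_one_unrolled_by_hand(const int *src, int *dst, size_t len) {
    size_t i = 0;

    /* Main loop: four original iterations per trip. */
    for (; i + 4 <= len; i += 4) {
        dst[i]     = src[i]     + 1;
        dst[i + 1] = src[i + 1] + 1;
        dst[i + 2] = src[i + 2] + 1;
        dst[i + 3] = src[i + 3] + 1;
    }

    /* Remainder loop: the 0 to 3 iterations left over. */
    for (; i < len; i++) {
        dst[i] = src[i] + 1;
    }
}
```

The unrolled version has fewer loop-control branches per element, at the cost of a larger loop body.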
When unrolling a loop increases register pressure and code size, it may be necessary to prevent unrolling of the loop. In such cases, use the nounroll pragma. The nounroll pragma instructs the compiler not to unroll a specified loop.
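A minimal sketch of nounroll, using a hypothetical summation function: the pragma is placed immediately before the for statement it applies to, in the same position as unroll.

```c
#include <stddef.h>

/* Ask the compiler to leave this loop rolled, for example to limit code size. */
long sum_no_unroll(const int *values, size_t len) {
    long total = 0;
#pragma nounroll
    for (size_t i = 0; i < len; i++) {
        total += values[i];
    }
    return total;
}
```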
The unroll and nounroll pragmas are supported in both host and device code.
Target device support: CPU, GPU, FPGA.
Examples
Use the unroll pragma for innermost loop unrolling:
void unroll(int a[], int b[], int c[], int d[]) {
  #pragma unroll(4)
  for (int i = 1; i < 100; i++) {
    b[i] = a[i] + 1;
    d[i] = c[i] + 1;
  }
}
Use the unroll pragma for outer loop unrolling:
int m = 0;
int dir[4] = {1, 2, 3, 4};
int data[10];
#pragma unroll(4) // outer loop unrolling
for (int i = 0; i < 4; i++) {
  for (int j = dir[i]; data[j] == N; j += dir[i])
    m++;
}
When you place the unroll pragma before the first for loop, it causes the compiler to unroll the outer loop completely. If an unroll pragma is placed before the inner for loop as well as before the outer for loop, the compiler ignores the inner for loop unroll pragma. If the unroll pragma is placed only before the innermost loop, the compiler unrolls the innermost loop by the specified factor or, if none is given, by a factor that it chooses.
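The two placements discussed above can be sketched in a self-contained form as follows. The grid dimensions and function names are hypothetical, and the behavior noted in the comments is the one this section describes; other compilers may treat the pragma differently or ignore it.

```c
#define ROWS 4
#define COLS 8

/* Pragma before the outer loop: the outer loop is unrolled completely. */
int sum_outer_unrolled(int grid[ROWS][COLS]) {
    int total = 0;
#pragma unroll(4)
    for (int r = 0; r < ROWS; r++) {
        for (int c = 0; c < COLS; c++) {
            total += grid[r][c];
        }
    }
    return total;
}

/* Pragma only before the innermost loop: only that loop is unrolled. */
int sum_inner_unrolled(int grid[ROWS][COLS]) {
    int total = 0;
    for (int r = 0; r < ROWS; r++) {
#pragma unroll(2)
        for (int c = 0; c < COLS; c++) {
            total += grid[r][c];
        }
    }
    return total;
}
```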