block_loop/noblock_loop
Enables or disables loop blocking for the immediately following nested loops. block_loop enables loop blocking for the nested loops. noblock_loop disables loop blocking for the nested loops.
Syntax
#pragma block_loop [clause[, clause]...]
#pragma noblock_loop
Arguments
- clause
- Can be any of the following:
- factor(expr)
- expr is a positive scalar constant integer expression representing the blocking factor for the specified loops. This clause is optional. If the factor clause is not present, the blocking factor is determined based on processor type and memory access patterns and is applied to the specified levels in the nested loop following the pragma. At most one factor clause can appear in a block_loop pragma.
- level(level_expr[,level_expr]... )
- level_expr is specified in the form const1 or const1:const2, where const1 is a positive integer constant m <= 8 representing the loop at level m, and the immediately following loop is level 1. const2 is a positive integer constant n <= 8 representing the loop at level n, where n > m. const1:const2 represents the nested loops from level const1 through level const2.
The clauses can be specified in any order. If you do not specify any clause, the compiler chooses the best blocking factor to apply to all levels of the immediately following nested loop.
Description
The block_loop pragma lets you exert greater control over optimizations on a specific loop inside a nested loop.
Using a technique called loop blocking, the block_loop pragma splits loops with large iteration counts into smaller groups of iterations. Executing these smaller groups can make more efficient use of cache space and improve performance.
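The idea of loop blocking can be sketched by hand, independently of the pragma. The following is a minimal illustration, not the compiler's actual transformation: the matrix size N, the blocking factor BLOCK, and all function names are assumptions chosen for the example. It transposes a matrix once with a plain loop nest and once with a hand-blocked nest, then checks that both produce the same result.

```c
#include <assert.h>

#define N 100     /* illustrative matrix size; deliberately not a multiple of BLOCK */
#define BLOCK 16  /* hypothetical blocking factor, chosen only for illustration */

static int min_int(int x, int y) { return x < y ? x : y; }

/* Plain transpose: dst[j][i] = src[i][j]. */
void transpose_plain(int src[N][N], int dst[N][N])
{
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            dst[j][i] = src[i][j];
}

/* Hand-blocked transpose: the two outer loops step from tile to tile,
   the two inner loops visit the elements of one BLOCK x BLOCK tile.
   min_int() clips the last, partial tile at the matrix edge. */
void transpose_blocked(int src[N][N], int dst[N][N])
{
    for (int ii = 0; ii < N; ii += BLOCK)
        for (int jj = 0; jj < N; jj += BLOCK)
            for (int i = ii; i < min_int(ii + BLOCK, N); i++)
                for (int j = jj; j < min_int(jj + BLOCK, N); j++)
                    dst[j][i] = src[i][j];
}

/* Returns 1 when the plain and blocked versions agree element by element. */
int transpose_versions_agree(void)
{
    static int src[N][N], plain[N][N], blocked[N][N];

    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            src[i][j] = i * N + j;

    transpose_plain(src, plain);
    transpose_blocked(src, blocked);

    /* Spot-check that the reference really is a transpose. */
    assert(plain[3][7] == src[7][3]);

    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            if (plain[i][j] != blocked[i][j])
                return 0;
    return 1;
}
```

Both versions execute exactly the same assignments; only the order changes, so that each tile of src and dst is finished before the next one is touched. Whether this helps in practice depends on the cache sizes and the access pattern.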
If neither a level clause nor a factor clause is present, the blocking factor is determined based on the processor type and memory access patterns, and it applies to all levels of the nested loops following this pragma.
You can use the noblock_loop pragma to tune performance by disabling loop blocking for nested loops.
Loop-carried dependences are ignored during the processing of block_loop pragmas.
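One reading of this statement is that it is up to you to make sure blocking is safe for the loop nest. The sketch below shows, with hand-written blocking rather than the pragma itself, how reordering iterations can change the result when a loop-carried dependence is present. The array size DIM, the tile size TILE, and the function names are assumptions made for the illustration.

```c
#include <assert.h>
#include <string.h>

#define DIM 8   /* illustrative array size */
#define TILE 4  /* hypothetical blocking factor */

static int min_int(int x, int y) { return x < y ? x : y; }

/* Each element reads the element one row up and one column to the right,
   which an earlier iteration of the i loop wrote: a loop-carried dependence. */
void sweep_plain(int a[DIM][DIM])
{
    for (int i = 1; i < DIM; i++)
        for (int j = 0; j < DIM - 1; j++)
            a[i][j] = a[i - 1][j + 1] + 1;
}

/* The same statement, hand-blocked over both loops. Within a tile, row i
   may read a[i - 1][j + 1] from the tile to its right before that tile
   has been processed, so it sees a stale value. */
void sweep_blocked(int a[DIM][DIM])
{
    for (int ii = 1; ii < DIM; ii += TILE)
        for (int jj = 0; jj < DIM - 1; jj += TILE)
            for (int i = ii; i < min_int(ii + TILE, DIM); i++)
                for (int j = jj; j < min_int(jj + TILE, DIM - 1); j++)
                    a[i][j] = a[i - 1][j + 1] + 1;
}

/* Runs both versions on zero-initialized arrays and returns the number
   of elements on which they disagree. */
int count_mismatches(void)
{
    int plain[DIM][DIM], blocked[DIM][DIM];
    int mismatches = 0;

    memset(plain, 0, sizeof plain);
    memset(blocked, 0, sizeof blocked);
    sweep_plain(plain);
    sweep_blocked(blocked);

    /* In the reference order, a[1][4] is already 1 when a[2][3] is computed. */
    assert(plain[2][3] == 2);

    for (int i = 0; i < DIM; i++)
        for (int j = 0; j < DIM; j++)
            if (plain[i][j] != blocked[i][j])
                mismatches++;
    return mismatches;
}
```

Here count_mismatches() returns a nonzero value: in the blocked order, a[2][3] is computed while a[1][4] still holds its initial value. A loop nest like this one is not a safe candidate for blocking.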
The block_loop pragma is supported in host code only.
#pragma block_loop factor(256) level(1) /* applies blocking factor 256 to the
                                           top-level loop in the following nested loop */
#pragma block_loop factor(512) level(2) /* and blocking factor 512 to the
                                           2nd-level (1st nested) loop */
#pragma block_loop factor(256) level(2)
#pragma block_loop factor(512) level(1) /* levels can be specified in any order */
#pragma block_loop factor(256) level(1:2) /* adjacent loops can be specified as a range */
#pragma block_loop factor(256) /* the blocking factor applies to all levels
of loop nest */
#pragma block_loop /* the blocking factor will be determined based on
processor type and memory access patterns and will
be applied to all the levels in the nested loop
following the directive */
#pragma noblock_loop /* None of the levels in the nested loop following this
directive will have a blocking factor applied */
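As a sketch of how one of these forms might be attached to an actual loop nest, the following places a block_loop pragma directly before a three-level matrix multiplication and requests blocking for the two outer loops. The matrix dimension SIZE, the factor 64, and the function names are assumptions for illustration. A compiler that does not implement block_loop is expected to ignore the unrecognized pragma, possibly with a warning, so the code computes the same result either way.

```c
#include <assert.h>

#define SIZE 96  /* illustrative matrix dimension */

/* c += a * b, with blocking requested for the i loop (level 1)
   and the j loop (level 2). The k loop (level 3) is left alone. */
void multiply_with_pragma(int a[SIZE][SIZE], int b[SIZE][SIZE], long c[SIZE][SIZE])
{
#pragma block_loop factor(64) level(1:2)
    for (int i = 0; i < SIZE; i++)
        for (int j = 0; j < SIZE; j++)
            for (int k = 0; k < SIZE; k++)
                c[i][j] += (long)a[i][k] * b[k][j];
}

/* Multiplies a test matrix by the identity and returns 1 when the
   product equals the test matrix, as it must. */
int multiply_matches_identity(void)
{
    static int a[SIZE][SIZE], b[SIZE][SIZE];
    static long c[SIZE][SIZE];

    for (int i = 0; i < SIZE; i++)
        for (int j = 0; j < SIZE; j++) {
            a[i][j] = i + j;
            b[i][j] = (i == j);  /* identity matrix */
            c[i][j] = 0;
        }

    multiply_with_pragma(a, b, c);

    /* Spot-check one element before the full comparison. */
    assert(c[5][9] == 14);

    for (int i = 0; i < SIZE; i++)
        for (int j = 0; j < SIZE; j++)
            if (c[i][j] != a[i][j])
                return 0;
    return 1;
}
```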
Consider the following:
#pragma block_loop factor(256) level(1:2)
for (j = 1; j < n; j++) {
  f = 0;
  for (i = 1; i < n; i++) {
    f = f + a[i] * b[i];
  }
  c[j] = c[j] + f;
}
The above code produces the following result after loop blocking:
/* min(x, y) denotes the smaller of x and y; upper bounds are inclusive */
for (jj = 1; jj <= n/256 + 1; jj++) {
  for (ii = 1; ii <= n/256 + 1; ii++) {
    for (j = (jj-1)*256 + 1; j <= min(jj*256, n-1); j++) {
      f = 0;
      for (i = (ii-1)*256 + 1; i <= min(ii*256, n-1); i++) {
        f = f + a[i] * b[i];
      }
      c[j] = c[j] + f;
    }
  }
}
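The equivalence of the two loop nests can be checked with a small self-contained sketch. It is an illustration under stated assumptions, not the code the compiler generates: the function names, the test data, and the MAX_N limit are all hypothetical, and the blocked nest is written with inclusive upper bounds clipped at n-1 so that it visits exactly the iterations of the original nest. Integer data is used so that the comparison is exact.

```c
#include <assert.h>

#define FACTOR 256  /* blocking factor from the example pragma */

static int min_int(int x, int y) { return x < y ? x : y; }

/* Original nest: every c[j] is increased by the sum of a[i] * b[i]. */
void nest_original(int n, const int *a, const int *b, long *c)
{
    for (int j = 1; j < n; j++) {
        long f = 0;
        for (int i = 1; i < n; i++)
            f = f + (long)a[i] * b[i];
        c[j] = c[j] + f;
    }
}

/* Hand-blocked nest: jj and ii number the blocks, j and i walk one block.
   Each (ii, j) pair adds one partial sum to c[j]. */
void nest_blocked(int n, const int *a, const int *b, long *c)
{
    for (int jj = 1; jj <= n / FACTOR + 1; jj++)
        for (int ii = 1; ii <= n / FACTOR + 1; ii++)
            for (int j = (jj - 1) * FACTOR + 1; j <= min_int(jj * FACTOR, n - 1); j++) {
                long f = 0;
                for (int i = (ii - 1) * FACTOR + 1; i <= min_int(ii * FACTOR, n - 1); i++)
                    f = f + (long)a[i] * b[i];
                c[j] = c[j] + f;
            }
}

/* Returns 1 when both versions leave identical contents in c. */
int nests_agree(int n)
{
    enum { MAX_N = 1000 };  /* hypothetical upper limit for this check */
    static int a[MAX_N], b[MAX_N];
    static long c_ref[MAX_N], c_blk[MAX_N];

    assert(n >= 1 && n <= MAX_N);
    for (int k = 0; k < n; k++) {
        a[k] = k % 7 - 3;
        b[k] = k % 5 + 1;
        c_ref[k] = c_blk[k] = k;
    }

    nest_original(n, a, b, c_ref);
    nest_blocked(n, a, b, c_blk);

    for (int k = 0; k < n; k++)
        if (c_ref[k] != c_blk[k])
            return 0;
    return 1;
}
```

Calling nests_agree() with sizes on both sides of a block boundary, such as 256, 257, and 600, exercises full and partial blocks. With floating-point data the blocked version adds the partial sums in a different order, so results may differ in the last bits.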