Intel® Fortran Compiler

Developer Guide and Reference

ID 767251
Date 6/30/2025
Public

STRIPE Directive for OpenMP

OpenMP* Fortran Compiler Directive: Suggests that the compiler replace each of the outermost loops of the associated loop nest with two loops: one that has unit stride and one that has the specified stride.

Syntax

!$OMP STRIPE clause

   loop-nest

[!$OMP END STRIPE]

clause

Is SIZES (size-list). The clause is required, and it may appear only once.

size-list is a list of positive integer expressions (s1, …, sn). If n is the number of integer expressions in size-list, the depth of the loop nest must be at least n. The STRIPE construct replaces the outer n loops with 2n perfectly nested loops.

loop-nest

Is a nested set of DO loops or a single DO loop in canonical form. The loops don't need to have unit stride.

Description

STRIPE is a pure directive; it can appear in a Fortran PURE procedure.

Striping of loops can provide enhanced performance in offloaded code.

If n is the number of size expressions in the SIZES clause size-list, the STRIPE construct applies to the outer n loops in loop-nest. The n loops l1, …, ln affected by the transformation must be perfectly nested and rectangular (no loop bounds may depend on the loop control variable of an outer loop).

The n affected loops are replaced by 2n perfectly nested canonical loops. The outer n loops, o1, …, on, are called the offsetting loops, and the inner n loops, g1, …, gn, are called the grid loops. The value of the expression si specifies the stride of the grid loop gi. The generated loops are ordered o1, …, on, g1, …, gn.
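
The ordering of offsetting and grid loops can be sketched for a single loop with plain DO loops and no directive. This is a hand-written illustration, not compiler output; the bounds and the stripe size of 4 are assumed values chosen for the example.

```fortran
PROGRAM stripe_order_sketch
  IMPLICIT NONE
  INTEGER, PARAMETER :: i_min = 1, i_max = 10   ! assumed loop bounds
  INTEGER, PARAMETER :: size_i = 4              ! assumed stripe size
  INTEGER :: visits(i_min:i_max)
  INTEGER :: i, ii

  visits = 0
  DO ii = 0, size_i - 1                 ! offsetting loop o1 (unit stride)
    DO i = i_min + ii, i_max, size_i    ! grid loop g1 (stride size_i)
      visits(i) = visits(i) + 1
      WRITE (*, '(I0, 1X)', ADVANCE='NO') i
    END DO
  END DO
  WRITE (*, *)

  ! Every original iteration must be executed exactly once.
  IF (ANY(visits /= 1)) ERROR STOP 'an iteration was skipped or repeated'
END PROGRAM stripe_order_sketch
```

With these values the iterations run in the order 1 5 9 2 6 10 3 7 4 8: each pass of the offsetting loop walks the index range with stride 4, starting one element later than the previous pass.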

STRIPE constructs can be nested. If two STRIPE constructs are nested, the result is as if the outer STRIPE construct is applied to the resulting transformed loop nest created by the inner STRIPE construct.
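
One way to read the nesting rule is to expand it by hand for a single loop. The sketch below assumes an inner construct with SIZES (4) and an outer construct with SIZES (2); both sizes, the bounds, and the loop variable names are hypothetical. The inner construct yields an offsetting loop and a grid loop, and the outer construct then stripes the outermost loop of that result.

```fortran
PROGRAM nested_stripe_sketch
  IMPLICIT NONE
  INTEGER, PARAMETER :: i_min = 1, i_max = 8   ! assumed loop bounds
  INTEGER, PARAMETER :: inner_size = 4         ! assumed inner SIZES value
  INTEGER, PARAMETER :: outer_size = 2         ! assumed outer SIZES value
  INTEGER :: visits(i_min:i_max)
  INTEGER :: i, ii, iii

  visits = 0
  ! The inner STRIPE produces:  DO ii = 0, inner_size - 1
  !                               DO i = i_min + ii, i_max, inner_size
  ! The outer STRIPE then splits the ii loop into two loops.
  DO iii = 0, outer_size - 1                  ! offsetting loop of the outer STRIPE
    DO ii = iii, inner_size - 1, outer_size   ! grid loop of the outer STRIPE
      DO i = i_min + ii, i_max, inner_size    ! grid loop of the inner STRIPE
        visits(i) = visits(i) + 1
      END DO
    END DO
  END DO

  ! The combined transformation must still run each iteration exactly once.
  IF (ANY(visits /= 1)) ERROR STOP 'an iteration was skipped or repeated'
  PRINT '(A)', 'each iteration executed exactly once'
END PROGRAM nested_stripe_sketch
```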

The following examples show how a STRIPE directive transforms a loop nest.

Example

The following example shows the transformation when the outer three loops in a nest of four loops are striped in a STRIPE construct. The loops all have unit stride. The PARALLEL DO construct is applied to the transformed loop nest.

  INTEGER :: arr (16, 20, 24, 10)
  INTEGER :: i, j, k, l
  INTEGER :: i_max, j_max, k_max, l_max
  INTEGER :: i_min, j_min, k_min, l_min
  INTEGER :: size_i, size_j, size_k
  . . . 
  !$OMP PARALLEL DO
  !$OMP STRIPE SIZES (size_i, size_j, size_k) 
  DO i = i_min, i_max
    DO j = j_min, j_max
      DO k = k_min, k_max
        DO l = l_min, l_max
          arr(i,j,k,l) = arr(i,j,k,l)*10
        END DO
      END DO
    END DO
  END DO 
  . . . 
 

The resulting transformed code is:

  INTEGER :: arr (16, 20, 24, 10)
  INTEGER :: i, j, k, l
  INTEGER :: i_max, j_max, k_max, l_max
  INTEGER :: i_min, j_min, k_min, l_min
  INTEGER :: size_i, size_j, size_k
  INTEGER :: ii, jj, kk
  . . . 
  !$OMP PARALLEL DO
  DO ii = 0, size_i - 1                       ! Offsetting loop i
    DO jj = 0, size_j - 1                     ! Offsetting loop j
      DO kk = 0, size_k - 1                   ! Offsetting loop k
        DO i = ii + i_min, i_max, size_i      ! Grid loop i
          DO j = jj + j_min, j_max, size_j    ! Grid loop j
            DO k = kk + k_min, k_max, size_k  ! Grid loop k
              DO l = l_min, l_max
                arr(i,j,k,l) = arr(i,j,k,l)*10
              END DO
            END DO
          END DO 
        END DO
      END DO
    END DO
  END DO 
  . . . 
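
The equivalence of the two nests can be checked with a small self-contained program. The sketch below is written by hand without any directive: it runs an unstriped nest and a hand-striped nest over two copies of the same array and compares the results. The array extents and stripe sizes are assumptions chosen so that some sizes do not divide the loop extent evenly.

```fortran
PROGRAM stripe_nest_check
  IMPLICIT NONE
  INTEGER, PARAMETER :: ni = 4, nj = 5, nk = 6, nl = 3      ! assumed extents
  INTEGER, PARAMETER :: size_i = 2, size_j = 3, size_k = 4  ! assumed stripe sizes
  INTEGER :: ref(ni, nj, nk, nl), arr(ni, nj, nk, nl)
  INTEGER :: i, j, k, l, ii, jj, kk

  ! Give every element a distinct value, then copy.
  ref = RESHAPE([(i, i = 1, ni*nj*nk*nl)], SHAPE(ref))
  arr = ref

  ! Original (unstriped) nest.
  DO i = 1, ni
    DO j = 1, nj
      DO k = 1, nk
        DO l = 1, nl
          ref(i,j,k,l) = ref(i,j,k,l)*10
        END DO
      END DO
    END DO
  END DO

  ! Hand-striped nest: three offsetting loops, then three grid loops.
  DO ii = 0, size_i - 1
    DO jj = 0, size_j - 1
      DO kk = 0, size_k - 1
        DO i = ii + 1, ni, size_i
          DO j = jj + 1, nj, size_j
            DO k = kk + 1, nk, size_k
              DO l = 1, nl
                arr(i,j,k,l) = arr(i,j,k,l)*10
              END DO
            END DO
          END DO
        END DO
      END DO
    END DO
  END DO

  ! A skipped element stays unscaled; a repeated one is scaled twice.
  IF (ANY(arr /= ref)) ERROR STOP 'striped nest differs from original'
  PRINT '(A)', 'striped nest matches original'
END PROGRAM stripe_nest_check
```

Because each element is multiplied by 10 exactly once in the reference nest, any skipped or repeated iteration in the striped nest shows up as a mismatch.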

The following example shows how a non-unit stride loop would be transformed by a STRIPE construct:

  INTEGER :: arr (64, 64)
  INTEGER :: i, j
  INTEGER :: i_max, i_min, j_max, j_min
  INTEGER :: i_size, size_i
  . . .
  !$OMP STRIPE SIZES (size_i)
  DO i = i_min, i_max, i_size
    DO j = j_min, j_max
      arr(i,j) = arr(i,j) * 10
    END DO
  END DO 
  . . . 

The transformed code is:

  INTEGER :: arr (64, 64)
  INTEGER :: i, j
  INTEGER :: i_max, i_min, j_max, j_min
  INTEGER :: i_size, size_i
  INTEGER :: ii
  . . .
  DO ii = 0, size_i - 1
    DO i = i_min + (ii * i_size), i_max, i_size * size_i
      DO j = j_min, j_max
        arr(i,j) = arr(i,j) * 10
      END DO 
    END DO
  END DO 
  . . . 
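
The non-unit stride case can also be checked by hand. The sketch below assumes that the offsetting loop runs over the stripe size size_i and that the grid loop advances by i_size * size_i; it counts how often each index is visited by the original loop and by the hand-striped pair of loops. The bounds, stride, and stripe size are assumed values.

```fortran
PROGRAM stripe_stride_check
  IMPLICIT NONE
  INTEGER, PARAMETER :: i_min = 3, i_max = 40  ! assumed loop bounds
  INTEGER, PARAMETER :: i_size = 3             ! stride of the original loop (assumed)
  INTEGER, PARAMETER :: size_i = 4             ! stripe size (assumed)
  INTEGER :: orig(i_min:i_max), striped(i_min:i_max)
  INTEGER :: i, ii

  orig = 0
  striped = 0

  ! Original loop with stride i_size.
  DO i = i_min, i_max, i_size
    orig(i) = orig(i) + 1
  END DO

  ! Hand-striped version: size_i offsets, each advancing by i_size * size_i.
  DO ii = 0, size_i - 1
    DO i = i_min + (ii * i_size), i_max, i_size * size_i
      striped(i) = striped(i) + 1
    END DO
  END DO

  ! Both versions must visit exactly the same indices, once each.
  IF (ANY(orig /= striped)) ERROR STOP 'striped loop visits different indices'
  PRINT '(A)', 'striped loop visits the same indices as the original'
END PROGRAM stripe_stride_check
```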
