Intel® Fortran Compiler Classic and Intel® Fortran Compiler Developer Guide and Reference

ID 767251
Date 3/22/2024
Public

ATOMIC

OpenMP* Fortran Compiler Directive: Ensures that a specific memory location is updated atomically. This prevents the possibility of multiple threads simultaneously reading and writing the specific memory location.

Syntax

!$OMP ATOMIC [clause[[[,] clause]...]]

   block

[!$OMP END ATOMIC]

clause

(Optional) Is one of the following:

  • An atomic-clause, which is one of the following:

    • READ

    • UPDATE

    • WRITE

    For details on the effects of these clauses, see the table in the Description section.

  • A memory-order-clause, which is one of the following:

    • ACQ_REL

      Specifies both an ACQUIRE and a RELEASE flush.

    • ACQUIRE

      Forces consistency between views of memory of two synchronizing threads. The thread discards any value of a shared variable in its temporary view that it has not written since it last performed a RELEASE flush.

      The thread then reloads any value of a shared variable that was propagated by a RELEASE flush that synchronizes with the ACQUIRE flush.

    • RELAXED

      Permits a thread's local or temporary view of memory, which may be held in registers, cache, or other local memory, to be temporarily inconsistent with the memory, which may have been updated by another thread.

      Note that a strong flush operation forces consistency between memory and a thread's temporary view of memory, and it restricts reordering of memory operations that may otherwise be performed. RELAXED memory ordering has no implicit flushes.

    • RELEASE

      Forces consistency between views of memory of two synchronizing threads by guaranteeing that any prior READ or WRITE of a shared variable will appear complete before any READ or WRITE of the same shared variable that follows an ACQUIRE flush that is synchronized with a RELEASE flush.

      The RELEASE flush propagates the values of all shared variables in its temporary view of memory prior to the thread performing a subsequent atomic operation that establishes a synchronization.

    • SEQ_CST

      Specifies that the construct is a sequentially consistent atomic construct. Unlike non-sequentially consistent atomic constructs, sequentially consistent atomic constructs preserve the interleaving (sequentially consistent) behavior of correct, data-race-free programs.

      However, sequentially consistent atomic constructs are not designed to replace the FLUSH directive as a mechanism to enforce ordering for non-sequentially consistent atomic constructs. Attempts to do so require extreme caution.

      For example, a sequentially consistent ATOMIC WRITE construct may appear to be reordered with a subsequent non-sequentially consistent ATOMIC WRITE construct because such reordering would not be observable by a correct program if the second WRITE was outside an ATOMIC construct.

    If a memory-order-clause is present, or implicitly provided by a REQUIRES directive, it specifies the effective memory ordering; otherwise, the effective memory ordering is RELAXED.

  • Or one of the following:

    • CAPTURE

      Causes an atomic update to x to occur using the specified operator or intrinsic. The original or final value of the location x is captured and written to the storage location v.

      Only the READ and WRITE of the location specified by x are performed mutually atomically. The evaluation of expr or expr-list and the write to v need not be atomic with respect to the READ and WRITE of x.

    • COMPARE

      Specifies that the atomic update is a conditional atomic update. If the equality operator is used, the operation is an atomic compare and swap.

      The values of x and e are compared and if equal, the value of d is written to x. The original or final value of x is written to v, which may be the same as e.

      Only the READ and WRITE of x are performed atomically; neither the comparison nor the writes to v need be atomic with respect to the READ or WRITE of x.

    • HINT (hint-expression)

    • FAIL (SEQ_CST | ACQUIRE | RELAXED)

      Specifies that its parameter overrides the effective memory ordering used when the comparison for a conditional update fails.

    • WEAK

      Indicates that the comparison performed by an atomic compare and swap may falsely fail, evaluating to not equal even when the values are equal.

block

Is one of the following:

  • statement

  • or if CAPTURE is also specified, it can be the following:

    • statement

    • capture-statement

    The order is not important. capture-statement can appear before statement.
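The two orderings of a CAPTURE block can be sketched as follows. This is a minimal illustration, not a definitive implementation: the program name ATOMIC_TICKETS and all variable names are hypothetical, and the code assumes it is compiled with OpenMP enabled. When capture-statement comes first, v receives the value of x before the update; when the update comes first, v receives the value after it.

```fortran
   PROGRAM ATOMIC_TICKETS
      IMPLICIT NONE
      INTEGER, PARAMETER :: N = 1000
      INTEGER :: NEXT, TOTAL, OLD, NEW, I
      INTEGER :: SEEN_OLD(0:N-1), SEEN_NEW(1:N)

      NEXT = 0
      TOTAL = 0
      SEEN_OLD = 0
      SEEN_NEW = 0

!$OMP PARALLEL DO SHARED(NEXT, TOTAL, SEEN_OLD, SEEN_NEW) PRIVATE(OLD, NEW)
      DO I = 1, N
         ! capture-statement first: OLD receives the value before the update.
!$OMP ATOMIC CAPTURE
         OLD = NEXT
         NEXT = NEXT + 1
!$OMP END ATOMIC
         ! Each OLD value is handed out once, so this element has one writer.
         SEEN_OLD(OLD) = SEEN_OLD(OLD) + 1

         ! update-statement first: NEW receives the value after the update.
!$OMP ATOMIC CAPTURE
         TOTAL = TOTAL + 1
         NEW = TOTAL
!$OMP END ATOMIC
         SEEN_NEW(NEW) = SEEN_NEW(NEW) + 1
      END DO
!$OMP END PARALLEL DO

      ! Every captured value must have been seen exactly once.
      IF (ANY(SEEN_OLD /= 1)) ERROR STOP 'an original value was duplicated'
      IF (ANY(SEEN_NEW /= 1)) ERROR STOP 'a final value was duplicated'
      PRINT *, 'tickets issued: ', NEXT
   END PROGRAM ATOMIC_TICKETS
```

The captured values run from 0 to N-1 in the first form and from 1 to N in the second, which is why the two bookkeeping arrays use different bounds.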

statement

Is one of the following:

  • capture-statement - if atomic-clause is READ, or atomic-clause is UPDATE with CAPTURE also specified

  • compare-statement - if the COMPARE clause is present

  • update-statement - if atomic-clause is UPDATE

  • write-statement - if atomic-clause is WRITE, or atomic-clause is UPDATE with CAPTURE also specified

capture-statement

Is a statement of the form v = x.

compare-statement

Is as follows:

  if (x == e) then 
     x = d
  end if

or:

  if (x == e) x = d

or, if CAPTURE also appears and block contains no capture-statement, it can also be the following:

  if (x == e) then 
     x = d
  else
     v = x 
  end if 
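As a hedged sketch of the last form, the following program lets every thread try to claim a shared slot with an atomic compare and swap. The names ATOMIC_CLAIM, OWNER, EMPTY, ME, PREV, and WINNERS are hypothetical, and the code assumes a compiler and OpenMP runtime that accept the COMPARE clause in Fortran. In terms of the syntax above, x is OWNER, e is EMPTY, d is ME, and v is PREV.

```fortran
   PROGRAM ATOMIC_CLAIM
      USE OMP_LIB
      IMPLICIT NONE
      INTEGER :: OWNER, EMPTY, ME, PREV, WINNERS

      EMPTY = -1          ! sentinel; thread numbers are never negative
      OWNER = EMPTY
      WINNERS = 0

!$OMP PARALLEL SHARED(OWNER, EMPTY, WINNERS) PRIVATE(ME, PREV)
      ME = OMP_GET_THREAD_NUM()
      PREV = EMPTY        ! stays EMPTY unless the comparison fails

      ! If OWNER still equals EMPTY, store ME; otherwise capture
      ! the current owner in PREV.
!$OMP ATOMIC COMPARE CAPTURE
      IF (OWNER == EMPTY) THEN
         OWNER = ME
      ELSE
         PREV = OWNER
      END IF
!$OMP END ATOMIC

      ! PREV was not overwritten, so this thread won the slot.
      IF (PREV == EMPTY) THEN
!$OMP ATOMIC UPDATE
         WINNERS = WINNERS + 1
      END IF
!$OMP END PARALLEL

      IF (WINNERS /= 1) ERROR STOP 'expected exactly one winner'
      PRINT *, 'owner: ', OWNER, ' winners: ', WINNERS
   END PROGRAM ATOMIC_CLAIM
```

Which thread wins varies from run to run, but exactly one thread should see the comparison succeed.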

update-statement

Is a statement with one of the following forms:

  x = x operator expr
  x = expr operator x
  x = intrinsic (x, expr-list)
  x = intrinsic (expr-list, x)

The following rules apply:

  • Operators in expr must have precedence equal to or greater than the precedence of operator, and cannot be defined operators.

  • x operator expr must be mathematically equivalent to x operator (expr). This requirement is satisfied if the operators in expr have precedence greater than operator, or by using parentheses around expr or subexpressions of expr.

  • expr operator x must be mathematically equivalent to (expr) operator x. This requirement is satisfied if the operators in expr have precedence equal to or greater than operator, or by using parentheses around expr or subexpressions of expr.

  • All assignments must be intrinsic assignments.
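The forms and rules above can be illustrated with a short sketch. The program name ATOMIC_UPDATE_FORMS and its variables are hypothetical, and the code assumes compilation with OpenMP enabled. Each atomic statement uses one of the four update forms, with parentheses added where operator precedence alone would not give the required equivalence.

```fortran
   PROGRAM ATOMIC_UPDATE_FORMS
      IMPLICIT NONE
      INTEGER, PARAMETER :: N = 100
      INTEGER :: I, TOTAL, BIGGEST, BITS
      LOGICAL :: FOUND

      TOTAL = 0
      BIGGEST = 0
      BITS = 0
      FOUND = .FALSE.

!$OMP PARALLEL DO SHARED(TOTAL, BIGGEST, BITS, FOUND)
      DO I = 1, N
         ! x = x operator expr: "*" has higher precedence than "+",
         ! so this is equivalent to TOTAL + (2 * I).
!$OMP ATOMIC UPDATE
         TOTAL = TOTAL + 2 * I

         ! Parentheses make expr a single operand of "-".
!$OMP ATOMIC UPDATE
         TOTAL = TOTAL - (I - 1)

         ! x = intrinsic (x, expr-list)
!$OMP ATOMIC UPDATE
         BIGGEST = MAX(BIGGEST, I)

         ! x = intrinsic (expr-list, x); IOR takes exactly one expression.
!$OMP ATOMIC UPDATE
         BITS = IOR(MOD(I, 8), BITS)

         ! Logical operator form.
!$OMP ATOMIC UPDATE
         FOUND = FOUND .OR. (I == N)
      END DO
!$OMP END PARALLEL DO

      ! Each iteration adds 2*I - (I - 1) = I + 1, so TOTAL = 5050 + 100.
      IF (TOTAL /= 5150) ERROR STOP 'unexpected TOTAL'
      IF (BIGGEST /= N) ERROR STOP 'unexpected BIGGEST'
      IF (BITS /= 7) ERROR STOP 'unexpected BITS'
      IF (.NOT. FOUND) ERROR STOP 'unexpected FOUND'
      PRINT *, 'all update forms produced the expected results'
   END PROGRAM ATOMIC_UPDATE_FORMS
```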

write-statement

Is a statement of the form x = expr.

d, e, x, v

Are scalar variables of intrinsic type. During execution of an atomic region, all references to storage location x must specify the same storage location.

v must not access the same storage location as x.

expr, expr-list

expr is a scalar expression. expr-list is a comma-separated list of expressions. They must not access the same storage location as x or v.

If intrinsic is IAND, IOR, or IEOR, then expr-list can contain only one expression.

operator

Is one of the following intrinsic operators: +, *, -, /, .AND., .OR., .EQV., or .NEQV..

intrinsic

Is one of the following intrinsic procedures: MAX, MIN, IAND, IOR, or IEOR.

Description

If x is of size 8, 16, 32, or 64 bits and x is aligned to a multiple of its size, the binding thread set is all threads on the device. Otherwise, the binding thread set is all threads in the contention group. Atomic regions enforce exclusive access with respect to other atomic regions that access the same storage location x among all the threads in the binding thread set without regard to the teams to which the threads belong.

If !$OMP ATOMIC is specified with no atomic-clause, it is the same as specifying !$OMP ATOMIC UPDATE.

If !$OMP ATOMIC CAPTURE is specified, you must include an !$OMP END ATOMIC directive following the block. Otherwise, the !$OMP END ATOMIC directive is optional.

Note that the following restriction applies to the ATOMIC directive:

  • All atomic accesses to the storage locations designated by x throughout the program must have the same type and type parameters.

The following table describes what happens when you specify one of the values in the atomic-clause in an ATOMIC construct.

Clause Result

READ

Causes an atomic read of the location designated by x regardless of the native machine word size.

UPDATE

Causes an atomic update of the location designated by x using the designated operator or intrinsic. The following rules also apply:

  • The evaluation of expr or expr-list need not be atomic with respect to the READ or WRITE of the location designated by x.

  • No task scheduling points are allowed between the READ and the WRITE of the location designated by x.

WRITE

Causes an atomic write of the location designated by x regardless of the native machine word size.

If all of the following conditions are true, the strong flush on entry to the atomic operation is also a RELEASE flush:

  • The atomic-clause is WRITE or UPDATE.

  • The atomic operation is not a conditional update for which the comparison fails.

  • The effective memory ordering is RELEASE, ACQ_REL, or SEQ_CST.

If both of the following conditions are true, the strong flush on exit from the atomic operation is also an ACQUIRE flush:

  • The atomic-clause is READ or UPDATE.

  • The effective memory ordering is ACQUIRE, ACQ_REL, or SEQ_CST.

Therefore, when the effective memory ordering is not RELAXED, RELEASE and ACQUIRE flushes are implied, and they permit synchronization between threads without an explicit FLUSH directive.
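A minimal sketch of such synchronization follows, assuming two threads are available and the code is compiled with OpenMP enabled. The names ATOMIC_HANDOFF, PAYLOAD, FLAG, SEEN, and RECEIVED are hypothetical. One thread publishes an ordinary variable through an ATOMIC WRITE with RELEASE ordering, and the other waits on an ATOMIC READ with ACQUIRE ordering before reading that variable.

```fortran
   PROGRAM ATOMIC_HANDOFF
      USE OMP_LIB
      IMPLICIT NONE
      INTEGER :: PAYLOAD, FLAG, SEEN, RECEIVED

      PAYLOAD = 0
      FLAG = 0
      RECEIVED = -1

!$OMP PARALLEL NUM_THREADS(2) SHARED(PAYLOAD, FLAG, RECEIVED) PRIVATE(SEEN)
      ! Run the handoff only if the runtime really provided two threads;
      ! otherwise the consumer loop below would have no producer.
      IF (OMP_GET_NUM_THREADS() == 2) THEN
         IF (OMP_GET_THREAD_NUM() == 0) THEN
            ! Producer: ordinary write, then publish it with RELEASE.
            PAYLOAD = 42
!$OMP ATOMIC WRITE RELEASE
            FLAG = 1
         ELSE
            ! Consumer: spin on an ACQUIRE read until the flag is set.
            DO
!$OMP ATOMIC READ ACQUIRE
               SEEN = FLAG
               IF (SEEN == 1) EXIT
            END DO
            ! The ACQUIRE read observed the RELEASE write, so this
            ! ordinary read sees the value stored by the producer.
            RECEIVED = PAYLOAD
         END IF
      END IF
!$OMP END PARALLEL

      IF (RECEIVED == 42) THEN
         PRINT *, 'received payload: ', RECEIVED
      ELSE
         PRINT *, 'fewer than two threads were available; handoff skipped'
      END IF
   END PROGRAM ATOMIC_HANDOFF
```

No FLUSH directive appears in the sketch; the ordering between the write to PAYLOAD and the read of PAYLOAD comes from the memory-order clauses on the two atomic constructs.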

Any combination of two or more of these atomic constructs enforces mutually exclusive access to the locations designated by x.

A race condition exists when two unsynchronized threads access the same shared variable with at least one thread modifying the variable; this can cause unpredictable results. To avoid race conditions, all accesses of the locations designated by x that could potentially occur in parallel must be protected with an ATOMIC construct.

Atomic regions do not guarantee exclusive access with respect to any accesses outside of atomic regions to the same storage location x even if those accesses occur during a CRITICAL or ORDERED region, while an OpenMP* lock is owned by the executing task, or during the execution of a REDUCTION clause.

However, other OpenMP* synchronization can ensure the desired exclusive access. For example, a BARRIER directive following a series of atomic updates to x guarantees that subsequent accesses do not form a race condition with the atomic accesses.
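The BARRIER case can be sketched as follows. The program name ATOMIC_THEN_BARRIER and the variables HITS, LOCAL, and MISMATCHES are hypothetical, and the code assumes compilation with OpenMP enabled. Every thread updates HITS atomically, and after the barrier each thread reads HITS with an ordinary, non-atomic read.

```fortran
   PROGRAM ATOMIC_THEN_BARRIER
      USE OMP_LIB
      IMPLICIT NONE
      INTEGER :: HITS, LOCAL, MISMATCHES

      HITS = 0
      MISMATCHES = 0

!$OMP PARALLEL SHARED(HITS, MISMATCHES) PRIVATE(LOCAL)
      ! Each thread contributes one atomic update.
!$OMP ATOMIC UPDATE
      HITS = HITS + 1

      ! After the barrier, all atomic updates above are complete.
!$OMP BARRIER

      ! Ordinary read: no thread modifies HITS past the barrier,
      ! so this access does not race with the atomic updates.
      LOCAL = HITS
      IF (LOCAL /= OMP_GET_NUM_THREADS()) THEN
!$OMP ATOMIC UPDATE
         MISMATCHES = MISMATCHES + 1
      END IF
!$OMP END PARALLEL

      IF (MISMATCHES /= 0) ERROR STOP 'a thread saw an incomplete count'
      PRINT *, 'threads counted: ', HITS
   END PROGRAM ATOMIC_THEN_BARRIER
```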

Example

The following example shows a way to avoid race conditions by using ATOMIC to protect all simultaneous updates of the location by multiple threads.

Because the ATOMIC directive below applies only to the statement immediately following it, elements of Y are not updated atomically. No race results, because each iteration updates a different element Y(I).


   REAL FUNCTION FOO1(I)
      INTEGER I
      FOO1 = 1.0 * I
      RETURN
   END FUNCTION FOO1

   REAL FUNCTION FOO2(I)
      INTEGER I
      FOO2 = 2.0 * I 
      RETURN
   END FUNCTION FOO2

   SUBROUTINE SUB(X, Y, INDEX, N)
      REAL X(*), Y(*)
      INTEGER INDEX(*), N
      INTEGER I
!$OMP PARALLEL DO SHARED(X, Y, INDEX, N)
      DO I=1,N
!$OMP ATOMIC UPDATE
         X(INDEX(I)) = X(INDEX(I)) + FOO1(I)
         Y(I) = Y(I) + FOO2(I)
      ENDDO
   END SUBROUTINE SUB

   PROGRAM ATOMIC_DEMO
      REAL X(1000), Y(10000)
      INTEGER INDEX(10000)
      INTEGER I
      DO I=1,10000 
         INDEX(I) = MOD(I, 1000) + 1
         Y(I) = 0.0
      ENDDO
      DO I = 1,1000
         X(I) = 0.0
      ENDDO
      CALL SUB(X, Y, INDEX, 10000)
   END PROGRAM ATOMIC_DEMO

The following non-conforming example demonstrates the restriction on the ATOMIC construct:


   SUBROUTINE ATOMIC_INCORRECT()
      INTEGER:: I
      REAL:: R
      EQUIVALENCE(I,R)
!$OMP PARALLEL
!$OMP ATOMIC UPDATE
      I = I + 1
!$OMP ATOMIC UPDATE
      R = R + 1.0
! The above is incorrect because I and R reference the same location
! but have different types
!$OMP END PARALLEL
   END SUBROUTINE ATOMIC_INCORRECT