Offload Fortran Workloads to Intel® GPUs

This video demonstrates key aspects of an HPC developer workflow using Intel® toolkits to program the Intel® Data Center GPU Max Series (formerly code named Ponte Vecchio). Specifically, the demo:

  • Shows a kernel workload on this GPU
  • Demonstrates new features for accelerating a workload on Intel's newest flagship GPU, all from native Fortran.

The Intel® Fortran Compiler can offload work to Intel Data Center GPUs using DO CONCURRENT, a standard parallelism feature of the Fortran language (introduced in Fortran 2008 and extended in Fortran 2018).
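
As a rough illustration of the kind of loop this applies to, here is a minimal sketch of a SAXPY loop written with DO CONCURRENT. The program name, array sizes, and the build line in the comments are assumptions for illustration and are not taken from the video; check the compiler documentation for the exact options your version expects.

```fortran
! Minimal sketch: SAXPY written with DO CONCURRENT (no OpenMP directives).
! Assumed ifx build line for GPU offload (hypothetical file name saxpy_dc.f90):
!   ifx -fiopenmp -fopenmp-targets=spir64 -fopenmp-target-do-concurrent saxpy_dc.f90
! Built without those options, the same loop simply runs on the CPU.
program saxpy_do_concurrent
  implicit none
  integer, parameter :: n = 1024
  real, parameter :: a = 2.0
  real :: x(n), y(n)
  integer :: i

  x = 1.0
  y = 3.0

  ! Each iteration reads and writes only element i, so the iterations are
  ! independent of one another, which is what DO CONCURRENT requires.
  do concurrent (i = 1:n)
    y(i) = a * x(i) + y(i)
  end do

  ! Self-check: every element should equal 2.0*1.0 + 3.0 = 5.0.
  if (any(abs(y - 5.0) > 1.0e-6)) error stop "unexpected result"
  print *, "y(1) =", y(1)
end program saxpy_do_concurrent
```

Because the loop body carries no directives, the same source stays valid standard Fortran for any compiler; whether it runs on the GPU is decided entirely by the build options.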

This video covers the high-level architecture of OpenMP* offload to GPUs and the related compiler features:

  1. The Intel Fortran Compiler processes the OpenMP directives in the Fortran source code. Then, the built-in OpenMP offload runtime library, libomptarget, calls the Level 0 ZT plug-in, runtime, and GPU kernel mode driver to run the code on the Intel Data Center GPU Max Series.
  2. The latest Intel Fortran Compiler can automatically offload loops written with the Fortran native-parallelism feature, DO CONCURRENT, to the GPU. If a loop conforms to the DO CONCURRENT requirements, the programmer can skip adding directives to the source code and instead turn on the -fopenmp-target-do-concurrent compiler flag. The compiler then generates the OpenMP offload kernel automatically in the background.
  3. A new feature of the Intel Fortran Compiler is ahead-of-time (AOT) compilation. Previously, OpenMP offload to the GPU performed only just-in-time (JIT) compilation: the binary to run on a device was not generated until runtime. Now the executable device binary can be generated at compile time for the specified device model. Steps and commands are provided.
  4. The Intel Fortran Compiler comes with robust out-of-the-box debugging and profiling features. There is no need to link with a separate profiler application; the programmer only needs to set the environment variable LIBOMPTARGET_PROFILE=T. After the executable runs, useful device performance information appears, such as the GPU kernel name, total time, kernel run time, and data-transfer time.
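
The directive-based path in the steps above can be sketched as follows. This is a minimal, hypothetical vector-add example, not the workload shown in the video. The file name and the JIT and AOT build lines in the comments are assumptions based on commonly documented ifx options, so verify them against your compiler version.

```fortran
! Minimal sketch: vector add offloaded with explicit OpenMP target directives.
! Assumed ifx build lines (hypothetical file name vadd.f90):
!   JIT: ifx -fiopenmp -fopenmp-targets=spir64 vadd.f90
!   AOT: ifx -fiopenmp -fopenmp-targets=spir64_gen -Xopenmp-target-backend "-device pvc" vadd.f90
! For the profiling summary described above, set LIBOMPTARGET_PROFILE=T
! in the environment before running the executable.
program vadd_omp_target
  implicit none
  integer, parameter :: n = 4096
  real :: a(n), b(n), c(n)
  integer :: i

  do i = 1, n
    a(i) = real(i)
    b(i) = 2.0 * real(i)
  end do

  ! map(to:) copies the inputs to the device; map(from:) copies the result back.
  !$omp target teams distribute parallel do map(to: a, b) map(from: c)
  do i = 1, n
    c(i) = a(i) + b(i)
  end do
  !$omp end target teams distribute parallel do

  ! Self-check: c(i) should equal 3*a(i) for every element.
  if (any(abs(c - 3.0 * a) > 1.0e-3)) error stop "unexpected result"
  print *, "c(n) =", c(n)
end program vadd_omp_target
```

If the program is built without OpenMP enabled, the directives are treated as comments and the loop runs serially on the CPU, which makes it easy to compare host and device results.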



Shiquan Su, software technical consulting engineer, Intel







Download the Software

Get the latest Intel Fortran Compiler as a stand-alone version or as part of the Intel® HPC Toolkit.

The compiler:

  • Is production ready for CPUs and GPUs.
  • Is based on the Intel Fortran Compiler Classic front-end and runtime libraries, but uses LLVM* back-end compiler technology.
  • Implements standard language features of Fortran 2018, supports the earlier main versions of the Fortran standard from FORTRAN 77 to Fortran 2008, and supports many OpenMP 4.5, 5.0, and 5.1 directives and offloading features.
  • Provides Fortran programmers access to many capabilities of Intel Data Center GPU Max Series right from their native language.
