Ahead of Time Compilation

Ahead-of-time (AOT) compilation is useful during development and at distribution time when you know in advance which device your application will run on. AOT compilation provides the following benefits:
  • No compilation takes place when your application runs.
  • Bugs introduced by just-in-time (JIT) compilation for the target device cannot occur at run time, because AOT compilation skips that step.
  • Your final code, executing on the target device, can be tested as-is before you deliver it to end users.
A program built with AOT compilation for a specific target device will not run on any other device. Detect the target device at run time and report an error if it is not present. Using exception handling with an asynchronous exception handler is recommended.
Data Parallel C++ (DPC++) supports AOT compilation for the following targets: Intel® CPUs, Intel® Processor Graphics (Gen9 or above), and Intel® FPGA.

Prerequisites

To target a GPU with the AOT feature, you must have the OpenCL™ Offline Compiler (OCLOC) tool installed. OCLOC can generate binaries that utilize OpenCL™ or the Intel® oneAPI Level Zero backend.
Linux*
OCLOC is not packaged with the Linux version of Intel® oneAPI DPC++/C++ Compiler and must be installed separately. Refer to Install OpenCL™ Offline Compiler (OCLOC) for details.
Windows*
OCLOC is packaged with the Windows version of Intel® oneAPI DPC++/C++ Compiler.
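Before starting a GPU AOT build, it can help to confirm that OCLOC is reachable. The following is a minimal sketch, assuming the tool is installed under the executable name ocloc and is expected on PATH; the helper names have_tool and check_ocloc are hypothetical and are not part of the compiler or of OCLOC.

```shell
#!/bin/sh
# Return 0 if the named tool is found on PATH, non-zero otherwise.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Report whether OCLOC is available (assumes the executable is named "ocloc").
check_ocloc() {
    if have_tool ocloc; then
        echo "ocloc found: $(command -v ocloc)"
    else
        echo "ocloc not found on PATH; GPU AOT builds are likely to fail" >&2
        return 1
    fi
}
```

A build script could call check_ocloc first and stop early if it returns a non-zero status.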

Use AOT for the Target Device (Intel® CPUs)

The supported options are:
  • -fsycl-targets=spir64_x86_64
  • -Xs "-march=<arch>", where <arch> is one of the following:
    Switch    Display Name
    avx       Intel® Advanced Vector Extensions (Intel® AVX)
    avx2      Intel® Advanced Vector Extensions 2 (Intel® AVX2)
    avx512    Intel® Advanced Vector Extensions 512 (Intel® AVX-512)
    sse4.2    Intel® Streaming SIMD Extensions 4.2 (Intel® SSE4.2)
-Xs is a general device target option. If multiple targets are specified (for example, -fsycl-targets=spir64_gen,spir64_x86_64), an option passed with -Xs "opt" applies to all of them, which is usually not what you want. To pass an option to one target only, use -Xsycl-target-backend=spir64_gen "opt" or -Xsycl-target-backend=spir64_x86_64 "opt".
Examples:
  • Linux:
    dpcpp -fsycl-targets=spir64_x86_64 -Xs "-march=avx2" main.cpp
  • Windows:
    dpcpp-cl /EHsc -fsycl-targets=spir64_x86_64 -Xs "-march=avx2" test_cpu.cpp
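When both a GPU and a CPU target are requested, each backend can receive its own option string through -Xsycl-target-backend. The sketch below only prints such a command line and does not run the compiler; the function name aot_multi_target_cmd is hypothetical, and skl, avx2, and main.cpp are placeholder values taken from the examples in this guide.

```shell
#!/bin/sh
# Print (do not run) a compile command that AOT-compiles for GPU and CPU,
# giving each backend its own option string.
# $1 = GPU device (for example skl)
# $2 = CPU architecture (for example avx2)
# $3 = source file
aot_multi_target_cmd() {
    printf 'dpcpp -fsycl-targets=spir64_gen,spir64_x86_64 -Xsycl-target-backend=spir64_gen "-device %s" -Xsycl-target-backend=spir64_x86_64 "-march=%s" %s\n' "$1" "$2" "$3"
}

aot_multi_target_cmd skl avx2 main.cpp
```

Printing the command first makes it easy to check that "-device" reaches only the spir64_gen backend and "-march" reaches only the spir64_x86_64 backend.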
Build an Application with Multiple Source Files for CPU Targeting
Method 1:
Compile your normal files (with no DPC++ kernels) to create host objects. Then compile the file with the kernel code and link it with the rest of the application.
  • Linux:
    1. dpcpp -c main.cpp
    2. dpcpp -fsycl-targets=spir64_x86_64 -Xs "-march=avx2" mandel.cpp main.o
  • Windows:
    1. dpcpp-cl -c /EHsc main.cpp
    2. dpcpp-cl /EHsc -fsycl-targets=spir64_x86_64 -Xs "-march=avx2" mandel.cpp main.obj
Method 2:
Compile the file with the kernel code to create a fat object. Then compile the rest of the files and link them to create a fat executable:
  • Linux:
    1. dpcpp -c -fsycl-targets=spir64_x86_64 -Xs "-march=avx2" mandel.cpp
    2. dpcpp main.cpp mandel.o -fsycl-targets=spir64_x86_64 -Xs "-march=avx2"
  • Windows:
    1. dpcpp-cl -c /EHsc -fsycl-targets=spir64_x86_64 -Xs "-march=avx2" mandel.cpp
    2. dpcpp-cl /EHsc main.cpp mandel.obj -fsycl-targets=spir64_x86_64 -Xs "-march=avx2"
Currently, Method 2 only works on a HOST selector.
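The two Linux steps of Method 1 for a CPU target could be wrapped in a small script. This is a sketch under stated assumptions: the file names main.cpp and mandel.cpp come from the examples above, while the run helper, the build_cpu_method1 function, and the DRY_RUN variable are hypothetical conveniences. With DRY_RUN=1 (the default here) the commands are only printed, so the sketch can be tried without the compiler installed.

```shell
#!/bin/sh
# DRY_RUN=1 prints each command instead of executing it.
# Note: the printed form drops the quotes around the -Xs argument.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1 = CPU architecture switch (for example avx2)
build_cpu_method1() {
    arch="$1"
    # Step 1: compile the host-only file (no kernels) to an object.
    run dpcpp -c main.cpp || return 1
    # Step 2: AOT-compile the kernel file and link it with the host object.
    run dpcpp -fsycl-targets=spir64_x86_64 -Xs "-march=$arch" mandel.cpp main.o
}
```

Calling build_cpu_method1 avx2 with DRY_RUN=0 would execute both steps and stop after the first one if it fails.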

Use AOT for Intel® Integrated Graphics (Intel® GPU)

The supported options are:
  • -fsycl-targets=spir64_gen
  • -Xs "-device <arch>", where <arch> is the target device. Possible values:
    Switch     Display Name
    skl        6th generation Intel® Core™ Processor (Skylake with Intel® Processor Graphics Gen9)
    kbl        7th generation Intel® Core™ Processor (Kaby Lake with Intel® Processor Graphics Gen9)
    cfl        8th generation Intel® Core™ Processor (Coffee Lake with Intel® Processor Graphics Gen9)
    glk        Gemini Lake with Intel® Processor Graphics Gen9
    icllp      10th generation Intel® Core™ Processor (Ice Lake with Intel® Processor Graphics Gen11)
    tgllp      11th generation Intel® Core™ Processor (Tiger Lake with Intel® Processor Graphics Gen12)
    dg1        Intel® Iris® Xe MAX graphics
    Gen9       Intel® Processor Graphics Gen9
    Gen11      Intel® Processor Graphics Gen11
    Gen12LP    Intel® Processor Graphics Gen12 (Lower Power)
    adls       12th generation Intel® Core™ Processor (Alder Lake S with Intel® Processor Graphics Gen12.2)
    adlp       12th generation Intel® Core™ Processor (Alder Lake P with Intel® Processor Graphics Gen12.2)
To see the complete list of supported target device types for your installed version of OCLOC, run:
ocloc compile --help
If multiple target devices are listed in the compile command, the Intel® oneAPI DPC++/C++ Compiler compiles for each of these targets and creates a fat binary that contains all the device binaries produced this way.
Examples of supported -device patterns:
  • Linux:
    • To compile for a single target, using skl as an example, use:
      dpcpp -fsycl-targets=spir64_gen -Xs "-device skl" vector-add.cpp
    • To compile for two targets, using skl and icllp as examples, use:
      dpcpp -fsycl-targets=spir64_gen -Xs "-device skl,icllp" vector-add.cpp
    • To compile for all the targets known to OCLOC, use:
      dpcpp -fsycl-targets=spir64_gen -Xs "-device *" vector-add.cpp
  • Windows:
    • To compile for a single target, using skl as an example, use:
      dpcpp-cl /EHsc -fsycl-targets=spir64_gen -Xs "-device skl" vector-add.cpp
    • To compile for two targets, using skl and icllp as examples, use:
      dpcpp-cl /EHsc -fsycl-targets=spir64_gen -Xs "-device skl,icllp" vector-add.cpp
    • To compile for all the targets known to OCLOC, use:
      dpcpp-cl /EHsc -fsycl-targets=spir64_gen -Xs "-device *" vector-add.cpp
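The comma-separated -device pattern shown above can be assembled from a list of device names. The helper below is a sketch; gen_device_opt is a hypothetical name, and it only builds the option string without checking that the names are known to OCLOC.

```shell
#!/bin/sh
# Join device names into the option string passed through -Xs.
# Example: gen_device_opt skl icllp  prints  -device skl,icllp
gen_device_opt() {
    if [ "$#" -eq 0 ]; then
        echo "gen_device_opt: need at least one device" >&2
        return 1
    fi
    list="$1"
    shift
    for dev in "$@"; do
        list="$list,$dev"
    done
    printf '%s\n' "-device $list"
}
```

On Linux, the result could then be used as: dpcpp -fsycl-targets=spir64_gen -Xs "$(gen_device_opt skl icllp)" vector-add.cpp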
Build an Application with Multiple Source Files for GPU Targeting
Method 1:
Compile your normal files (with no DPC++ kernels) to create host objects. Then compile the file with the kernel code and link it with the rest of the application.
  • Linux:
    1. dpcpp -c main.cpp
    2. dpcpp -fsycl-targets=spir64_gen -Xs "-device *" mandel.cpp main.o
  • Windows:
    1. dpcpp-cl -c /EHsc main.cpp
    2. dpcpp-cl /EHsc -fsycl-targets=spir64_gen -Xs "-device *" mandel.cpp main.obj
Method 2:
Compile the file with the kernel code to create a fat object. Then compile the rest of the files and link them to create a fat executable:
  • Linux:
    1. dpcpp -c -fsycl-targets=spir64_gen mandel.cpp
    2. dpcpp main.cpp mandel.o -fsycl-targets=spir64_gen -Xs "-device *"
  • Windows:
    1. dpcpp-cl -c /EHsc -fsycl-targets=spir64_gen mandel.cpp
    2. dpcpp-cl /EHsc main.cpp mandel.obj -fsycl-targets=spir64_gen -Xs "-device *"
Currently, Method 2 only works on a HOST selector.

Use AOT in Microsoft Visual Studio*

You can use Microsoft Visual Studio for compiling and linking. Set the flags below to use AOT compilation for CPU or GPU.
For CPU:
  • To compile, in the dialog box, select: Configuration Properties > DPC++ > General > Specify SYCL offloading targets for AOT compilation
  • To link, in the dialog box, select: Configuration Properties > Linker > General > Specify CPU Target Device for AOT compilation
For GPU:
  • To compile, in the dialog box, select: Configuration Properties > DPC++ > General > Specify SYCL offloading targets for AOT compilation
  • To link, in the dialog box, select: Configuration Properties > Linker > General > Specify GPU Target Device for AOT compilation
