Intel® oneAPI DPC++/C++ Compiler Release Notes

Last Updated: 01/04/2022

This document provides a summary of new and changed product features and includes notes about features and problems not described in the product documentation.

Where to Find the Release

Follow the steps to download the toolkit from the Web Configurator, then follow the installation instructions to install it.

2022.0 Release

New Features and Improvements

  • Vectorization for OpenMP SIMD was previously supported at O2 or above when OpenMP language features were enabled. It is now supported at O0 and above if OpenMP language features are enabled (e.g., -qopenmp, -qopenmp-simd).
  • Added -fopenmp-target-simd to enable OpenMP SIMD support for GPU.
  • Added -fopenmp-target-simdlen=n to specify the GPU vector length for OpenMP SIMD loops.
  • Added support for the target in_reduction clause from the OpenMP 5.0 standard.
  • Added support for the masked construct and tile construct from the OpenMP 5.1 standard.
  • Added support for nowait for asynchronous offloading.
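
The OpenMP SIMD items above apply to loops annotated as in the following C++ sketch. The function name scale_add and its arguments are illustrative assumptions, not taken from the product documentation. With OpenMP SIMD enabled (for example, -qopenmp-simd as noted above), the compiler is asked to vectorize the loop; without it, the pragma is ignored and the loop runs as ordinary serial code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of any Intel API): y[i] = a * x[i] + y[i].
void scale_add(float a, const std::vector<float> &x, std::vector<float> &y) {
  assert(x.size() == y.size());  // both vectors must have the same length
  const std::size_t n = x.size();
  const float *xp = x.data();
  float *yp = y.data();
  // Ask the compiler to vectorize this loop when OpenMP SIMD is enabled.
  #pragma omp simd
  for (std::size_t i = 0; i < n; ++i) {
    yp[i] = a * xp[i] + yp[i];
  }
}
```

The result is the same whether or not the loop is vectorized, so the same source can be built with and without the OpenMP options.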

  • Added support for new SYCL 2020 features sycl::logical_and and sycl::logical_or and completed support for Host Task. A complete list of SYCL 2020 features supported can be found here.
  • Added the following DPC++ Extensions:
  • Removed support for deprecated SYCL 1.2.1 APIs as listed here.
  • Support for the SYCL half type in the global namespace has been removed to avoid potential conflicts with user-defined types. This was previously an alias to the sycl::half type. To resolve compilation failures due to the missing ::half type, use the sycl::half type directly.
  • Added an experimental feature to speed up incremental build time of DPC++ applications, which can be enabled using the compiler option -fsycl-max-parallel-link-jobs=<N>. This option tells the compiler that it can simultaneously spawn up to the specified number of processes to perform actions required to link DPC++ applications.
  • Previous compiler releases included all LLVM tools in their bin directory. When added to PATH, some of these binaries were found to unexpectedly conflict with other LLVM installations on the system, so they have been moved to a sibling bin-llvm directory. Compiler drivers (dpcpp/icx/icpx/ifx) have been adjusted to find these internal tools as necessary, typically transparently to users. However, there may be cases where tools that are no longer in PATH were being invoked directly in application Makefiles (or CMake configurations), and these may require adjustment. Please refer to <…/bin/>../bin-llvm/README for more details.
  • The compiler now uses the Windows registry as the default mechanism to discover the backend OpenCL ICDs on Windows. The OCL_ICD_FILENAMES environment variable is for debug only and does not work with administrative privileges on Windows.
  • Added support for the -Xssfcexit-fifo-type=<value> flag that globally controls exit FIFO latency of stall-free clusters in FPGA.
  • Added support for the nofusion loop attribute that prevents a loop from being fused with an adjacent loop in FPGA.
  • Added support for the -Xsread-only-cache-size=<N> flag that enables the read-only cache for read-only accessors in FPGA.
  • Deprecated the support for the hls_float data type and replaced it with ap_float data type for FPGA.
  • Added support for open source runtime environment for FPGA.
  • Added support for fast BSP customization flow for FPGA.
  • Added support for Microsoft Visual Studio* 2022.
  • The Intel-specific header aligned_new is no longer included, as the functionality has been superseded by the C++17 aligned operator new feature. The functionality previously provided by aligned_new is now present in new, and should be usable without any other changes besides altering the preprocessor include.
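
As a minimal sketch of the aligned_new change described above, the following C++17 example relies only on the standard new header. The type CacheLineBuffer and the helpers is_aligned and make_buffer are hypothetical names used for illustration; the example assumes the code is compiled as C++17 or later.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <new>  // standard header; takes the place of the aligned_new include

// Hypothetical over-aligned type used only for illustration.
struct alignas(64) CacheLineBuffer {
  float data[16];
};

// Returns true if p is aligned to the given power-of-two boundary.
inline bool is_aligned(const void *p, std::size_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// In C++17, a new-expression for an over-aligned type selects the
// operator new overload that takes std::align_val_t automatically.
inline std::unique_ptr<CacheLineBuffer> make_buffer() {
  return std::make_unique<CacheLineBuffer>();
}
```

Code that previously included aligned_new would, under this assumption, only need its include line changed to new.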

Bug Fixes

  • Fixed an issue where the dpcpp compiler generated a temporary source file during host compilation that appeared as a source dependency, potentially breaking build environments that closely track files generated during a compilation.
  • Fixed an issue where the sycl::link API could fail to JIT-compile user code if the input kernel bundles contain more than one device image and specialization constants are used.
  • When compiling for FPGA, if you declare kernel names locally, the kernel name is now correctly demangled in FPGA optimization reports.
  • Fixed an FPGA emulator issue where the compiler would fail if you had also installed a oneAPI-specific GPU platform. 

Known Issues and Limitations

  • The latest GPU driver introduces an Ahead-Of-Time (AOT) build issue for OpenMP offload applications running on Gen9 iGPU when using oneAPI compilers. A fix for this issue will be available in the upcoming driver release.
    For assistance with downgrading to a driver version that does not have this issue, contact us via Graphics - Intel Communities.
  • GPU offload applications that use extensive multi-threading (more than 2 threads) may hang or time out on the following Linux distributions, and recovery is possible only through a hard reset or power cycling of the system. The issue occurs when reading or writing data to the Intel GPU while making extensive use of multi-threading, and is due to a defect in older Linux kernels.
    Kernel/distribution       Problem occurs                          Problem does not occur
    ------------------------  --------------------------------------  --------------------------------------------------------------
    Red Hat Enterprise Linux  RHEL 8.4 (kernel 4.18.0-305) and older  RHEL 8.5 (kernel 4.18.0-348)
    SUSE Linux                SLES15 SP3 and older                    SLES15 SP4 beta
    Ubuntu Linux              Ubuntu releases older than 20.04.03     Ubuntu 20.04.03 (kernel 5.11.0-40-generic #44~20.04.2-ubuntu)*
    * Note that the software will run, but a warning message will appear in kernel logs.

    Preferred Workaround: Upgrade to a Linux distribution where the defect has been fixed.
    GPU software for Ubuntu 20.04.03 is available now.
    GPU software for RHEL 8.5 will be available in Q1 2022 at the same location.
    GPU software for SLES15 SP4 will be available shortly after general availability of SLES15 SP4.

    Alternative Workaround: Do not use extensive multi-threading in GPU-enabled applications, i.e. keep the number of threads no more than 2. For example, for applications using the oneAPI MPI library, use the single threaded version of the MPI run-time library, rather than the multi-threaded version. Set the environment variable I_MPI_THREAD_SPLIT=0 to use the single threaded version of MPI.
  • The OpenMP default loop schedule modifier for work sharing loop constructs was changed to nonmonotonic when the schedule kind is dynamic or guided to conform to the OpenMP 5.0 standard. User code that assumes monotonic behavior may not work correctly with this change. Users can add the monotonic schedule modifier in the schedule clause to keep the previous code behavior.
  • Performance degradation is expected with SYCL 2020 barriers compared to barriers in SYCL 1.2.1. The issue is currently under investigation and is expected to be fixed in a future release.
  • When using a two-step Ahead of Time (AOT) compilation with at least a single call to a devicelib function from within the kernel, the device binary image may get corrupted.
  • Alignment of allocation requests is limited to 64 KB due to limited support by the Level Zero Runtime.
  • The SYCL 2020 specialization constants feature has the following limitations:
    • Building a program that uses specialization constants for both JIT and AOT targets at the same time could result in an exception thrown with the following message: Native API failed. Native API returns: -49 (CL_INVALID_ARG_INDEX) -49 (CL_INVALID_ARG_INDEX).
    • Setting a specialization constant value to zero is ignored by the DPC++ runtime in the non-AOT scenario, i.e., when the -fsycl-targets command line option is not passed or when spir64 is the target. The following example code demonstrates the issue. There is currently no workaround.
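      specialization_id<int> spec_id(42);
      // ...
      queue q;
      q.submit([&](handler &cgh) {
        cgh.set_specialization_constant<spec_id>(0);
        // spec_id will still have value 42
        cgh.set_specialization_constant<spec_id>(41);
        // spec_id value will be changed to 41
        cgh.set_specialization_constant<spec_id>(0);
        // spec_id will still have value 41
        // ...
      });
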
    • In AOT mode, setting default values on padded objects can cause misalignment in other default values. This may cause specialization constants to have the wrong default values. For example:
      struct PaddedStruct {
        uint32_t a;
        char b;
        constexpr PaddedStruct() : a(0), b('a') {}
        constexpr PaddedStruct(uint32_t _a, char _b) : a(_a), b(_b) {}
      };
      constexpr specialization_id<PaddedStruct> padded_struct_spec_id{20, 'c'};
      constexpr specialization_id<bool> bool_spec_id{true};

      In this example, PaddedStruct has a size of 8 bytes, 3 of which are padding. This can cause the specialization constant identified by bool_spec_id not to have its default value of true. A known workaround is to remove the padding from a padded object by adding __attribute__((packed)) to the class or struct, i.e., PaddedStruct becomes:

      struct __attribute__((packed)) PaddedStruct {
        uint32_t a;
        char b;
        constexpr PaddedStruct() : a(0), b('a') {}
        constexpr PaddedStruct(uint32_t _a, char _b) : a(_a), b(_b) {}
      };

  • Usage of the compiler option -Qlong-double on Windows* has limitations when used with the latest Microsoft Visual Studio* releases; detailed information is available here.
  • An undefined reference error for the sinpif and cospif functions, such as Compilation from IR - skipping loading of FCL error: undefined reference to `sinpif', can occur even when application code does not use these functions. It is caused by a compiler optimization phase. As a workaround, use the compiler flags -mllvm -enable-transform-sin-cos=0, which disable the faulty optimization.
  • Using #pragma omp declare simd on a member template is currently not supported and can lead to the error "error: function declaration is expected after 'declare simd' directive". Non-template member functions and template functions that are not class members are not affected.
  • Using Microsoft Visual Studio* as a host compiler for DPC++ with C++17 enabled causes the error C:\Program Files (x86)\Intel\oneAPI\compiler\latest\windows\include\sycl\CL/sycl/ONEAPI/accessor_property_list.hpp(199): error C2686: cannot overload static and non-static member functions with the same parameter types. Refer to the article here on how to work around this issue.
  • USM support for implicit migrations of shared allocations between device and host is currently implemented in software using access violation mechanisms (e.g., SIGSEGV) to identify access from the host. Undefined behavior may occur if applications rely on similar access-violation mechanisms, or if they use system calls to access shared-memory allocations before these are migrated to the host by the GPU driver.
  • The icx compiler does not support linking library archives using the -l option for libraries that contain target offload code. More details and a workaround for this issue can be found at Known Issue: Static Libraries and Target Offload.
  • Attempting to use Link Time Optimization (LTO) can cause a linker failure. To link successfully, make sure you have the recommended versions of binutils for your OS listed at Intel® oneAPI DPC++/C++ Compiler and Intel® oneAPI DPC++ Library System Requirements.
  • User-defined functions with the same name and signature (exact match of arguments; the return type does not matter) as an OpenCL C built-in function can lead to undefined behavior. More details about this issue can be found at Known Issue: User-defined Functions with the Same Signature as OpenCL C built-in functions.
  • #pragma float_control directives at file scope do not take effect correctly for statement blocks that are nested within class definitions. The same issue exists for #pragma clang fp.
  • When debugging FPGA emulator code in Microsoft Visual Studio* on a Windows* system, the debugger does not stop at breakpoints set in kernel code. There is no workaround available for this issue currently. 
  • When compiling for FPGA and using a read-only accessor for a very wide struct, the compile times can be large. As a workaround to address long compile times, use a read-write accessor instead. 
  • When compiling for FPGA, you cannot use a system installed with Intel® FPGA PAC D5005 to compile a SYCL application that targets Intel® PAC with Intel® Arria® 10 FX FPGA. Compilation may succeed, but the compiled binary might fail at runtime. There is no workaround available for this issue currently. 
  • When you perform FPGA compile and link stages with a single dpcpp command (for example, dpcpp -fintelfpga <other arguments> -Xshardware src/kernel.cpp), if the source code is not located in the current directory, you might observe that the source code browser is missing in the generated FPGA optimization reports. To work around this issue, compile and link the executable in separate stages, as follows: 

    dpcpp -fintelfpga <other arguments> -Xshardware -c src/kernel.cpp -o kernel.o
    dpcpp -fintelfpga <other arguments> -Xshardware kernel.o
  • When compiling for FPGA, the debug support on Windows is not available when using device-side libraries. To avoid this issue, do not run a debugger on the emulator platform on Windows.

  • In the FPGA optimization report, the Loop Viewer (Alpha) can currently handle only loops with 100 iterations or fewer. For designs with loops of more than 100 iterations, the optimization reports hang. There is no known workaround for this issue.

  • The script is not supported for FPGA in this release. As a workaround, use the script.

  • FPGA optimization reports are not displayed correctly within Microsoft Visual Studio on Windows. To view the reports, open the report.html file generated in the project directory. 

  • On Windows, compiling FPGA designs in a directory with a long path name might fail and you might see the following error: 
    dpcpp: error: fpga compiler command failed with exit code 1 (use -v to see invocation)
    NMAKE : fatal error U1077: '…\oneAPI\compiler\latest\windows\bin\dpcpp.EXE' : return code '0x1'

    As a workaround, either compile the design in a directory with a short path name or reset TMP and TEMP environment variables to point to a shorter path (for example, C:\temp). 

  • When compiling for FPGA, the Windows emulator flow using -c to create object files, linking through to an archive file, and then generating an executable from that archive might result in an executable that fails to launch device kernels. As a workaround for this issue, add the -fsycl-device-code-split=none flag to the archive step as shown in the following:

    # generate .obj files
    dpcpp /EHsc -fintelfpga -c host.cpp device.cpp device_adder.cpp -DFPGA_EMULATOR
    # generate host.a
    dpcpp -fintelfpga -fsycl-link=image -fsycl-device-code-split=none host.obj device.obj device_adder.obj
    # generate .exe
    dpcpp -fintelfpga host.a /link /wholearchive
    # emulator executable
  • When using the atomic_fence function for FPGA, the memory_scope::system constraint is not supported. The broadest scope supported is the memory_scope::device constraint. There is no workaround available for this currently. 

  • When compiling for FPGA on a Linux system, you might see the error message Unable to open zlib library! when the compiler is unable to detect the zlib library, which comes standard on most Linux OSes. As a workaround, install a development version of the library by executing one of the following OS-specific commands:

    • Ubuntu 18: sudo apt install zlib1g-dev

    • RHEL 7/CentOS 7: sudo yum install zlib-devel

  • When launching FPGA optimization reports, the compiler might fail to render certain text characters included in the source file. If the reports are crashing, verify whether the source file has any string literals that end in an escaped backslash in the fileJSON object’s content section within the report_data.js file under the reports/lib/ directory. As a workaround for this issue, modify the report_data.js file to escape the unescaped character. For example, change "hello\\" to "hello\\\". 
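
For the OpenMP default schedule modifier change listed among the known issues above, the following C++ sketch shows one way the monotonic modifier might be written in a schedule clause. The function name iterations_increase_per_thread and the chunk size of 4 are illustrative assumptions. The function checks the property that monotonic scheduling is meant to preserve: each thread sees its iterations in increasing order. When compiled without OpenMP, the pragmas are ignored and the loop runs serially.

```cpp
#include <cassert>

// Hypothetical check: returns true if every thread executed its share of
// the iterations 0..n-1 in increasing order.
bool iterations_increase_per_thread(int n) {
  assert(n >= 0);
  bool ok = true;
  #pragma omp parallel reduction(&& : ok)
  {
    int last = -1;  // declared inside the parallel region, so private per thread
    // Request monotonic behavior explicitly instead of relying on the default.
    #pragma omp for schedule(monotonic : dynamic, 4)
    for (int i = 0; i < n; ++i) {
      if (i <= last) ok = false;  // an out-of-order iteration was observed
      last = i;
    }
  }
  return ok;
}
```

Code that depends on this ordering with a dynamic or guided schedule would state the modifier explicitly, as in the schedule clause above.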

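The padding described in the AOT specialization constant limitation under Known Issues can be checked on the host with plain C++. This is a minimal sketch that assumes a typical ABI where uint32_t is 4-byte aligned and a compiler that accepts __attribute__((packed)); PackedStruct, padding_bytes, and check_layout are illustrative names that do not appear in the product.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct PaddedStruct {
  uint32_t a;
  char b;
  constexpr PaddedStruct() : a(0), b('a') {}
  constexpr PaddedStruct(uint32_t _a, char _b) : a(_a), b(_b) {}
};

// Same members, with padding removed as in the workaround.
struct __attribute__((packed)) PackedStruct {
  uint32_t a;
  char b;
  constexpr PackedStruct() : a(0), b('a') {}
  constexpr PackedStruct(uint32_t _a, char _b) : a(_a), b(_b) {}
};

// Bytes in T that are not occupied by the members a and b.
template <typename T>
constexpr std::size_t padding_bytes() {
  return sizeof(T) - (sizeof(uint32_t) + sizeof(char));
}

// Checks the sizes quoted in the text; these hold on typical ABIs only.
inline void check_layout() {
  assert(sizeof(PaddedStruct) == 8 && padding_bytes<PaddedStruct>() == 3);
  assert(sizeof(PackedStruct) == 5 && padding_bytes<PackedStruct>() == 0);
}
```

On such an ABI the unpacked struct carries 3 trailing padding bytes, and the packed variant carries none.
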
System Requirements

Additional Documentation

Previous oneAPI Releases

Notices and Disclaimers

Intel technologies may require enabled hardware, software, or service activation.

No product or component can be absolutely secure.

Your costs and results may vary.

© Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Other names and brands may be claimed as the property of others.

No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document.

The products described may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request.

Intel disclaims all express and implied warranties, including without limitation, the implied warranties of merchantability, fitness for a particular purpose, and non-infringement, as well as any warranty arising from a course of performance, course of dealing, or usage in trade.

Product and Performance Information


Performance varies by use, configuration and other factors. Learn more at