Programming Guide

Fast Recompile for FPGA

The Intel® oneAPI DPC++/C++ Compiler supports only ahead-of-time (AoT) compilation for FPGA hardware, which means that an FPGA device image is generated at compile time. Generating the FPGA device image can take hours. If you change only the host code, you can recompile just the host code, reuse the existing FPGA device image, and skip the time-consuming device compilation.
The Intel® oneAPI DPC++/C++ Compiler provides the following mechanisms to separate device code and host code compilation:
  • Passing the -reuse-exe=<exe_name> flag to instruct the compiler to attempt to reuse the existing FPGA device image.
  • Separating the host and device code into separate files. When a code change applies only to host-only files, the FPGA device image is not regenerated.
  • Separating the device code using the compiler option -fsycl-device-code-split.
The following sections explain the -reuse-exe flag and the -fsycl-device-code-split option in detail.

Using the -reuse-exe Flag

If the device code and options affecting the device have not changed since the previous compilation, passing the -reuse-exe=<exe_name> flag instructs the compiler to extract the compiled FPGA hardware image from the existing executable and package it into the new executable, saving the device compilation time.
Sample use:
    # Initial compilation
    dpcpp -fintelfpga -Xshardware <files.cpp> -o out.fpga
The initial compilation generates an FPGA device image, which takes several hours. Suppose you now make some changes to the host code.
    # Subsequent recompilation
    dpcpp <files.cpp> -o out.fpga -reuse-exe=out.fpga -Xshardware -fintelfpga
The command takes one of the following actions:
  • If the out.fpga file does not exist, the -reuse-exe flag is ignored, and the FPGA device image is regenerated. This is always the case the first time you compile a project.
  • If the out.fpga file is found, the compiler verifies that no change affecting the FPGA device code has been made since the last compilation. If no change is detected in the device code, the compiler reuses the existing FPGA device image and recompiles only the host code. The recompilation process takes a few minutes to complete.
  • If the out.fpga file is found, but the compiler cannot prove that the FPGA device code will yield a result identical to the last compilation, a warning is printed, and the FPGA device code is fully recompiled. Because the compiler's checks must be conservative, spurious recompilations can sometimes occur when using the -reuse-exe flag.
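
Because the flag is ignored when the executable does not exist yet, the same command line can serve both the first build and later rebuilds. The following is a minimal sketch of a wrapper built on that idea. The function name fpga_build_cmd and the file name main.cpp are hypothetical and are not part of the compiler. The function only prints the command so that you can inspect it before running it.

```shell
#!/bin/sh
# Hypothetical helper: prints the dpcpp command for an FPGA hardware build
# that attempts to reuse the device image stored in an existing executable.
# Usage: fpga_build_cmd <output_exe> <source>...
fpga_build_cmd() {
    out="$1"
    shift
    # Passing -reuse-exe on the first build is harmless: the flag is
    # ignored when the named executable does not exist yet.
    printf 'dpcpp -fintelfpga -Xshardware %s -o %s -reuse-exe=%s\n' "$*" "$out" "$out"
}

# Print the command for a single-source project (example file name).
fpga_build_cmd out.fpga main.cpp
```

To run the build instead of printing it, the printed line can be piped to a shell, for example fpga_build_cmd out.fpga main.cpp | sh.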

Using the -fsycl-device-code-split[=value] Option

When you use the -fsycl-device-code-split[=value] option, the compiler compiles each split partition as if targeting its own device. This option supports the following modes:
  • auto: This is the default mode and is the same as specifying the -fsycl-device-code-split option without a value. The compiler uses a heuristic to select the best way of splitting device code.
  • off: Creates a single module for all kernels.
  • per_kernel: Creates a separate device code module for each kernel. Each device code module contains a kernel and its dependencies, such as called functions and user variables.
  • per_source: Creates a separate device code module for each source (translation unit). Each device code module contains the kernels from one source and all their dependencies, such as used variables and called functions, including functions marked with the SYCL_EXTERNAL macro in other translation units.
For FPGA, splits must not share device resources, such as memory, with one another. Furthermore, each kernel pipe must have its source and sink within the same split.
For additional information about this option, refer to the fsycl-device-code-split topic in the Intel® oneAPI DPC++/C++ Compiler Developer Guide and Reference.
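
As an illustration of how a split mode might be selected on the command line, the following sketch prints a compile command for one of the four modes listed above and rejects any other value. The function name split_build_cmd and the source file names are hypothetical, and combining the option with the -fintelfpga and -Xshardware flags from the earlier example is an assumption rather than a documented recipe.

```shell
#!/bin/sh
# Hypothetical helper: prints a dpcpp FPGA hardware compile command that uses
# the requested device code split mode. Only the modes listed in this section
# (auto, off, per_kernel, per_source) are accepted.
# Usage: split_build_cmd <mode> <output_exe> <source>...
split_build_cmd() {
    mode="$1"
    out="$2"
    shift 2
    case "$mode" in
        auto|off|per_kernel|per_source) ;;
        *)
            echo "unsupported split mode: $mode" >&2
            return 1
            ;;
    esac
    printf 'dpcpp -fintelfpga -Xshardware -fsycl-device-code-split=%s %s -o %s\n' "$mode" "$*" "$out"
}

# Print the command for a per-source split (example file names).
split_build_cmd per_source out.fpga host.cpp kernels.cpp
```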

Which Mechanism to Use?

Of the mechanisms described above, the -reuse-exe flag is easier to use than separating the host and device code into separate files. The flag also allows you to keep your host and device code as a single source, which is preferred for small programs. For larger and more complex projects, compiling separate files gives you more control over the compiler’s behavior.
However, the -reuse-exe flag has some drawbacks compared to compiling separate files. Consider the following when using the -reuse-exe flag:
  • The compiler must spend time partially recompiling and then analyzing the device code to ensure that it is unchanged. This takes several minutes for larger designs. Compiling separate files does not incur this extra time.
  • You might occasionally encounter a false positive where the compiler incorrectly believes it must recompile your device code. In a single source file, the device and host code are coupled, so certain changes to the host code can change the compiler’s view of the device code. The compiler always behaves conservatively and triggers a full recompilation if it cannot prove that reusing the previous FPGA binary is safe. Compiling separate files eliminates this possibility.
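
For comparison, the separate-files flow might look like the following sketch. It assumes a hypothetical project with host code in host.cpp and device code in kernel.cpp. It also assumes the -fsycl-link=image device link option and the intermediate file names dev_image.a and host.o, none of which this section describes, so check the compiler reference before relying on them. The hypothetical function separate_build_cmds only prints the commands.

```shell
#!/bin/sh
# Hypothetical helper: prints the commands for a separate-files build.
# Assumptions: host code lives in host.cpp, device code in kernel.cpp, and
# -fsycl-link=image produces a reusable device image archive (dev_image.a).
# Usage: separate_build_cmds <host|all>
separate_build_cmds() {
    case "$1" in
        all)
            # Slow step: generate the FPGA device image. Needed only when
            # the device code or device-affecting options change.
            echo "dpcpp -fintelfpga -Xshardware -fsycl-link=image kernel.cpp -o dev_image.a"
            ;;
        host) ;;
        *)
            echo "usage: separate_build_cmds <host|all>" >&2
            return 1
            ;;
    esac
    # Fast steps: compile the host code and link it with the existing image.
    echo "dpcpp -fintelfpga -c host.cpp -o host.o"
    echo "dpcpp -fintelfpga host.o dev_image.a -o out.fpga"
}

# After a host-only change, only the fast steps are printed.
separate_build_cmds host
```

In this arrangement, the decision to regenerate the device image is made by you (or by your build system) rather than by the compiler's conservative check.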
