Types of SYCL* FPGA Compilation
SYCL supports accelerators in general. The Intel® oneAPI DPC++/C++ Compiler implements additional FPGA-specific support to assist FPGA code development. This topic highlights different FPGA compilation flows that the Intel oneAPI Base Toolkit supports.
For a hands-on lesson in the types of FPGA compilation, review the FPGA Compile Sample on GitHub.
The following table summarizes the types of FPGA compilation:
Device Image Type | Time to Compile | Description |
---|---|---|
FPGA Emulator | Seconds | Compiles the FPGA device code to the CPU. Use the Intel® FPGA Emulation Platform for OpenCL™ software to verify your SYCL code’s functional correctness. |
FPGA Optimization Report | Minutes | Partially compiles the FPGA device code for hardware to generate an optimization report that describes the structures generated on the FPGA, identifies performance bottlenecks, and estimates resource utilization. When your compilation targets an FPGA device family or part number, this stage also gives you RTL files for the IP component in your code. You can then use Intel® Quartus® Prime software to integrate your IP components into a larger design. |
FPGA Simulator | Minutes | Compiles the FPGA device code to the CPU. Use the Questa*-Intel® FPGA Edition simulator to debug your code. |
FPGA Hardware Image | Hours | When your compilation targets an FPGA acceleration board, this stage generates the real FPGA bitstream to execute on the target FPGA platform. When your compilation targets an FPGA device family or part number, this stage also gives you RTL files for the IP component in your code. You can then use Intel® Quartus® Prime software to integrate your IP components into a larger design. |
A typical FPGA development workflow is to iterate in the emulation, optimization report, and simulation stages, refining your code using the feedback provided by each stage. Intel® recommends relying on emulation and the FPGA optimization report whenever possible.
An FPGA hardware compile requires installing the Intel® Quartus® Prime software separately. Targeting a board also requires that you install the BSP for the board.
For more information, refer to the Intel® oneAPI Toolkits Installation Guide and Intel® FPGA development flow webpage.
Also, generating RTL code for an IP component requires only the Intel® oneAPI DPC++/C++ Compiler that is part of the Intel® oneAPI Base Toolkit. However, simulating that IP component or integrating it into your hardware design requires installing the Intel® Quartus® Prime Pro Edition software.
FPGA Emulator
The FPGA emulator (Intel® FPGA Emulation Platform for OpenCL™ software) is the fastest method to verify the correctness of your code. It executes the SYCL device code on the CPU. The emulator is similar to the SYCL host device, but unlike the host device, the FPGA emulator device supports FPGA extensions such as FPGA pipes and fpga_reg. For more information, refer to the Pipes Extension and Kernel Variables topics in the Intel oneAPI FPGA Handbook.
The following are some important caveats to remember when using the FPGA emulator:
Performance is not representative.
Never draw inferences about FPGA performance from the FPGA emulator. The FPGA emulator’s timing behavior is not correlated to that of the physical FPGA hardware. For example, an optimization that yields a 100x performance improvement on the FPGA may not impact the emulator performance. The emulator might show an unrelated increase or decrease.
Undefined behavior may differ.
If your code produces different results when compiled for the FPGA emulator versus FPGA hardware, your code most likely exercises undefined behavior. By definition, undefined behavior is not specified by the language specification and might manifest differently on different targets.
For detailed information about emulating your kernels, refer to Emulate Your Kernel in the Intel oneAPI FPGA Handbook.
FPGA Optimization Report
The FPGA Optimization Report is generated in the following compilation stages:
Stages | Description | Optimization Report Information |
---|---|---|
FPGA Optimization Report image (Compilation takes minutes to complete) | The SYCL device code is optimized and converted into an FPGA design specified in the Verilog Register-Transfer Level (RTL) (a low-level, native entry language for FPGAs). The intermediate compilation result is the FPGA early device image, which is not an executable. The optimization report generated at this stage is static in nature. | Contains important information about how the compiler has transformed your SYCL device code into an FPGA design. For information about the FPGA optimization report, refer to Review the FPGA Optimization Report in the Intel® oneAPI FPGA Handbook. |
FPGA hardware image (Compilation takes hours to complete) | The Verilog RTL specifying the design’s circuit topology is mapped onto the FPGA’s primitive hardware resources by the Intel® Quartus® Prime Pro Edition software. The result is an FPGA hardware binary (also referred to as a bitstream). | Contains precise information about resource utilization and fMAX numbers. For detailed information about how to analyze reports, refer to Analyze your Design in the Intel® oneAPI FPGA Handbook. For information about the FPGA hardware image, refer to the Intel® oneAPI FPGA Handbook. |
When your compilation targets an FPGA device or part number, this stage gives you RTL files for the IP component in your code. You can then use Intel® Quartus® Prime software to integrate your IP components into a larger design.
FPGA Simulator
The simulation flow allows you to use the Questa*-Intel® FPGA Edition simulator software to simulate the exact behavior of the synthesized kernel. As with emulation, you can run simulation on a system that does not have a target FPGA board installed. The simulator models a kernel much more accurately than the emulator, but it is much slower than the emulator.
The simulation flow is cycle-accurate and bit-accurate. It exactly models the behavior of a kernel’s datapath and the results of operations on floating-point data types. However, simulation cannot accurately model variable-latency memories or other external interfaces. Intel recommends that you simulate your design with a small input dataset because simulation is much slower than running on FPGA hardware or in the emulator.
You can use the simulation flow in conjunction with profiling to collect additional information about your design. For more information about profiling, refer to Intel® FPGA Dynamic Profiler for DPC++ in the Intel® oneAPI FPGA Handbook.
For more information about the simulation flow, refer to Evaluate Your Kernel Through Simulation in the Intel® oneAPI FPGA Handbook.
FPGA Hardware
An FPGA hardware compile requires the Intel® Quartus® Prime software (installed separately). This is a full compilation stage through to the FPGA hardware image where you can target one of the following:
Intel® FPGA device family
Specific Intel® FPGA device part number
Custom board with a supported BSP
Intel® Programmable Acceleration Card (PAC) (deprecated)
For more information about the targets, refer to the Intel® oneAPI DPC++/C++ Compiler System Requirements. For more information about using Intel® PAC or custom boards, refer to FPGA BSPs and Boards in the Intel® oneAPI FPGA Handbook and the Intel® oneAPI Toolkits Installation Guide for Linux* OS.