Profiling an FPGA-driven SYCL* Application
Use this recipe to profile an FPGA-driven SYCL application. The recipe features the AOCL Profiler integrated in the CPU/FPGA Interaction (preview) analysis type in Intel® VTune™ Profiler.
Ingredients
Here are the minimum hardware and software requirements for this performance recipe.
- Application: crr. This sample FPGA design is available in the repository for Intel® oneAPI DPC++ Compiler samples.
- Compiler: To profile a SYCL application, you need the dpcpp compiler that is available with Intel® oneAPI toolkits.
- Tools:
  - Intel® VTune™ Profiler, CPU/FPGA Interaction (preview) analysis
  - Starting with the 2020 release, Intel® VTune™ Amplifier has been renamed to Intel® VTune™ Profiler.
  - Most recipes in the Intel® VTune™ Profiler Performance Analysis Cookbook are flexible. You can apply them to different versions of Intel® VTune™ Profiler. In some cases, minor adjustments may be required.
  - Get the latest version of Intel® VTune™ Profiler:
    - From the Intel® VTune™ Profiler product page.
    - Download the latest standalone package from the Intel® oneAPI standalone components page.
- Operating system: Linux* OS (Ubuntu* 18.04)
- CPU: Intel server platform code-named Cascade Lake
- FPGA: Intel® Programmable Acceleration Card (Intel® PAC) with Intel® Arria® 10 GX FPGA or Intel® Stratix 10 GX FPGA PAC board for SYCL (with installable add-on)
Install and Configure the Toolkit
- Plug the Intel PAC card into the PCIe slot on the machine.
- Download and install Intel® oneAPI Base Toolkit for Linux. You can use either the online or the offline installer. Accept all default options.
- Unzip the FPGA add-on package and run setup.sh. Select all default options.
- Set up the oneAPI environment:
  source <oneAPI-install-dir>/setvars.sh
- Install the FPGA board:
  aocl install
- Run the diagnose command to ensure that all diagnostics pass:
  aocl diagnose
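Before moving on, it can help to confirm that the environment set up above actually exposes the tools this recipe uses. The following is a minimal sketch, not part of the toolkit: `missing_tools` is a hypothetical helper name, and the list of tools checked (aocl, dpcpp, cmake, make) is an assumption based on the commands that appear in this recipe.

```shell
#!/bin/sh
# Print the name of every command in the argument list that is not on PATH.
# Prints nothing when all commands are found.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# Tools this recipe relies on after sourcing setvars.sh (assumed list):
# aocl (board install and diagnose), dpcpp (compiler), cmake and make (build).
missing=$(missing_tools aocl dpcpp cmake make)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Not found on PATH:" $missing
else
    echo "All required tools are on PATH."
fi
```

If a tool is reported as missing, sourcing setvars.sh again in the current shell is usually the first thing to try.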
Build the Sample Application
- Download code samples from the repository for Intel oneAPI DPC++ Compiler samples:
  git clone https://github.com/intel/BaseKit-code-samples.git
- Open the crr sample folder:
  cd BaseKit-code-samples/FPGAExampleDesigns/crr
- Open the src/CMakeLists.txt file.
- Locate the line of code that lists hardware flags. It should start with set(HARDWARE_LINK_FLAGS.
- Add -Xsprofile to the set of flags.
- Go back to the main directory for the sample. Create a new folder called build and open it:
  mkdir build
  cd build
- Compile the sample:
  cmake ..
  make fpga
  This process can take several hours. Once it has finished, you should have an executable file called crr.fpga.
You can now run crr.fpga on FPGA hardware.
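The manual edit of src/CMakeLists.txt described above can also be scripted. The sketch below is one possible approach, not the sample's own tooling: `add_profile_flag` is a hypothetical helper, and it assumes the hardware flags are written as a single double-quoted string on the line that starts with set(HARDWARE_LINK_FLAGS. Check the file before and after running it, since the actual layout in your copy of the sample may differ.

```shell
#!/bin/sh
# Add -Xsprofile to the HARDWARE_LINK_FLAGS line of a CMakeLists.txt file.
# Assumption: the flags are one double-quoted string on that line.
# Lines that already contain -Xsprofile are left unchanged.
# A backup of the original file is kept as <file>.bak.
add_profile_flag() {
    sed -i.bak \
        '/^[[:space:]]*set(HARDWARE_LINK_FLAGS/{/-Xsprofile/!s/"/"-Xsprofile /;}' \
        "$1"
}

# Apply the change only when run from the crr sample directory.
if [ -f src/CMakeLists.txt ]; then
    add_profile_flag src/CMakeLists.txt
    # Show the resulting line so the change can be verified by eye.
    grep 'HARDWARE_LINK_FLAGS' src/CMakeLists.txt
fi
```

Running the helper twice is harmless, because the substitution is skipped once the flag is present.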
Run CPU/FPGA Interaction Analysis
- Open Intel® VTune™ Profiler and click New Project on the Welcome screen. The Create a Project dialog box opens.
- Specify a project name and a location for your project, and click Create Project. The Configure Analysis window opens.
- In the WHERE pane, select Local Host.
- In the WHAT pane, select Launch Application as the target.
- In the Application field, specify the path to the crr.fpga executable.
- In the Application parameters field, enter ordered_inputs.csv.
- In the HOW pane, select CPU/FPGA Interaction (preview) from the Platform Analysis group.
- In the analysis settings, select AOCL Profiler for the FPGA profiling data source.
- Click Start at the bottom to run the analysis.
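A failed launch is easier to diagnose before the collection starts than after. As a rough sketch, the check below verifies that the values entered in the Application and Application parameters fields point at usable files. `check_target` is a hypothetical helper, and the paths in the final line assume that both crr.fpga and ordered_inputs.csv are reachable from the current directory; adjust them to match your build.

```shell
#!/bin/sh
# Check that the analysis target can be launched with its input file.
# Usage: check_target <executable> <input-file>
# Prints "ok" and returns 0 when the executable is a regular file with
# execute permission and the input file is readable; otherwise prints
# the problem and returns 1.
check_target() {
    if [ ! -f "$1" ] || [ ! -x "$1" ]; then
        echo "not executable: $1"
        return 1
    fi
    if [ ! -r "$2" ]; then
        echo "not readable: $2"
        return 1
    fi
    echo "ok"
}

# Assumed locations; change them to the paths used in the project settings.
check_target ./crr.fpga ordered_inputs.csv || true
```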
Analyze Results
Once data collection completes, you can see the finalized results in the CPU/FPGA Interaction viewpoint. Start with the Summary window to view these details:
- FPGA top compute tasks
- Top tasks and hotspots for the CPU

Switch to the Bottom-up window to see detailed information at the kernel level, including:
- Stalls
- Occupancy
- Data transfer size
- Average bandwidth for transferred data

Use the timeline view to see these details about kernel instances:
- Start/end times
- Stalls over time
- Occupancy
- Bandwidth metrics

In the Bottom-up window, right-click a kernel and select View Source from the context menu. This opens the Source View, where you can see metrics for specific kernel source lines.
