• 2021
  • 11/09/2021
  • Public Content

Profiling an FPGA-driven DPC++ Application

Use this recipe to profile an FPGA-driven DPC++ (Data Parallel C++) application. The recipe features the AOCL Profiler integrated in the CPU/FPGA Interaction (preview) analysis type in Intel® VTune™ Profiler.


Here are the minimum hardware and software requirements for this performance recipe.

Install and Configure the Toolkit

  1. Plug the Intel PAC card into the PCIe slot on the machine.
  2. Download and install Intel® oneAPI Base Toolkit for Linux, using either the online or the offline installer. Select all default options.
  3. Unzip the FPGA add-on package and run
    . Select all default options.
  4. Set up the oneAPI environment.
    source <oneAPI-install-dir>/
  5. Install the FPGA board.
    aocl install
  6. Run the diagnose command to ensure that all diagnostics pass.
    aocl diagnose
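
The diagnostic check in the last step can be scripted. The following is a minimal sketch, not part of the original recipe. It assumes that a passing run prints the marker DIAGNOSTIC_PASSED; confirm this against the output of your own installation. The function names diagnose_passed and check_board are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: decide whether a diagnose run passed by scanning its text output.
# ASSUMPTION: a passing run prints the marker DIAGNOSTIC_PASSED; verify this
# against the output of your own aocl installation.

# Reads diagnose output on stdin; returns 0 if the pass marker is present.
diagnose_passed() {
    grep -q 'DIAGNOSTIC_PASSED'
}

# Hypothetical wrapper: run the real command and report the outcome.
check_board() {
    if aocl diagnose 2>&1 | diagnose_passed; then
        echo "board diagnostics passed"
    else
        echo "board diagnostics did not report a pass" >&2
        return 1
    fi
}
```

After sourcing the oneAPI environment, call check_board before starting the long hardware compile so that a board problem surfaces early.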

Build the Sample Application

  1. Download code samples from the repository for Intel oneAPI DPC++ Compiler samples.
    git clone
  2. Open the
    sample folder.
    cd BaseKit-code-samples/FPGAExampleDesigns/crr
  3. Open the
  4. Locate the line of code that lists hardware flags. It should start with
  5. Add
    to the set of flags.
  6. Go back to the main directory for the sample. Create a new folder called build and open it.
    mkdir build
    cd build
  7. Compile the sample.
    cmake ..
    make fpga
    This process can take several hours. Once it has finished, you should have an executable file called
You can now run
on FPGA hardware.
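
Because the hardware compile can run for hours, it can help to wrap the build commands in a function and preview them first. This is a sketch under stated assumptions: DRY_RUN and the helper names run and build_fpga are hypothetical additions for illustration, and mkdir -p is used in place of mkdir so that the function can be re-run when the build folder already exists.

```shell
#!/usr/bin/env bash
# Sketch of the out-of-tree build steps from the recipe as a shell function.
# DRY_RUN is a hypothetical switch: when set to 1, each command is printed
# instead of executed.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Run from the main directory of the sample.
build_fpga() {
    run mkdir -p build &&
    run cd build &&
    run cmake .. &&
    run make fpga
}
```

Running DRY_RUN=1 build_fpga lists the four commands without touching the file system. Running build_fpga on its own executes them in order and stops at the first failure.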

Run CPU/FPGA Interaction Analysis

  1. Launch VTune Profiler and click New Project from the Welcome page. The Create a Project dialog box opens.
  2. Specify a project name and a location for your project, then click Create Project. The Configure Analysis window opens.
  3. In the
    pane, select
    Local Host
  4. In the
    pane, select
    Launch Application
    as the target.
    • In the
      field, specify the path to the
    • In the
      Application parameters
      field, enter
    Set up FPGA analysis
  5. In the
    pane, select
    CPU/FPGA Interaction (preview)
    from the
    Platform Analysis
  6. In the analysis settings, select AOCL Profiler for the FPGA profiling data source.
  7. Click
    at the bottom to run the analysis.
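
The same collection can likely be started from the command line instead of the GUI. The sketch below only prints a candidate command and does not run it. It assumes that the command-line name of this analysis type is fpga-interaction; check the list shown by vtune -help collect for your VTune Profiler version before relying on it. The helper name vtune_fpga_cmd is hypothetical, and the sketch does not set the FPGA profiling data source, which may need an additional knob.

```shell
#!/usr/bin/env bash
# Sketch: print (do not run) a vtune command line for a given target.
# ASSUMPTION: the analysis type is named fpga-interaction on the command
# line; confirm with `vtune -help collect` for your installed version.

vtune_fpga_cmd() {
    if [ "$#" -lt 1 ]; then
        echo "usage: vtune_fpga_cmd <application> [parameters...]" >&2
        return 2
    fi
    printf 'vtune -collect fpga-interaction --'
    printf ' %s' "$@"   # appends the application and each parameter
    printf '\n'
}
```

Pass the path to the compiled executable and its application parameters as arguments, review the printed line, and then run it in a shell where the oneAPI environment is set up.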

Analyze Results

Once data collection completes, the finalized results appear in the CPU/FPGA Interaction viewpoint. Start with the
window to view these details:
  • FPGA top compute tasks
  • Top tasks and hotspots for the CPU
Result summary for CPU/FPGA Interaction
Switch to the
window to see detailed information at the kernel level including:
  • Stalls
  • Occupancy
  • Data transfer size
  • Average bandwidth for transferred data
Bottom-up window
Use the timeline view to see these details about kernel instances:
  • Start/end times
  • Overtime stalls
  • Occupancy
  • Bandwidth metrics
Timeline view in CPU/FPGA Interaction
In the
window, right-click a kernel and select View Source from the context menu. This opens the Source View, where you can see metrics for specific kernel source lines.
Source View
