Model GPU Application Performance for a Different GPU Device
This recipe illustrates how to estimate application performance when moving from one Intel® graphics processing unit (GPU) architecture to another by running the Offload Modeling perspective of Intel® Advisor.
Performance estimation plays an important role in planning for future-generation GPU architectures. For this purpose, GPU-to-GPU modeling is more accurate than CPU-to-GPU modeling because of inherent differences between CPU and GPU execution flows.
In this recipe, use Intel Advisor to analyze the performance of a SYCL application with the GPU-to-GPU modeling flow of the Offload Modeling perspective and estimate the profitability of offloading the application to Intel® Iris® Xe MAX graphics (the gen12_dg1 configuration).
Ingredients
This section lists the hardware and software used to produce the specific result shown in this recipe:
- Performance analysis tool: Intel Advisor 2021. Available for download as a standalone installer and as part of the Intel® oneAPI Base Toolkit.
- Application: SYCL implementation of the Mandelbrot sample application, which is part of oneAPI samples
- Compiler: Intel® oneAPI DPC++/C++ Compiler 2021. Available for download as part of the Intel® oneAPI Base Toolkit.
- Operating system: Ubuntu* 20.04
- Baseline GPU: Intel® Iris® Plus Graphics 655
You can download a precollected Offload Modeling report for the Mandelbrot application to follow this recipe and examine the analysis results.
Prerequisites
- Set up environment variables for oneAPI tools: source <oneapi-install-dir>/setvars.sh
- Configure your system to analyze GPU kernels.
- Build the SYCL version of the Mandelbrot application: cd mandelbrot/ && mkdir build && cd build && cmake .. && make
Run GPU-to-GPU Performance Modeling
You can run the GPU-to-GPU modeling using the Intel Advisor command line interface (CLI), Python* scripts, or the Intel Advisor graphical user interface (GUI).
In this section, use the special command line collection preset for the Offload Modeling perspective with the --gpu option to run all perspective analyses for the GPU-to-GPU modeling with a single command:
advisor --collect=offload --project-dir=./mandelbrot-advisor --gpu --config=gen12_dg1 -- ./mandelbrot
You can change the target GPU for modeling by passing a different value to the --config option. See config for details and a full list of options.
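For example, you could model the same application against a different target configuration by changing only the --config value. Note that the set of available configuration names depends on your Intel Advisor version; gen12_tgl below is an assumed example.

```shell
# Hypothetical example: model against a different target configuration.
# The configuration name (gen12_tgl) is an assumption; check your Advisor
# version for the supported values.
advisor --collect=offload --project-dir=./mandelbrot-advisor-tgl \
        --gpu --config=gen12_tgl -- ./mandelbrot
```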
This command runs the perspective with the default medium accuracy and runs the following analyses one by one:
- Survey analysis to collect baseline performance data
- Characterization analysis to collect trip counts and FLOP and model data transfers
- Performance Modeling from the baseline Intel® Iris® Plus Graphics 655 device to the target Intel® Iris® Xe MAX Graphics
Important: The command line collection preset does not support MPI applications. You need to run the analyses separately to analyze an MPI application.
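Running the analyses separately can be sketched as follows. This is a sketch based on the standard Intel Advisor CLI analysis types the preset bundles; verify the exact option set for your Advisor version.

```shell
# Sketch: the three analyses the offload preset bundles, run one at a time.
# Option names follow the Intel Advisor CLI; verify them for your version.
advisor --collect=survey --project-dir=./mandelbrot-advisor --profile-gpu -- ./mandelbrot
advisor --collect=tripcounts --project-dir=./mandelbrot-advisor --flop --profile-gpu -- ./mandelbrot
advisor --collect=projection --project-dir=./mandelbrot-advisor --profile-gpu --config=gen12_dg1
```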
Once the analyses complete, a result summary is printed to the terminal. You can continue to view the results in the Intel Advisor GUI or in an interactive HTML report in your preferred web browser.
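Opening the result for viewing might look like the following sketch (the xdg-open call assumes a Linux desktop; any browser can open the HTML file directly):

```shell
# Open the collected project in the Intel Advisor GUI...
advisor-gui ./mandelbrot-advisor
# ...or open the interactive HTML report in a browser (Linux desktop assumed):
xdg-open ./mandelbrot-advisor/e000/report/advisor-report.html
```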
Examine Performance Speedup on the Target GPU
In this section, examine the HTML report to understand the GPU-to-GPU modeling results. The HTML report is generated automatically after you run the Offload Modeling perspective from the CLI or using the Python scripts and is saved to ./mandelbrot-advisor/e000/report/advisor-report.html. You can open the report in your preferred web browser.
In this interactive HTML report, you can switch between the Offload Modeling and GPU Roofline Insights perspective results using the drop-down in the top left.
In the Summary tab, examine the Top Metrics and Program Metrics panes to understand the performance gain.
- The Top Metrics pane shows an average speedup of 5.311x from offloading one code region of the Mandelbrot application from the baseline Intel® Iris® Plus Graphics 655 GPU device to the target Intel® Iris® Xe MAX Graphics GPU device.
- The Program Metrics pane shows the measured execution time for the current run on the baseline GPU and the estimated time for a run on the target GPU.
You can navigate between the Summary, Accelerated Regions, and Source View tabs to understand details about the offloaded regions and examine useful metrics and the potential performance gain.
The Accelerated Regions tab provides detailed information for the offloaded code regions along with the source code in the bottom pane. In this view, you can examine different useful metrics for offloaded regions of interest. For example, examine the following metrics measured for the kernels running on the baseline GPU: iteration space, thread occupancy, SIMD width, local size, global size.
Examine the following metrics estimated for the target GPU: performance issues, time, speedup, data transfer with reuse.
See Accelerator Metrics for a detailed description and interpretation of these metrics.

Alternative Steps
As alternatives to the collection preset, you can also run the GPU-to-GPU modeling using the Intel Advisor Python* scripts or the Intel Advisor GUI.
Run Intel Advisor Python Scripts (Instead of the Offload Modeling Collection Preset)
Use the special Python scripts delivered with Intel Advisor to run the GPU-to-GPU modeling. These scripts use the Intel Advisor Python API to run the analyses.
For example, run the run_oa.py script with the --gpu option to execute the perspective with a single command:
$ advisor-python $APM/run_oa.py ./mandelbrot-advisor --collect=basic --gpu --config=gen12_dg1 -- ./mandelbrot
You can change the target GPU for modeling by passing a different value to the --config option. See config for a full list of options.
The run_oa.py script runs the following analyses one by one:
- Survey analysis to collect baseline performance data
- Characterization analysis to collect trip counts and FLOP and model data transfers
- Performance Modeling from the baseline Intel® Iris® Plus Graphics 655 device to the target Intel® Iris® Xe MAX Graphics
Important: The command line collection preset does not support MPI applications. Use the Intel Advisor CLI to analyze an MPI application.
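For an MPI application, the typical pattern is to wrap the Advisor collection in the MPI launcher so each rank is profiled. The sketch below follows the standard Intel Advisor CLI pattern; the application name and rank count are placeholders, and you would repeat the approach for the other analysis types.

```shell
# Sketch: per-rank Survey collection for an MPI application.
# ./myapp and the rank count (4) are placeholders for your setup.
mpirun -n 4 advisor --collect=survey --project-dir=./advisor-mpi --profile-gpu -- ./myapp
```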
Once the analyses complete, a result summary is printed to the terminal. You can continue to view the results in the Intel Advisor GUI or in an interactive HTML report in your preferred web browser.
Run Intel Advisor GUI (Instead of the Offload Modeling Collection Preset)
Prerequisite: Create a project for the Mandelbrot application.
To run the GPU-to-GPU modeling from the Intel Advisor GUI:
- From the Perspective Selector window, select the Offload Modeling perspective.
- In the Analysis Workflow pane, select the following:
- Select GPU from the Baseline Device drop-down.
- Select Xe LP Max from the Target Platform Model drop-down.
- Run the perspective.
Once the perspective completes, the GPU-to-GPU offload modeling result is shown in the pane on the right.
Key Take-Aways
With the GPU-to-GPU modeling, you can get more accurate projections of your application performance on next-generation GPUs even before you have the hardware. The metrics collected by Offload Modeling help you understand the performance of the kernels running on the baseline GPU. The interactive HTML report gives a GUI-like experience and lets you switch between the Offload Modeling and GPU Roofline Insights perspective results.