Example: Profile a DPC++ Application on Linux*
Use VTune Profiler with a sample matrix_multiply DPC++ (Data Parallel C++) application to quickly get familiar with the product and the statistics collected for GPU-bound applications.
Prerequisites
- Install VTune Profiler and the Intel® oneAPI DPC++/C++ Compiler from the Intel® oneAPI Base Toolkit or the Intel® System Bring-up Toolkit.
- Set up environment variables by executing the vars.sh script.
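The environment setup above might look like the following sketch. The install prefix /opt/intel/oneapi and the component path vtune/latest/env/vars.sh are assumptions about a default Linux installation, not values taken from this page; adjust them to match your system.

```shell
# Assumed default install prefix; override ONEAPI_ROOT if yours differs.
ONEAPI_ROOT="${ONEAPI_ROOT:-/opt/intel/oneapi}"

# Assumed location of the VTune Profiler environment script.
VARS_SCRIPT="$ONEAPI_ROOT/vtune/latest/env/vars.sh"

# Source the script only if it exists at the assumed location.
if [ -f "$VARS_SCRIPT" ]; then
    . "$VARS_SCRIPT"
else
    echo "vars.sh not found at $VARS_SCRIPT"
fi

# Report whether the profiler and compiler are now on PATH.
for tool in vtune icpx; do
    if command -v "$tool" >/dev/null 2>&1; then
        echo "$tool: found"
    else
        echo "$tool: not found"
    fi
done
```

The script must be sourced rather than executed so that the variables persist in the current shell session.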
Build the Matrix Application
Download the matrix_multiply_vtune code sample package for Intel oneAPI toolkits. This package contains the sample that you can use to build and profile a DPC++ application.
To profile a DPC++ application, make sure to compile the code using the -gline-tables-only and -fdebug-info-for-profiling Intel oneAPI DPC++ Compiler options.
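As a rough illustration of where these two options go on a compiler command line, consider the sketch below. The file names my_kernel.cpp and my_kernel are hypothetical, and the use of the icpx driver with -fsycl is an assumption about a recent oneAPI release (older releases shipped a dpcpp driver instead). The sample's own build files may pass the options differently.

```shell
# Debug options named in the text above.
PROFILING_FLAGS="-gline-tables-only -fdebug-info-for-profiling"

# Hypothetical source and output names, for illustration only.
SRC="my_kernel.cpp"
OUT="my_kernel"

# Assumption: icpx is the compiler driver and -fsycl enables DPC++/SYCL.
COMPILE_CMD="icpx -fsycl $PROFILING_FLAGS $SRC -o $OUT"
echo "$COMPILE_CMD"

# Compile only if the compiler and the source file are actually present.
if command -v icpx >/dev/null 2>&1 && [ -f "$SRC" ]; then
    $COMPILE_CMD
fi
```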
To compile this sample application, do the following:
- Go to the sample directory:
    cd <sample_dir/VtuneProfiler/matrix_multiply>
- The multiply.cpp file in the src folder contains several DPC++ versions of matrix multiplication. Select a version by editing the corresponding #define MULTIPLY line in multiply.h.
- Build the app using the existing Makefile:
    cmake .
    make
  This should generate a matrix.dpcpp executable.
  To delete the program, type:
    make clean
  This removes the executable and object files that were created by the make command.
Run GPU Analysis
Run a GPU analysis on the Matrix sample.
- Launch VTune Profiler with the vtune-gui command.
- Click New Project from the Welcome page.
- Specify a name and location for your sample project and click Create Project.
- In the WHAT pane, browse to the matrix.dpcpp file.
- In the HOW pane, click the Browse button and select GPU Compute/Media Hotspots analysis from the Accelerators group in the Analysis Tree.
- Click the Start button at the bottom to launch the analysis with the pre-selected options.
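The same analysis can also be started from a terminal instead of the GUI. The sketch below assumes the vtune command-line tool is on PATH and that its gpu-hotspots analysis type corresponds to GPU Compute/Media Hotspots. The result directory name r_gpu_hotspots is hypothetical.

```shell
# Executable produced by the build step above.
APP="./matrix.dpcpp"

# Hypothetical result directory name; vtune creates it during collection.
RESULT_DIR="r_gpu_hotspots"

# Assumption: gpu-hotspots is the CLI name of GPU Compute/Media Hotspots.
COLLECT_CMD="vtune -collect gpu-hotspots -result-dir $RESULT_DIR -- $APP"
echo "$COLLECT_CMD"

# Run the collection only if both the profiler and the sample are available.
if command -v vtune >/dev/null 2>&1 && [ -x "$APP" ]; then
    $COLLECT_CMD
fi
```

A command-line run of this kind uses default collection options, much like the pre-selected options in the GUI.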
VTune Profiler collects data and displays analysis results in the GPU Compute/Media Hotspots viewpoint. In the Summary window, see statistics on CPU and GPU resource usage to understand if your application is GPU-bound. Switch to the Graphics window to see basic CPU and GPU metrics representing code execution over time.
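For a text version of the Summary statistics, the command-line tool can print a report from a finished result. This is a minimal sketch that assumes the result was saved in a directory named r_gpu_hotspots (a hypothetical name) and that vtune is on PATH.

```shell
# Hypothetical directory holding a completed GPU analysis result.
RESULT_DIR="r_gpu_hotspots"

# Summary report for the collected result, printed to the terminal.
REPORT_CMD="vtune -report summary -result-dir $RESULT_DIR"
echo "$REPORT_CMD"

# Generate the report only if the profiler and the result directory exist.
if command -v vtune >/dev/null 2>&1 && [ -d "$RESULT_DIR" ]; then
    $REPORT_CMD
fi
```

The time-based view in the Graphics window has no direct equivalent in this text report, so the GUI remains the place to inspect execution over time.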