Run GPU Roofline Insights Perspective from Command Line
To plot a Roofline chart, Intel® Advisor runs two steps:
- Collect OpenCL™ kernel timings and memory data using the Survey analysis with GPU profiling.
- Measure the hardware limitations and collect floating-point and integer operations data using the Characterization analysis with GPU profiling. Intel® Advisor calculates compute operations (FLOP and INTOP) as a weighted sum of the following groups of instructions: BASIC COMPUTE, FMA, BIT, DIV, POW, MATH. Intel Advisor automatically determines the data type of the collected operations using the dst register.
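The weighted sum mentioned above can be written as a formula. The symbols are notation introduced here for illustration: n_g is assumed to be the number of executed instructions in group g, and w_g is the weight of that group. The weight values are not given in this section.

```latex
\[
\mathrm{OPS} \;=\; \sum_{g \in G} w_g \, n_g,
\qquad
G = \{\text{BASIC COMPUTE},\ \text{FMA},\ \text{BIT},\ \text{DIV},\ \text{POW},\ \text{MATH}\}
\]
```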
For convenience, Intel Advisor provides the --collect=roofline shortcut command line action, which runs both the Survey and Characterization analyses with a single command. This shortcut is the recommended way to run the GPU Roofline Insights perspective.
See the Intel Advisor cheat sheet for a quick reference on the command line interface.
Prerequisites
- Configure your system to analyze GPU kernels.
- Set Intel Advisor environment variables with an automated script to enable the advisor command line interface (CLI).
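As a minimal sketch of the second prerequisite, the function below sources the oneAPI environment script and then checks that the advisor command is reachable. It assumes a bash-compatible shell and an installation under /opt/intel/oneapi, which may not match your system. ONEAPI_DIR and setup_advisor_env are illustrative names introduced here, not Intel Advisor settings.

```shell
# Assumption: oneAPI is installed under /opt/intel/oneapi.
# Override ONEAPI_DIR if your installation lives elsewhere.
ONEAPI_DIR="${ONEAPI_DIR:-/opt/intel/oneapi}"

setup_advisor_env() {
    # Load the environment into the current shell if the script exists.
    if [ -f "$ONEAPI_DIR/setvars.sh" ]; then
        . "$ONEAPI_DIR/setvars.sh" >/dev/null
    fi
    # Report whether the advisor CLI is now on PATH.
    if command -v advisor >/dev/null 2>&1; then
        echo "advisor found: $(command -v advisor)"
    else
        echo "advisor not found on PATH" >&2
        return 1
    fi
}
```

Call setup_advisor_env once per shell session before running the commands in this topic.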
Run the GPU Roofline Insights Perspective
There are two methods to run the GPU Roofline analysis. Use one of the following:
- Run the shortcut --collect=roofline command line action to execute the Survey and Characterization analyses for GPU kernels with a single command. This method is recommended to run the GPU Roofline Insights perspective, but it does not support MPI applications.
- Run the Survey and Characterization analyses for GPU kernels separately, one after the other, with the --collect=survey and --collect=tripcounts command line actions. This method is recommended if you want to analyze an MPI application.
Optionally, you can also run the Performance Modeling analysis as part of the GPU Roofline Insights perspective. This analysis models your application performance with the baseline GPU device as the target and compares the result with the actual application performance. Intel Advisor uses this data to suggest additional recommendations for performance optimization.
Note: In the commands below, replace myApplication with your application executable path and name before executing a command. If your application requires additional command line options, add them after the executable name.
Method 1. Run the Shortcut Command
- Collect data for a GPU Roofline chart with a shortcut.
      advisor --collect=roofline --profile-gpu --project-dir=./advi_results -- ./myApplication
  This command collects data both for GPU kernels and CPU loops/functions in your application. For kernels running on GPU, it generates a Memory-Level Roofline.
- Run Performance Modeling for the GPU that the application runs on.
      advisor --collect=projection --profile-gpu --model-baseline-gpu --project-dir=./advi_results
  Make sure to use the --model-baseline-gpu option for Performance Modeling to work correctly.
  This command models the potential performance of your application with the baseline GPU as the target to determine additional optimization recommendations.
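The two steps of Method 1 could be chained in a small wrapper so that Performance Modeling runs only if the Roofline collection succeeds. This is a sketch, not an Intel-provided script. ADVISOR, PROJECT_DIR, and run_gpu_roofline are illustrative names introduced here.

```shell
# Illustrative variables (not Intel Advisor settings); override as needed.
ADVISOR="${ADVISOR:-advisor}"
PROJECT_DIR="${PROJECT_DIR:-./advi_results}"

run_gpu_roofline() {
    # "$@" is the application and its arguments, placed after "--".
    "$ADVISOR" --collect=roofline --profile-gpu \
        --project-dir="$PROJECT_DIR" -- "$@" || return 1
    # Performance Modeling reuses the same project directory.
    "$ADVISOR" --collect=projection --profile-gpu --model-baseline-gpu \
        --project-dir="$PROJECT_DIR"
}
```

For example, run_gpu_roofline ./myApplication runs both commands against ./advi_results.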
Method 2. Run the Analyses Separately
Use this method if you want to analyze an MPI application.
- Run the Survey analysis.
      advisor --collect=survey --profile-gpu --project-dir=./advi_results -- ./myApplication
- Run the Characterization analysis to collect trip counts and FLOP data.
      advisor --collect=tripcounts --flop --profile-gpu --project-dir=./advi_results -- ./myApplication
  These commands collect data both for GPU kernels and CPU loops/functions in your application. For kernels running on GPU, Intel Advisor generates a Memory-Level Roofline.
- Run Performance Modeling for the GPU that the application runs on.
      advisor --collect=projection --profile-gpu --model-baseline-gpu --project-dir=./advi_results
  Make sure to use the --model-baseline-gpu option for Performance Modeling to work correctly.
  This command models the potential performance of your application with the baseline GPU as the target to determine additional optimization recommendations.
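For an MPI application, the Survey and Characterization commands are typically placed after the MPI launcher. The sketch below assumes a launcher named mpirun that accepts -n for the number of ranks, and uses 4 ranks as an arbitrary example. MPI_LAUNCHER, NUM_RANKS, PROJECT_DIR, and run_mpi_gpu_roofline are illustrative names introduced here. Check your MPI implementation and the Intel Advisor MPI documentation for the exact launch syntax and for how results from multiple ranks are stored.

```shell
# Illustrative variables; the launcher name and rank count are assumptions.
MPI_LAUNCHER="${MPI_LAUNCHER:-mpirun}"
NUM_RANKS="${NUM_RANKS:-4}"
PROJECT_DIR="${PROJECT_DIR:-./advi_results}"

run_mpi_gpu_roofline() {
    # Step 1: Survey analysis under the MPI launcher.
    "$MPI_LAUNCHER" -n "$NUM_RANKS" advisor --collect=survey --profile-gpu \
        --project-dir="$PROJECT_DIR" -- "$@" || return 1
    # Step 2: Characterization (trip counts and FLOP) with the same project.
    "$MPI_LAUNCHER" -n "$NUM_RANKS" advisor --collect=tripcounts --flop \
        --profile-gpu --project-dir="$PROJECT_DIR" -- "$@"
}
```

For example, run_mpi_gpu_roofline ./myApplication launches both analyses with the same rank count.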
You can view the results in the Intel Advisor graphical user interface (GUI) or in the CLI, or generate an interactive HTML report. See View the Results below for details.
Analysis Details
The GPU Roofline Insights workflow includes only the Roofline analysis, which sequentially runs the Survey and Characterization (trip counts and FLOP) analyses.
The analysis has a set of additional options that modify its behavior and collect additional performance data.
Consider the following options:
Roofline Options
To run the Roofline analysis, use the following command line action: --collect=roofline.
You can also use the options below with --collect=survey and --collect=tripcounts if you want to run the analyses separately.
Recommended action options:
Options | Description |
---|---|
--profile-gpu | Analyze GPU kernels. This option is required for each command. |
--target-gpu | Select a target GPU adapter to collect profiling data. The adapter configuration should be in the following format: <domain>:<bus>:<device-number>.<function-number>. Only decimal numbers are accepted. Use this option if you have more than one GPU adapter on your system. The default is the latest GPU architecture version found on your system. To see a list of GPU adapters available on your system, run advisor --help target-gpu and see the option description. |
--gpu-sampling-interval=<double> | Set an interval (in milliseconds) between GPU samples. By default, it is set to 1. |
--enable-data-transfer-analysis | Model data transfer between host memory and device memory. Use this option if you want to run the Performance Modeling analysis. |
--track-memory-objects | Attribute memory objects to the analyzed loops that accessed the objects. Use this option if you want to run the Performance Modeling analysis. |
--data-transfer=<level> | Set the level of details for modeling data transfers during Characterization. Use this option if you want to run the Performance Modeling analysis. For the accepted <level> values, see the advisor Command Option Reference. |
See the advisor Command Option Reference for more options.
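Because --target-gpu accepts only decimal numbers, while tools such as lspci print PCI addresses in hexadecimal, a small conversion helper can be useful. The sketch below assumes the address is given in the full domain:bus:device.function form, as printed by lspci -D. bdf_to_decimal is an illustrative name introduced here, not part of Intel Advisor.

```shell
# Convert a hexadecimal PCI address (domain:bus:device.function)
# into the decimal form expected by --target-gpu.
bdf_to_decimal() {
    addr=$1                      # for example 0000:3a:00.0
    domain=${addr%%:*}           # text before the first ':'
    rest=${addr#*:}
    bus=${rest%%:*}              # text between the two ':'
    rest=${rest#*:}
    device=${rest%%.*}           # text before the '.'
    func=${rest#*.}              # text after the '.'
    # The 0x prefix makes printf parse each field as hexadecimal.
    printf '%d:%d:%d.%d\n' "0x$domain" "0x$bus" "0x$device" "0x$func"
}

# Possible usage (the address is a made-up example):
#   advisor --collect=roofline --profile-gpu \
#       --target-gpu="$(bdf_to_decimal 0000:3a:00.0)" \
#       --project-dir=./advi_results -- ./myApplication
```

Compare the converted value with the adapter list shown by advisor --help target-gpu before relying on it.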
Performance Modeling Options
To run the Performance Modeling analysis, use the following command line action: --collect=projection.
The action options in the table below are required when you run the Performance Modeling analysis as part of the GPU Roofline Insights perspective:
Options | Description |
---|---|
--profile-gpu | Analyze GPU kernels. This option is required for each command. |
--enforce-baseline-decomposition | Use the same local size and SIMD width as measured on the baseline. This option is required. |
--model-baseline-gpu | Use the baseline GPU configuration as a target device for modeling. This option is required. It automatically enables the --enforce-baseline-decomposition option, so specifying --model-baseline-gpu alone is sufficient. |
See the advisor Command Option Reference for more options.
Next Steps
Continue to explore GPU Roofline results. For details about the metrics reported, see Accelerator Metrics.