Intel® Advisor User Guide

ID 766448
Date 12/16/2022

Analyze GPU Roofline

Run the GPU Roofline Insights perspective to measure and visualize the actual performance of GPU kernels against hardware-imposed performance ceilings and to determine the main limiting factor. The perspective relies on benchmarks and hardware metric profiling.

Use the Roofline chart to answer the following questions:

  • What is the maximum achievable performance with your current hardware resources?

  • Does your application work optimally on current hardware resources?

  • If not, what are the best candidates for optimization?

  • Is memory bandwidth or compute capacity limiting performance for each optimization candidate?
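
The questions above rest on the roofline model. The following is a minimal sketch, assuming the classic single-roof form with one compute peak and one memory bandwidth; the actual Roofline chart may show several memory and compute roofs. The class `Roofline`, the helper `assess`, and all peak and kernel numbers are hypothetical illustrations, not Intel® Advisor APIs or measured data.

```python
from dataclasses import dataclass


@dataclass
class Roofline:
    """Simplified single-roof model (an assumption, not the tool's own model)."""
    peak_gflops: float  # hypothetical compute ceiling, GFLOP/s
    peak_gbps: float    # hypothetical memory bandwidth ceiling, GB/s

    def ridge_point(self) -> float:
        """Arithmetic intensity (FLOP/byte) at which the two roofs meet."""
        return self.peak_gflops / self.peak_gbps

    def attainable(self, intensity: float) -> float:
        """Maximum achievable GFLOP/s for a kernel with this intensity."""
        return min(self.peak_gflops, intensity * self.peak_gbps)

    def limiting_factor(self, intensity: float) -> str:
        """Left of the ridge point the memory roof applies, otherwise compute."""
        return "memory" if intensity < self.ridge_point() else "compute"


def assess(roof: Roofline, flop: float, bytes_moved: float, seconds: float) -> dict:
    """Place one kernel on the roofline from its totals and elapsed time."""
    intensity = flop / bytes_moved       # FLOP per byte transferred
    achieved = flop / seconds / 1e9      # measured GFLOP/s
    ceiling = roof.attainable(intensity)
    return {
        "intensity": intensity,
        "achieved_gflops": achieved,
        "ceiling_gflops": ceiling,
        "efficiency": achieved / ceiling,  # 1.0 means the kernel sits on the roof
        "bound": roof.limiting_factor(intensity),
    }


if __name__ == "__main__":
    # Hypothetical device and kernel figures, chosen only for illustration.
    roof = Roofline(peak_gflops=1000.0, peak_gbps=100.0)
    report = assess(roof, flop=2e10, bytes_moved=1e10, seconds=0.125)
    for key, value in report.items():
        print(f"{key}: {value}")
```

In this sketch, a kernel far below its ceiling is a candidate for optimization, and the `bound` field indicates whether memory bandwidth or compute capacity is the limit for that kernel.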

Run the GPU Roofline Insights perspective to measure the performance of SYCL, C++/Fortran with OpenMP* pragmas, Intel® oneAPI Level Zero (Level Zero), or OpenCL™ applications that are enabled to run on a GPU.

How It Works

The GPU Roofline Insights perspective includes the following steps:

  1. Collect OpenCL™ kernel timings and memory data using the Survey analysis with GPU profiling.
  2. Measure the hardware limitations and collect floating-point and integer operations data using the Characterization analysis with GPU profiling.

    Intel® Advisor calculates compute op