Explore GPU Roofline Results
View Results in GUI
- Program metrics for all code regions executed on the GPU and loops/functions executed on the CPU, including total execution time, GPU usage effectiveness, and the number of executed operations.
- Preview Roofline charts for CPU and GPU parts of your code. The charts plot an application's achieved performance and arithmetic intensity against the maximum achievable performance for the top three dots and a total dot, which combines all loops/functions (for CPU) and kernels (for GPU). By default, the pane shows the Roofline for the dominant operation data type (INT or FLOAT). You can switch to a different data type using the FLOAT/INT toggle. This pane also reports the number of operations transferred per second, bandwidth for different memory levels, and an instruction mix histogram (for GPU only).
- Top five hotspots on CPU and GPU sorted by elapsed time.
- Performance characteristics that show how well the application uses hardware resources.
- Information about the analyses executed and platforms that the data was collected on.
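The relationship that the Roofline charts plot can be illustrated with a minimal sketch. It assumes the standard Roofline model, where the ceiling at a given arithmetic intensity is the smaller of the peak compute rate and the memory bandwidth multiplied by that intensity. The Kernel class, the function names, and all numbers are hypothetical and chosen for illustration; they are not values or APIs from the tool. Summing elapsed times for the total dot assumes that the kernels run one after another.

```python
from dataclasses import dataclass


@dataclass
class Kernel:
    """Hypothetical per-kernel measurements (not a tool data structure)."""
    name: str
    operations: float   # operations executed (INT or FLOAT)
    bytes_moved: float  # bytes transferred at the chosen memory level
    seconds: float      # elapsed time


def arithmetic_intensity(k: Kernel) -> float:
    """Operations per byte: the x-coordinate of a dot on the chart."""
    return k.operations / k.bytes_moved


def achieved_performance(k: Kernel) -> float:
    """Operations per second: the y-coordinate of a dot on the chart."""
    return k.operations / k.seconds


def roofline_ceiling(intensity: float, peak_ops_per_s: float,
                     bandwidth_bytes_per_s: float) -> float:
    """Maximum achievable performance at a given arithmetic intensity."""
    return min(peak_ops_per_s, intensity * bandwidth_bytes_per_s)


def total_dot(kernels: list[Kernel]) -> Kernel:
    """Combine all kernels into one dot (assumes they run serially)."""
    return Kernel(
        name="total",
        operations=sum(k.operations for k in kernels),
        bytes_moved=sum(k.bytes_moved for k in kernels),
        seconds=sum(k.seconds for k in kernels),
    )


if __name__ == "__main__":
    # Made-up measurements and hardware limits, for illustration only.
    kernels = [
        Kernel("kernel_a", operations=2e9, bytes_moved=1e9, seconds=1.0),
        Kernel("kernel_b", operations=6e9, bytes_moved=1e9, seconds=1.0),
    ]
    peak, bandwidth = 1e12, 1e11
    for k in kernels + [total_dot(kernels)]:
        ai = arithmetic_intensity(k)
        ceiling = roofline_ceiling(ai, peak, bandwidth)
        print(f"{k.name}: AI={ai:.2f} ops/byte, "
              f"achieved={achieved_performance(k):.3g} ops/s, "
              f"ceiling={ceiling:.3g} ops/s")
```

A dot that sits far below its ceiling has headroom for optimization; a dot whose ceiling is set by the bandwidth term rather than the compute peak is limited by that memory level.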
View an Interactive HTML Report
- Interactive HTML report that presents results in a similar way to the GUI and comprises GPU metrics, operations and memory information, a roofline chart, a source view, and grid data. Collect offload modeling data to view results for the Offload Modeling and GPU Roofline Insights perspectives in a single interactive HTML report.
- HTML Roofline report that contains a GPU Roofline chart and enables you to customize your hardware configuration to view how your application executes with given compute and memory parameters.
Save a Read-only Snapshot
- --cache-sources is an option to add application source code to the snapshot.
- --cache-binaries is an option to add application binaries to the snapshot.
- <snapshot-path> is a path and a name for the snapshot. For example, if you specify /tmp/new_snapshot, a snapshot is saved in a tmp directory as new_snapshot.advixeexpz. You can skip this and save the snapshot to the current directory as snapshot.XXX.advixeexpz.
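As a rough sketch, the options above can be assembled into a command line from a script. Only --cache-sources, --cache-binaries, and the snapshot path come from this section. The advisor executable name, the --snapshot action, the --project-dir and --pack options, and the "--" separator before the path are assumptions about the CLI that should be checked against the command line reference for your installed version. The function name and the ./advi_results project directory are hypothetical.

```python
import shlex
from typing import Optional


def build_snapshot_command(project_dir: str,
                           snapshot_path: Optional[str] = None,
                           cache_sources: bool = True,
                           cache_binaries: bool = True) -> list[str]:
    """Build (but do not run) a snapshot command as an argument list."""
    # Assumed base invocation; verify these names against your CLI reference.
    cmd = ["advisor", "--snapshot", f"--project-dir={project_dir}", "--pack"]
    if cache_sources:
        cmd.append("--cache-sources")   # add application source code
    if cache_binaries:
        cmd.append("--cache-binaries")  # add application binaries
    if snapshot_path is not None:
        # Omitting the path leaves the default name in the current directory.
        cmd += ["--", snapshot_path]
    return cmd


if __name__ == "__main__":
    cmd = build_snapshot_command("./advi_results", "/tmp/new_snapshot")
    # Print a shell-quoted form for review before running it yourself.
    print(shlex.join(cmd))
```

Building the command as a list and printing it first keeps the sketch free of side effects. Once the options are confirmed for your installation, the same list can be passed to subprocess.run.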
- Explore the basic performance metrics and identify top hotspots for optimization using the GPU Roofline Summary
- Visualize performance of your kernels against hardware-imposed performance ceilings and explore the relationships between your kernels and different memory levels using the GPU Roofline chart
- Analyze performance and memory metrics for specific kernels, identify headroom for optimization, and get actionable recommendations helping you optimize your application performance using the GPU Details tab
- Compare results of different optimization iterations using the Roofline Compare functionality