CPU / Memory Roofline Insights Perspective from Command Line
- Collect OpenCL™ kernels timings and memory data using the Survey analysis with GPU profiling.
- Measure the hardware limitations and collect floating-point and integer operations data using the Characterization analysis with GPU profiling. Intel® Advisor calculates compute operations (FLOP and INTOP) as a weighted sum of the following groups of instructions: BASIC COMPUTE, FMA, BIT, DIV, POW, MATH. Intel Advisor automatically determines the data type in the collected operations using the dst register.
Plot a CPU Roofline Chart
- Run the shortcut --collect=roofline command line action to execute the Survey and Characterization analyses with a single command. This method is recommended to run the CPU / Memory Roofline Insights perspective, but it does not support MPI applications.
- Run the Survey and Characterization analyses separately, one after the other, with the --collect=survey and --collect=tripcounts command line actions. This method is recommended if you want to analyze an MPI application.
advisor --collect=roofline --project-dir=./advi_results -- ./myApplication
- Run the Survey analysis:
  advisor --collect=survey --project-dir=./advi_results -- ./myApplication
- Run the Characterization analysis to collect trip counts and FLOP data:
  advisor --collect=tripcounts --flops --project-dir=./advi_results -- ./myApplication
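The two-step collection can be wrapped in a small script so that both analyses always write to the same project directory. The following is a minimal sketch, not an official wrapper: `run_step`, `collect_roofline_two_step`, and the `ADVISOR_RUN` switch are hypothetical names introduced for illustration, and the project directory and application path reuse the placeholder values from the commands above. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch: print (or run) the two Advisor commands for a CPU Roofline collection.
# ADVISOR_RUN is a hypothetical switch of this script, not an Advisor option.

PROJECT_DIR="${PROJECT_DIR:-./advi_results}"
APP="${APP:-./myApplication}"

# Print the command; execute it only when ADVISOR_RUN=1.
run_step() {
    echo "$*"
    if [ "${ADVISOR_RUN:-0}" = "1" ]; then
        "$@"
    fi
}

# Survey first, then Characterization with trip counts and FLOP data.
# The second step runs only if the first one succeeds.
collect_roofline_two_step() {
    run_step advisor --collect=survey --project-dir="$PROJECT_DIR" -- "$APP" &&
    run_step advisor --collect=tripcounts --flops --project-dir="$PROJECT_DIR" -- "$APP"
}

collect_roofline_two_step
```

Running the script with `ADVISOR_RUN=1` executes the commands instead of only printing them. Chaining the steps with `&&` keeps the Characterization analysis from running against a project that has no Survey data.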
- Roofline to plot a Roofline chart. This step sequentially runs the Survey and Characterization (trip counts and FLOP) analyses.
- Memory Access Patterns (optional) to identify memory traffic data and memory usage issues.
- Dependencies (optional) to identify loop-carried dependencies that might limit offloading.
Enable advanced collection of call stack data. Use this option to get a CPU Roofline with callstacks.
Model CPU cache behavior on your target application. Use this option to get a Memory-level CPU Roofline that shows data for all memory levels.
Set the cache hierarchy to collect modeling data for CPU cache behavior. Use with
The value should follow the template: [
<cacheline_size>] for each of three cache levels separated with a
Set the cache associativity for modeling CPU cache behavior: 1 | 2 | 4 | 8 (default) | 16. Use with
Set the focus for modeling CPU cache behavior:
utilization. Use with
Select loops for the analysis by loop IDs, source locations, or criteria such as
markup=. This option is required.
Model CPU cache behavior on your target application.
Set the cache line size (in bytes) for modeling CPU cache behavior: 4 | 8 | 16 | 32 | 64 (default) | 128 | 256 | 512 | 1024 | 2048 | 4096 | 8192 | 16384 | 32768 | 65536. Use with
Set the cache set size (in bytes) for modeling CPU cache behavior: 256 | 512 | 1024 | 2048 | 4096 (default) | 8192. Use with
Select loops for the analysis by loop IDs, source locations, criteria such as
markup=. This option is required.
Mark all potential reductions with a specific diagnostic.
View the Results
- Roofline chart that plots an application's achieved performance and arithmetic intensity against the CPU maximum achievable performance
- Additional information about your application in the Advanced View pane under the chart, including source code, detailed code analytics for trip counts and FLOP/INTOP data, optimization recommendations, and compiler diagnostics. Select a dot on the Roofline chart to see details for the selected loop in all tabs of the Advanced View pane.
- Expand the Performance Metrics Summary drop-down to view the summary performance characteristics for your application.
- Double-click a dot on the chart to see roof rulers that point to the exact roofs that bound the dot.
- Hover over a dot to see a detailed tooltip with performance metrics.
- From the filter drop-down list on the chart, select the memory levels to show dots for.
- Double-click a dot on the chart to expand it for other memory levels and see roof rulers.
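To make the idea of roofs that bound a dot concrete, the sketch below applies the standard roofline relation: attainable performance is the smaller of the compute peak and arithmetic intensity multiplied by memory bandwidth. This is an illustration of the general model, not Advisor's internal calculation. The function name `roof_bound` and all numbers are hypothetical.

```shell
#!/bin/sh
# Sketch of the general roofline relation (not Advisor's implementation):
#   attainable = min(peak, arithmetic_intensity * bandwidth)
# Arguments: arithmetic intensity (FLOP/byte), bandwidth (GB/s), peak (GFLOPS).
roof_bound() {
    awk -v ai="$1" -v bw="$2" -v peak="$3" 'BEGIN {
        mem = ai * bw
        if (mem < peak)
            printf "%.1f GFLOPS (memory-bound)\n", mem
        else
            printf "%.1f GFLOPS (compute-bound)\n", peak
    }'
}

# Hypothetical machine: 20 GB/s memory roof, 100 GFLOPS compute roof.
roof_bound 2 20 100   # prints "40.0 GFLOPS (memory-bound)"
roof_bound 8 20 100   # prints "100.0 GFLOPS (compute-bound)"
```

A dot that sits well below the value printed for its arithmetic intensity has headroom under the roof that bounds it.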
- --cache-sources is an option to add application source code to the snapshot.
- --cache-binaries is an option to add application binaries to the snapshot.
- <snapshot-path> is a path and a name for the snapshot. For example, if you specify /tmp/new_snapshot, a snapshot is saved in the tmp directory as new_snapshot.advixeexpz. If you skip this, the snapshot is saved to the current directory as snapshot.XXX.advixeexpz.
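The options above can be combined into one command line. The sketch below only prints such a command. It assumes the `--snapshot` action and the `--pack` option, which this section does not show, so verify them against your Advisor version. The function name `snapshot_command` is hypothetical, and the project directory and snapshot path reuse the example values from this page.

```shell
#!/bin/sh
# Sketch: build a snapshot command from the options described above.
# Assumption: the --snapshot action and --pack option are not shown in this
# section; check them with the Advisor command line help before use.

PROJECT_DIR="${PROJECT_DIR:-./advi_results}"
SNAPSHOT_PATH="${SNAPSHOT_PATH:-/tmp/new_snapshot}"

# Print the command line without running it.
snapshot_command() {
    echo "advisor --snapshot --project-dir=$PROJECT_DIR --pack --cache-sources --cache-binaries -- $SNAPSHOT_PATH"
}

snapshot_command
```

Printing the command first makes it easy to review the snapshot path before any archive is written.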