User Guide

Hotspots View

Identify program units that took the most CPU time. These are recognized as hotspots. The Hotspots viewpoint is available for all analysis results.
Follow these steps to interpret performance data available in the Hotspots viewpoint:

Define a Performance Baseline

Start your analysis in the Summary window. Here you see general information about the execution of your application. Note that the Elapsed time is different from the application CPU time. The Elapsed time is the application time from start to termination. The application CPU time is the sum of the active processor time for all the threads that run the application. It does not include waiting times.
Use the Elapsed time value as a baseline to compare versions before and after optimization. When tuning the application, as you add more threads, the Elapsed time tends to decrease whereas the CPU time may increase.
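The difference between Elapsed time and CPU time can be reproduced outside the profiler. The following is a minimal sketch, not part of the product: the helper names `measure`, `busy`, and `mostly_waiting` are hypothetical, and it uses the Python standard library clocks `time.perf_counter` (wall clock) and `time.process_time` (CPU time of the process, excluding sleep).

```python
import time


def measure(fn, *args):
    """Return (elapsed_seconds, cpu_seconds) for one call to fn(*args)."""
    wall_start = time.perf_counter()   # wall clock: start-to-finish time
    cpu_start = time.process_time()    # CPU time of this process, all threads
    fn(*args)
    cpu = time.process_time() - cpu_start
    elapsed = time.perf_counter() - wall_start
    return elapsed, cpu


def busy(n):
    """CPU-bound work: CPU time should be close to elapsed time."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def mostly_waiting(seconds):
    """Waiting work: elapsed time grows, CPU time stays near zero."""
    time.sleep(seconds)


if __name__ == "__main__":
    for name, fn, arg in (("busy", busy, 2_000_000),
                          ("waiting", mostly_waiting, 0.2)):
        elapsed, cpu = measure(fn, arg)
        print(f"{name}: elapsed={elapsed:.3f}s cpu={cpu:.3f}s")
```

Recording the elapsed value of a fixed workload before and after a change gives a simple baseline comparison of the kind described above.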
If you ran the Hotspots analysis in the hardware event-based sampling mode, the analysis metrics in the Summary window display the Microarchitecture Usage metric. Use this metric to estimate the code efficiency on your hardware platform.
If this metric value is flagged as critical, consider running the Microarchitecture Exploration analysis to dive deeper into hardware metrics.

Identify the Hottest Function

Get a list of the most time-consuming functions in the Top Hotspots section of the Summary window. Click a hotspot function to explore its call flow and other related metrics in the Bottom-up view.
By default, the Bottom-up view presents a sorted display of CPU Time in descending order, starting with the most time-consuming functions. Start optimizing the functions with the largest CPU time.
Expand the CPU Time column to get more details on how effectively the CPU time was used.
[Figure: Hotspots by CPU Utilization Viewpoint: Bottom-up Pane]
Next, focus your tuning efforts on the program units with the largest Poor value. A large Poor value means that your application underutilized the CPU during the execution of these program units. The overall goal of optimization is to achieve the Ideal (green) or OK (orange) CPU utilization state and to reduce the Poor and Over CPU utilization values.

Identify Hot Code Paths

Switch to the Flame Graph window to quickly identify the hottest code paths in your application. Analyze the CPU time spent on each program unit and its related callee functions.
The flame graph plots the stack profile population (sorted alphabetically) on the horizontal axis. The vertical axis shows the stack depth, starting from zero at the bottom. The width of each element indicates the CPU time of the function (and its callees) as a percentage of the total CPU time.
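To make the width rule concrete, here is a minimal sketch of how flame graph frames could be derived from stack samples. It is an illustration only and does not describe how the profiler builds its graph internally: the input format (a list of `(stack, cpu_time)` pairs, with each stack ordered from the outermost caller to the leaf) and the function names `flame_frames` and `frame_widths` are assumptions.

```python
from collections import defaultdict


def flame_frames(samples):
    """Map each stack prefix to its inclusive CPU time.

    samples: list of (stack, cpu_time); stack is a tuple of function
    names from the outermost caller down to the sampled leaf.
    """
    inclusive = defaultdict(float)
    for stack, cpu_time in samples:
        # Every frame on the stack is charged for the sample, so a frame's
        # time covers the function itself plus all of its callees.
        for depth in range(1, len(stack) + 1):
            inclusive[stack[:depth]] += cpu_time
    return dict(inclusive)


def frame_widths(samples):
    """Return (depth, path, width_percent) rows for every frame.

    Depth 0 is the bottom of the graph. Sorting the paths places
    sibling frames in alphabetical order under their parent.
    """
    total = sum(cpu_time for _, cpu_time in samples)
    frames = flame_frames(samples)
    return [(len(path) - 1, path, 100.0 * frames[path] / total)
            for path in sorted(frames)]
```

A frame that is wide near the top of the graph spends its time in its own code, while a wide frame with equally wide children mostly delegates the time to its callees.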

Identify Algorithm Issues

If you identify issues with the calling sequences in your application, you can improve performance by revising the order in which functions are called. Use these methods:
  • Top-down Tree pane: Analyze the Total and Self time data for callers and callees of the hotspot function to understand whether this time can be optimized.
  • Call Stack pane: Identify the highest contributing stack for the program unit(s) selected in the Bottom-up or Top-down Tree panes. Use the navigation buttons to see the different stacks that called the selected program unit(s). The contribution bar shows the contribution of the currently visible stack to the overall time spent by the selected program unit(s). You can also use the drop-down list in the Call Stack pane to view data for different types of stacks.
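The Self and Total time values and the stack contribution described above can be sketched from raw stack samples. This is a simplified illustration under stated assumptions, not the profiler's implementation: the sample format (a list of `(stack, cpu_time)` pairs ordered from caller to leaf) and the names `self_and_total` and `stack_contributions` are hypothetical, and the contribution is computed only over stacks where the selected function is the leaf.

```python
from collections import defaultdict


def self_and_total(samples):
    """Return ({function: self_time}, {function: total_time})."""
    self_time = defaultdict(float)
    total_time = defaultdict(float)
    for stack, cpu_time in samples:
        self_time[stack[-1]] += cpu_time   # the leaf frame owns the sample
        for func in set(stack):            # count a recursive function once
            total_time[func] += cpu_time
    return dict(self_time), dict(total_time)


def stack_contributions(samples, func):
    """Rank the stacks ending in func by their share of its self time."""
    per_stack = defaultdict(float)
    for stack, cpu_time in samples:
        if stack[-1] == func:
            per_stack[stack] += cpu_time
    total = sum(per_stack.values())
    ranked = sorted(per_stack.items(), key=lambda item: item[1], reverse=True)
    # Highest contributing stack first, expressed as a percentage.
    return [(stack, 100.0 * t / total) for stack, t in ranked]
```

A function with a large Total time but a small Self time is mainly a caller of expensive code, so the calling sequence is the place to look. A large Self time points at the body of the function itself.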
Stack data is available by default for the user-mode sampling mode. To have this data for the hardware event-based sampling mode, enable the Collect stacks option in the Hotspots analysis configuration.

Analyze Source

Double-click the hottest function to view its related source code in the Source/Assembly window. Open the code editor directly from Intel® VTune™ Profiler and improve your code (for example, by minimizing the number of calls to the hotspot function).
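One common way to reduce the number of calls to a hotspot function is to cache results for repeated arguments. The sketch below is an illustration only and assumes the hotspot is a pure function of its argument. The functions `expensive` and `expensive_cached` and the `CALLS` counter are hypothetical names, and caching is just one of several possible techniques (hoisting a call out of a loop is another).

```python
from functools import lru_cache

# Hypothetical call counters, used only to show the effect of caching.
CALLS = {"uncached": 0, "cached": 0}


def expensive(x):
    """Stand-in for a hotspot function reported by the profiler."""
    CALLS["uncached"] += 1
    return sum(i * i for i in range(x))


@lru_cache(maxsize=None)
def expensive_cached(x):
    """Same computation, but the body runs once per distinct argument."""
    CALLS["cached"] += 1
    return sum(i * i for i in range(x))


def total_before(values):
    """Original code path: one hotspot call per element."""
    return sum(expensive(v) for v in values)


def total_after(values):
    """Tuned code path: repeated arguments are served from the cache."""
    return sum(expensive_cached(v) for v in values)
```

For an input such as `[1000, 2000, 1000, 2000, 1000]`, the uncached version runs the hotspot body five times and the cached version runs it twice, with the same result. Re-run the analysis after such a change to confirm that the CPU time of the function actually dropped.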

What's Next

If you ran the analysis with the default Show additional performance insights option, the Summary view will include the Insights section that provides additional metrics for your target such as efficiency of the hardware usage and vectorization. This information helps you identify potential next steps for your performance analysis and understand where you could focus your optimization efforts.

Product and Performance Information

Performance varies by use, configuration and other factors. Learn more at www.Intel.com/PerformanceIndex.