Microarchitecture Exploration View
Explore the Intel® VTune™ Profiler Microarchitecture Exploration viewpoint for PMU analysis. The viewpoint is based on the top-down microarchitecture analysis method, which uses key hardware metrics organized by execution categories so that you can easily identify which portion of the pipeline accounts for most of the execution time.
When the Microarchitecture Exploration analysis (formerly known as General Exploration) is complete, VTune Profiler opens the Microarchitecture Exploration viewpoint. The hierarchy of event-based metrics in this viewpoint depends on your hardware architecture. For example, starting with the Intel microarchitecture code name Ivy Bridge, VTune Profiler analyzes execution categories based on the Top-Down Microarchitecture Analysis Method:

The four leaf categories serve as high-level performance metrics in the Microarchitecture Exploration viewpoint.
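As a rough illustration of how pipeline slots can be divided among four top-level categories, the sketch below follows the level-1 formulas commonly published for the Top-Down method on a 4-wide core. The category names (Front-End Bound, Bad Speculation, Retiring, Back-End Bound), the argument names, and the sample counts are assumptions for illustration; the exact events and formulas VTune Profiler uses depend on the processor.

```python
def top_down_level1(clocks, uops_not_delivered, uops_issued,
                    uops_retired_slots, recovery_cycles, width=4):
    """Split pipeline slots into four top-level categories.

    Arguments are raw event counts with illustrative names; they are
    not the exact PMU event names of any particular processor.
    """
    slots = width * clocks  # total issue slots available in the run
    frontend_bound = uops_not_delivered / slots
    bad_speculation = (uops_issued - uops_retired_slots
                       + width * recovery_cycles) / slots
    retiring = uops_retired_slots / slots
    # Whatever is left over is attributed to the back end.
    backend_bound = 1.0 - (frontend_bound + bad_speculation + retiring)
    return {
        "Front-End Bound": frontend_bound,
        "Bad Speculation": bad_speculation,
        "Retiring": retiring,
        "Back-End Bound": backend_bound,
    }


# Hypothetical counts chosen so the arithmetic is easy to follow.
breakdown = top_down_level1(clocks=1000, uops_not_delivered=400,
                            uops_issued=2200, uops_retired_slots=2000,
                            recovery_cycles=50)
for category, fraction in breakdown.items():
    print(f"{category}: {fraction:.1%}")
```

Because Back-End Bound is computed as the remainder, the four fractions always sum to 1 in this sketch, which mirrors the idea that every pipeline slot falls into exactly one category.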
Each metric is an event ratio defined by Intel architects and has its own predefined threshold. VTune Profiler analyzes the ratio value for each aggregated program unit (for example, a function). When this value exceeds the threshold and the program unit accounts for more than 5% of the total collection CPU time, VTune Profiler treats it as a potential performance problem and highlights the value in pink.
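The flagging rule described above can be sketched as a simple predicate. This is a minimal illustration, not VTune Profiler's implementation: the `UnitMetric` type, the function name, and the 0.10 threshold are hypothetical, and real thresholds differ per metric.

```python
from dataclasses import dataclass

HOTSPOT_SHARE = 0.05  # 5% of collection CPU time, as stated in the text


@dataclass
class UnitMetric:
    name: str
    cpu_time: float      # CPU time attributed to this program unit, seconds
    metric_value: float  # event ratio for this unit, e.g. 0.146


def is_flagged(unit, total_cpu_time, threshold):
    """Return True when the value would be highlighted under the stated rule."""
    share = unit.cpu_time / total_cpu_time
    return unit.metric_value > threshold and share > HOTSPOT_SHARE


# Hypothetical data: one heavy function and one negligible one.
hot = UnitMetric("sphere_intersect", cpu_time=3.0, metric_value=0.146)
cold = UnitMetric("helper", cpu_time=0.2, metric_value=0.50)

print(is_flagged(hot, total_cpu_time=10.0, threshold=0.10))   # heavy and over threshold
print(is_flagged(cold, total_cpu_time=10.0, threshold=0.10))  # over threshold but only 2% of CPU time
```

Requiring both conditions keeps a poor ratio in a rarely executed function from drawing attention away from code that actually dominates CPU time.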
- For a detailed tuning methodology behind the Microarchitecture Exploration analysis and some of the complexities associated with this analysis, see .
- For architecture-specific Tuning Guides, visit https://software.intel.com/en-us/articles/processor-specific-performance-analysis-papers.
To interpret the performance data provided during the hardware event-based sampling analysis, you may follow the steps below:
Learn Metrics and Define a Performance Baseline
In the Microarchitecture Exploration viewpoint, click the Summary tab to switch to the Summary window.
The first section displays summary statistics on the overall application execution per hardware-related metric, measured in Pipeline Slots or Clockticks. Metrics are organized by execution categories in a list and are also represented as a µPipe diagram. To view a metric description, mouse over the help icon:


In the example above, mousing over the L1 Bound metric displays the metric description in the tooltip.
A flagged metric value signals a performance issue for the whole application execution. Mouse over the flagged value to read the issue description:

You may use the performance issues identified by VTune Profiler as a baseline for comparing versions before and after optimization. Your primary performance indicator is the Elapsed time value.
Grayed-out metric values indicate that the data collected for this metric is unreliable. This may happen, for example, if the number of samples collected for PMU events is too low. In this case, when you hover over such an unreliable metric value, VTune Profiler displays a message:

You may either ignore this data or rerun the collection with a longer data collection time, a smaller sampling interval, or a larger workload so that more samples are collected.
By default, VTune Profiler collects Microarchitecture Exploration data in the Detailed mode. In this mode, all metric names in the Summary view are hyperlinks. Clicking such a hyperlink opens the Bottom-up window and sorts the data in the grid by the selected metric. The lightweight Summary collection mode is limited to the Summary view statistics.
Identify Hardware Issues
To view hardware issues per program unit, switch to the Bottom-up pane. Each row represents a program unit and the percentage of time used by this unit. Program units that take more than 5% of the CPU time are considered hotspots. By default, VTune Profiler sorts the data in descending order by Clockticks, so the hotspots appear at the top of the list.
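The default ordering of the grid can be approximated as follows. This is a sketch only: it uses each unit's share of Clockticks as a stand-in for its share of CPU time, and the function names other than `sphere_intersect` and all counts are hypothetical.

```python
def rank_by_clockticks(clockticks_by_unit, hotspot_share=0.05):
    """Sort program units by Clockticks (descending) and mark hotspots.

    Returns rows of (name, clockticks, share_of_total, is_hotspot).
    """
    total = sum(clockticks_by_unit.values())
    ordered = sorted(clockticks_by_unit.items(),
                     key=lambda item: item[1], reverse=True)
    return [(name, ticks, ticks / total, ticks / total > hotspot_share)
            for name, ticks in ordered]


# Hypothetical per-function Clockticks.
rows = rank_by_clockticks({
    "func_c": 400,
    "sphere_intersect": 6000,
    "func_b": 3600,
})
for name, ticks, share, is_hotspot in rows:
    marker = "hotspot" if is_hotspot else ""
    print(f"{name:18} {ticks:6} {share:6.1%} {marker}")
```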
Most of the columns in the Bottom-up pane represent a hardware performance metric. VTune Profiler calculates each metric based on a formula provided by Intel architects. Mouse over the column header to read the metric description. By default, metric values are represented as numbers. You can change the representation mode with the Show Data As context menu option.
The right pane displays a context summary for the selected function. Analyze per-function hardware metrics and their visual representation on the µPipe diagram to estimate the contribution of this particular function to the overall performance.
Each metric has a threshold value. If the metric value exceeds the threshold and the program unit is a hotspot, VTune Profiler highlights this value in pink as performance-critical. Mouse over each pink cell to read a description of the issue and the recommended solution (if any).

In the example above, created on the Intel microarchitecture code name Skylake, VTune Profiler identified the sphere_intersect function as one of the biggest hotspots, consuming a large share of CPU time. VTune Profiler detected that the back-end portion of the pipeline caused the stalls. For the back-end, VTune Profiler identified the Memory Bound > L1 Bound issue as the dominant bottleneck: 14.6% of the Clockticks spent in this function were stalls attributed to the L1 data cache. This means that if you focus on this hotspot and optimize it, you can potentially gain up to ~15% speed-up for this function.
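The ~15% figure is an upper bound that assumes every stalled Clocktick could be eliminated. A small worked calculation, with a hypothetical helper name, makes the arithmetic explicit:

```python
def max_speedup(stall_fraction):
    """Best-case speed-up factor if all stalled Clockticks were removed."""
    if not 0.0 <= stall_fraction < 1.0:
        raise ValueError("stall_fraction must be in [0, 1)")
    return 1.0 / (1.0 - stall_fraction)


stall = 0.146                  # 14.6% of the function's Clockticks
remaining = 1.0 - stall        # fraction of Clockticks left in the best case
speedup = max_speedup(stall)   # ratio of old time to best-case new time

print(f"best-case time reduction: {stall:.1%}")
print(f"best-case speed-up factor: {speedup:.3f}x")
```

In practice only part of the stalls can be removed, so the realized gain is usually smaller than this bound, and it applies to the function itself rather than to the whole application.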
VTune Profiler is able to identify the most common types of pipeline bottlenecks. You can drill down into the lower metric levels for more detail. If the deeper levels of the metrics do not display any data, VTune Profiler cannot see a dominant bottleneck at the lower level.
Analyze Source
When you have identified a critical function, double-click it to open the Source/Assembly window and analyze the source code.

The Source/Assembly window displays locator metrics that show which code contributed the most to the issue represented by the metric. For example, if the Back-End Bound metric equals 60% for your function, the source view splits the 60% value across the function's source lines or instructions, so you can identify the source line or instruction that contributes the most to the total 60% Back-End Bound value.
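To illustrate how a function-level value is split across lines, the sketch below distributes a 60% Back-End Bound value over a few source lines and picks the largest contributor. The line numbers and per-line values are hypothetical, and the helper is not part of any VTune Profiler API.

```python
def hottest_line(per_line_metric):
    """Return (line, value, share_of_function_total) for the top contributor."""
    total = sum(per_line_metric.values())
    line, value = max(per_line_metric.items(), key=lambda item: item[1])
    return line, value, value / total


# Hypothetical split of a 60% Back-End Bound value over four source lines.
per_line = {41: 0.05, 42: 0.35, 43: 0.15, 47: 0.05}

line, value, share = hottest_line(per_line)
print(f"function total: {sum(per_line.values()):.0%}")
print(f"line {line} contributes {value:.0%} ({share:.0%} of the function total)")
```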
Use the hotspots navigation toolbar buttons to navigate to the biggest hotspot for each locator metric and identify the code to optimize.
What's Next
- You may view the collected data using the Hotspots viewpoint or run the Hotspots analysis type. Analyzing the source and assembly code for the hotspot function in the Hotspots viewpoint helps identify which instruction contributes most to the poor performance and how much CPU time the hotspot source line takes. Such a code analysis could be useful for the hotspots that do not show any issues in the sub-metrics but do show problems at the upper level of metrics (see the example above).
- Run the comparison analysis to understand the performance gain you obtained after your optimization.
- You may create your custom analysis configuration and monitor events you are interested in.
- For information on processor events, see the Intel Processor Event Reference.
- Explore tuning recipes for hardware issues in the Performance Analysis Cookbook.