User Guide

  • 2021.5
  • 06/28/2021

Introducing Application Performance Snapshot

Intel® VTune™ Profiler Application Performance Snapshot provides a quick view into different aspects of the performance of compute-intensive applications, such as MPI and OpenMP* usage, CPU utilization, memory access efficiency, vectorization, I/O, and memory footprint. Application Performance Snapshot displays key optimization areas and suggests specialized tools for tuning particular performance aspects, such as Intel VTune Profiler and Intel® Advisor. The tool is designed for large MPI workloads and can help analyze scalability issues.
Application Performance Snapshot comes bundled with all installations of Intel VTune Profiler on Linux* OS.
from one of these locations:

What's New

This User's Guide documents Application Performance Snapshot for Linux* OS.
This is a change log for the current and previous product releases:

Application Performance Snapshot 2021.5

  • Introduced a mechanism for Outlier Detection from HTML and CLI reports.
  • Improved the Metric Tooltips with visualization of ranges of average metrics, with their minimum, maximum, and average values.
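
The two items above can be illustrated with a minimal sketch. It computes the minimum, maximum, and average of one metric across ranks and flags ranks that stand out. The sample values, the function names, and the outlier rule (a modified z-score based on the median absolute deviation) are assumptions chosen for illustration; this guide does not describe how Application Performance Snapshot detects outliers.

```python
from statistics import mean, median


def summarize(values):
    """Return (minimum, maximum, average) for a list of per-rank values."""
    return min(values), max(values), mean(values)


def find_outliers(values, threshold=3.5):
    """Return indices whose modified z-score exceeds the threshold.

    The rule uses the median absolute deviation (MAD). It is an
    assumption for this sketch, not the tool's documented method.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # at least half the values equal the median: no usable spread
    return [i for i, v in enumerate(values)
            if 0.6745 * abs(v - med) / mad > threshold]


# Hypothetical MPI time in seconds for ranks 0..7.
mpi_time = [12.1, 11.8, 12.4, 12.0, 30.5, 11.9, 12.2, 12.3]

low, high, average = summarize(mpi_time)
print(f"min={low} max={high} avg={average:.2f}")
print("outlier ranks:", find_outliers(mpi_time))
```

In this made-up data set, only rank 4 is far from the other ranks, so it is the only one reported.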

Application Performance Snapshot 2021.3

  • Enabled collection of PCIe bandwidth usage. Requires the sampling driver to be loaded.
  • Added PCIe Bandwidth metrics.
  • Added ability to filter by node.
  • Added the Node Topology report, which shows the association between ranks, nodes, and PCIe devices.
  • Added a report in the form of a configurable table capable of displaying any collected metric for each rank, node, or device.

Application Performance Snapshot 2020 Update 1

  • Added metrics to explore GPU compute efficiency for Intel Graphics. The metric set includes GPU Time, GPU IPC, GPU Utilization and OpenMP* offload efficiency metrics like offload region overhead and data transfer cost. The application has to be compiled with Intel® C/C++ Compiler (Beta) 2021.1 - Beta 05 available in several Intel® oneAPI Toolkits (Beta), such as the Intel® oneAPI HPC Toolkit (Beta).

Application Performance Snapshot 2020

  • Max and Bandwidth metrics to better estimate the efficiency of DRAM, MCDRAM, Intel® persistent memory and Intel® Omni-Path Architecture usage.
  • Easier diagnostics of MPI communication patterns with the rank-to-rank communication diagram of Application Performance Snapshot shown by message volume or communication time.
  • Full-featured OpenMPI application support.
  • Streamlined vectorization metrics.
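
As a rough illustration of what a rank-to-rank view by message volume aggregates, the sketch below sums the bytes sent between each pair of ranks. The message records and the function name are hypothetical; the sketch does not read any Application Performance Snapshot data or reproduce its diagram.

```python
def volume_matrix(messages, n_ranks):
    """Sum bytes sent from each source rank to each destination rank."""
    matrix = [[0] * n_ranks for _ in range(n_ranks)]
    for src, dst, nbytes in messages:
        matrix[src][dst] += nbytes
    return matrix


# Hypothetical records: (source rank, destination rank, bytes sent).
messages = [
    (0, 1, 1024),
    (0, 1, 2048),
    (1, 2, 512),
    (2, 0, 4096),
]

matrix = volume_matrix(messages, n_ranks=3)
for src, row in enumerate(matrix):
    print(f"rank {src} sent: {row}")
```

Replacing the byte count with a per-message duration would give the same kind of table keyed by communication time instead of volume.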

Application Performance Snapshot 2019 Update 5

  • DRAM Bandwidth information in the Memory Stalls metric now includes Peak and Bound metrics. These metrics describe memory bandwidth use, particularly in applications that execute in phases with varying memory requirements.

Application Performance Snapshot 2019 Update 4

  • Ability to collect internal IDs of communicators provided by Intel MPI. This feature requires both Application Performance Snapshot and Intel MPI to be version 2019 Update 4 or newer.

Application Performance Snapshot 2019 Update 3

  • Ability to generate an HTML-based rank-to-rank communication diagram by message volume to better visualize MPI application communication patterns.

Application Performance Snapshot 2019 Update 2

  • Full-featured OpenMPI* support
  • Improved vectorization efficiency metrics
  • MPI Imbalance time is no longer calculated on the default stat level 1 to minimize collection overhead on that level
  • aps-report: added option to display statistics only for the selected set of MPI functions
  • MPI collector general optimizations

Application Performance Snapshot 2019 Update 1

  • MPI Imbalance collection extended with a mode that enables measuring pure application imbalance. This mode is applicable to MPI implementations that are binary compatible with MPICH. If required, you can switch off the imbalance collection to minimize collection overhead.
  • MPI tracing overhead improvements with a noticeable impact on cases with a large number of ranks.

Application Performance Snapshot 2019

  • Intel® Omni-Path Architecture Interconnect Bandwidth and Packet rate metrics added to explore MPI communication bottlenecks.
  • Added an HTML-based rank-to-rank communication diagram to better visualize MPI application communication patterns.

Application Performance Snapshot 2018 Update 3 and 2019 Beta Update

  • The report utility gained an option that allows the report to be generated in either text (*.txt) or comma-separated (*.csv) format. The CSV format can be useful for report processing automation or export to spreadsheet programs such as Microsoft Excel*.
  • The Rank-to-Rank data transfers report was enriched with an aggregated communication time column.
  • MPI trace file size was reduced through compression and by setting the minimal statistics level as the default. Some reports generated by the report utility are not available at the minimal statistics level. See Controlling Amount of Collected Data for more information.
  • Report generation time with the report utility was significantly improved.
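
To show how a CSV report could feed automation, here is a minimal sketch that uses Python's standard csv module to pick out rows above a limit. The column names and sample rows are hypothetical, since this guide does not list the columns of the generated file; adjust them to match the actual report.

```python
import csv
import io


def rows_above(csv_text, column, limit):
    """Return rows whose numeric value in `column` exceeds `limit`."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if float(row[column]) > limit]


# Hypothetical report contents; a real report's columns may differ.
sample = """Rank,MPI Time (s),Wait Time (s)
0,12.1,3.0
1,30.5,21.7
2,11.9,2.8
"""

for row in rows_above(sample, "MPI Time (s)", 20.0):
    print(row["Rank"], row["MPI Time (s)"])
```

For a file on disk, the same reader can be built from `open(path, newline="")` instead of `io.StringIO`.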

Application Performance Snapshot 2018 Update 2

Application Performance Snapshot 2018 Update 1

  • Removed restrictions for region numbers.

Application Performance Snapshot 2018

  • The tool is now invoked as
    rather than
  • Result directory change from
