Prepare Application for Analysis
Windows* Targets
Linux* Targets
Embedded Linux* Targets
FreeBSD* Targets
QNX* Targets
Managed Code Targets
Android* Targets
Intel® Xeon Phi™ Processor Targets
Targets in Virtualized Environments
Targets in a Cloud Environment
Arbitrary Targets
Embedded System Targets
Build and Install the Sampling Drivers for Linux* Targets
Debug Information for Linux* Application Binaries
Compiler Switches for Performance Analysis on Linux* Targets
Enable Linux* Kernel Analysis
Resolution of Symbol Names for Linux-Loadable Kernel Modules
Analyze Statically Linked Binaries on Linux* Targets
Set Up Remote Linux* Target
User-Mode Sampling and Tracing Collection
Hardware Event-based Sampling Collection
Performance Snapshot
Algorithm Group
Microarchitecture Analysis Group
Parallelism Analysis Group
Input and Output Analysis
Accelerators Analysis Group
Platform Analysis Group
Platform Analysis
Hybrid CPU Analysis
Source Code Analysis
Custom Analysis
Energy Analysis
Code Profiling Scenarios
Control Data Collection
Manage Data Views
Manage Result Files
Switch Viewpoints
Control Window Synchronization
View Stacks
Manage Grid Views
Manage Timeline View
Change Threshold Values
Choose Data Format
Group and Filter Data
View Data on Inline Functions
Analyze Loops
Stitch Stacks for Intel® oneAPI Threading Building Blocks or OpenMP* Analysis
Search for Data
performance-snapshot Command Line Analysis
hotspots Command Line Analysis
anomaly-detection Command Line Analysis
threading Command Line Analysis
memory-consumption Command Line Analysis
hpc-performance Command Line Analysis
uarch-exploration Command Line Analysis
memory-access Command Line Analysis
tsx-exploration Command Line Analysis
tsx-hotspots Command Line Analysis
sgx-hotspots Command Line Analysis
gpu-hotspots Command Line Analysis
gpu-offload Command Line Analysis
npu
graphics-rendering Command Line Analysis
fpga-interaction Command Line Analysis
io Command Line Analysis
system-overview Command Line Analysis
runsa/runss Custom Command Line Analysis
Configure Analysis Options from Command Line
Collect System-Wide Data from Command Line
Collect Data on Remote Linux* Systems from Command Line
Configure GPU Analysis from Command Line
Specify Search Directories from Command Line
Specify Result Directory from Command Line
Pause Collection from Command Line
Manage Analysis Duration from Command Line
Limit Data Collection from Command Line
Option Descriptions and General Rules
allow-multiple-runs
analyze-kvm-guest
analyze-system
app-working-dir
archive
call-stack-mode
collect
collect-with
column
command
cpu-mask
csv-delimiter
cumulative-threshold-percent
custom-collector
data-limit
discard-raw-data
duration
filter
finalization-mode
finalize
format
group-by
help
import
inline-mode
knob
kvm-guest-kallsyms
kvm-guest-modules
limit
loop-mode
mrte-mode
no-follow-child
no-summary
no-unplugged-mode
quiet
report
report-knob
report-output
report-width
result-dir
resume-after
return-app-exitcode
ring-buffer
search-dir
show-as
sort-asc
sort-desc
source-object
source-search-dir
stack-size
start-paused
strategy
target-install-dir
target-system
target-tmp-dir
target-duration-type
target-pid
target-process
time-filter
trace-mpi
user-data-dir
verbose
version
Best Practices: Resolve Intel® VTune™ Profiler BSODs, Crashes, and Hangs in Windows* OS
Error Message: Application Sets Its Own Handler for Signal
Error Message: Cannot Enable Event-Based Sampling Collection
Error Message: Cannot Collect GPU Hardware Metrics
Error Message: Cannot Load Data File
Error Message: Cannot Locate Debugging Information
Error Message: Cannot Open Data
Error Message: Client Is Not Authorized to Connect to Server
Error Message: Root Privileges Required for Processor Graphics Events
Error Message: No Pre-built Driver Exists for This System
Error Message: Not All OpenCL™ API Profiling Callbacks Are Received
Error Message: Problem Accessing the Sampling Driver
Error Message: Required Key Not Available
Error Message: Scope of ptrace System Call Is Limited
Error Message: Stack Size Is Too Small
Error Message: Symbol File Is Not Found
Problem: Analysis of the .NET* Application Fails
Problem: Cannot Access VTune Profiler Documentation
Problem: CPU Time for Hotspots or Threading Analysis Is Too Low
Problem: 'Events = Sample After Value (SAV) * Samples' Is Not True If Multiple Runs Are Disabled
Problem: Guessed Stack Frames
Problem: GUI Hangs or Crashes
Problem: Inaccurate Sum in the Grid
Problem: Information Collected via ITT API Is Not Available When Attaching to a Process
Problem: No GPU Utilization Data Is Collected
Problem: Same Functions Are Compared As Different Instances
Problem: Skipped Stack Frames
Problem: Stack in the Top-Down Tree Window Is Incorrect
Problem: Stacks in Call Stack and Bottom-Up Panes Are Different
Problem: System Functions Appear in the User Functions Only Mode
Problem: Intel® VTune™ Profiler Is Slow to Respond When Collecting or Displaying Data
Problem: Intel® VTune™ Profiler Is Slow on X-Servers with SSH Connection
Problem: Unexpected Paused Time
Problem: {Unknown Timer} in the Platform Power Analysis Viewpoint
Problem: Unknown Critical Error Due to Disabled Loopback Interface
Problem: Unknown Frames
Problem: Unsupported Microsoft* Windows* OS
User Interface Reference
CPU Metrics Reference
Assists
Available Core Time
Average Bandwidth
Average CPU Frequency
Average CPU Usage
Average Frame Time
Average Latency (cycles)
Average Logical Core Utilization
Average Physical Core Utilization
Average Task Time
Back-End Bound
Memory Bandwidth
Contested Accesses (Intra-Tile)
LLC Miss
UTLB Overhead
Port Utilization
Port 0
Port 1
Port 2
Port 3
Port 4
Port 5
Port 6
Port 7
BACLEARS
Bad Speculation (Cancelled Pipeline Slots)
Bad Speculation (Back-End Bound Pipeline Slots)
FP Arithmetic
FP Assists
FP Scalar
FP Vector
FP x87
MS Assists
Branch Mispredict
Bus Lock
Cache Bound
Clears Resteers
Clockticks per Instructions Retired (CPI)
Clockticks Vs. Pipeline Slots Based Metrics
CPI Rate
CPI Rate (Intel Atom® processor)
CPU Time
Core Bound
CPU Frequency
CPU Utilization
CPU Utilization (OpenMP)
Cycles of 0 Ports Utilized
Cycles of 1 Port Utilized
Cycles of 2 Ports Utilized
Cycles of 3+ Ports Utilized
Divider
(Info) DSB Coverage
DTLB Store Overhead
Effective CPU Utilization
Effective Physical Core Utilization
Effective Time
Elapsed Time
Elapsed Time (Global)
Elapsed Time (Total)
Estimated BB Execution Count
Estimated Ideal Time
Execution Stalls
False Sharing
Far Branch
Flags Merge Stalls
FPU Utilization
% of Packed FP Instructions
% of 128-bit Packed Floating Point Instructions
% of 256-bit Packed Floating Point Instructions
% of Packed SIMD Instructions
% of Scalar FP Instructions
% of Scalar SIMD Instructions
FP Arithmetic/Memory Read Instructions Ratio
FP Arithmetic/Memory Write Instructions Ratio
Loop Type
SP FLOPs per Cycle
Vector Capacity Usage
Vector Instruction Set
Front-End Bandwidth
Front-End Bandwidth DSB
Front-End Bandwidth LSD
Front-End Bandwidth MITE
Front-End Bound
Front-End Other
Branch Resteers
DSB Switches
ICache Misses
ITLB Overhead
Length Changing Prefixes
MS Switches
Front-End Latency
General Retirement
Hardware Event Count
Hardware Event Sample Count
ICache Line Fetch
Ideal Time
Imbalance or Serial Spinning
Inactive Sync Wait Count
Inactive Sync Wait Time
Inactive Time
Inactive Wait Count
Inactive Wait Time
Inactive Wait Time with Poor CPU Utilization
Incoming Bandwidth Bound
Incoming Packet Rate Bound
Instruction Starvation
Interrupt Time
I/O Wait Time
IPC
L1 Bound
4K Aliasing
DTLB Overhead
FB Full
Loads Blocked by Store Forwarding
Lock Latency
Split Loads
L1 Hit Rate
L1D Replacement Percentage
L1D Replacements
L1I Stall Cycles
L2 Bound
L2 Hit Bound
L2 Hit Rate
L2 HW Prefetcher Allocations
L2 Input Requests
L2 Miss Bound
L2 Miss Count
L2 Replacement Percentage
L2 Replacements
L3 Bound
Contested Accesses
Data Sharing
L3 Latency
LLC Hit
SQ Full
LLC Load Misses Serviced By Remote DRAM
LLC Miss Count
LLC Replacement Percentage
LLC Replacements
Local DRAM Access Count
Logical Core Utilization
Loop Entry Count
(Info) LSD Coverage
Machine Clears
Max DRAM Single-Package Bandwidth
Max DRAM System Bandwidth
MCDRAM Bandwidth Bound
MCDRAM Cache Bandwidth Bound
MCDRAM Flat Bandwidth Bound
Memory Bound
DRAM Bound
DRAM Bandwidth Bound
UPI Utilization Bound
Memory Latency
Local DRAM
Remote Cache
Remote DRAM
NUMA: % of Remote Accesses
Memory Efficiency
Microarchitecture Usage
Microcode Sequencer
Mispredicts Resteers
MO Machine Clear Overhead
MPI Imbalance
MPI Rank on the Critical Path
MS Entry
MUX Reliability
OpenMP* Analysis. Collection Time
OpenMP Region Time
Other
Outgoing Bandwidth Bound
Outgoing Packet Rate Bound
Overhead Time
Page Walk
Parallel Region Time
Paused Time
Persistent Memory Bound
Pipeline Slots
OpenMP* Potential Gain
Imbalance
Lock Contention
Pre-Decode Wrong
Remote Cache Access Count
Remote DRAM Access Count
Remote / Local DRAM Ratio
Retire Stalls
Retiring
Self Time and Total Time
Serial CPU Time
MPI Busy Wait Time
Serial Time (outside parallel regions)
SIMD Assists
SIMD Compute-to-L1 Access Ratio
SIMD Compute-to-L2 Access Ratio
SIMD Instructions per Cycle
Slow LEA Stalls
SMC Machine Clear
SP GFLOPS
Spin Time
Communication (MPI)
Other (Spin)
Spin and Overhead Time
Atomics
Creation
Other (Overhead)
Reduction
Scheduling
Tasking
Split Stores
Store Bound
Store Latency
Task Time
Thread Concurrency
Thread Oversubscription
Total Iteration Count
[uOps]
VPU Utilization
Wait Count
Wait Rate
Wait Time
GPU Metrics Reference
OpenCL™ Kernel Analysis Metrics Reference
Energy Analysis Metrics Reference
Intel Processor Events Reference
Context Menu: Grid
Context Menus: Call Stack Pane
Context Menus: Project Navigator
Context Menus: Source/Assembly Window
Dialog Box: Binary/Symbol Search
Dialog Box: Source Search
Hot Keys
Menu: Customize Grouping
Menu: Intel VTune Profiler
Pane: Call Stack
Pane: Options - General
Pane: Options - Result Location
Pane: Options - Source/Assembly
Project Navigator
Pane: Timeline
Toolbar: Configure Analysis
Toolbar: Filter
Toolbar: Source/Assembly
Toolbar: Intel VTune Profiler
Window: Bandwidth - Platform Power Analysis
Window: Bottom-up
Window: Caller/Callee
Window: Cannot Find <file type> File
Window: Collection Log
Window: Compare Results
Window: Configure Analysis
Window: Core Wake-ups - Platform Power Analysis
Window: Correlate Metrics - Platform Power Analysis
Window: CPU C/P States - Platform Power Analysis
Window: Debug
Window: Event Count - Hardware Events
Window: Flame Graph
Window: Graphics - GPU Compute/Media Hotspots
Window: Graphics C/P States - Platform Power Analysis
Window: NC Device States - Platform Power Analysis
Window: Platform
Window: Platform Power Analysis
Window: Sample Count - Hardware Events
Window: SC Device States - Platform Power Analysis
Window: Summary
Window: System Sleep States - Platform Power Analysis
Window: Temperature/Thermal Sample - Platform Power Analysis
Window: Timer Resolution - Platform Power Analysis
Window: Top-down Tree
Window: Uncore Event Count - Hardware Events
Window: Wakelocks - Platform Power Analysis
Window: Summary - Input and Output Summary
Window: Summary - Microarchitecture Exploration
Window: Summary - GPU Analysis
Window: Summary - Hardware Events
Window: Summary - Hotspots by CPU Utilization
Window: Summary - HPC Performance Characterization
Window: Summary - Memory Consumption
Window: Summary - Memory Usage
Window: Summary - Platform Power Analysis
ALU0 Active
ALU0 Instructions
ALU1 Active
ALU1 Instructions
ALU2 Active
ALU2 Instructions
ALU0 and ALU1 Active
ALU0 and ALU2 Active
ALU0 and XMX Utilization
Average Time
Computing Threads Started
Computing Threads Started, Threads/sec
CPU Time
EU 2 FPU Pipelines Active
EU Array Active
EU Array Idle
EU Array Stalled/Idle
EU Array Stalled
EU IPC Rate
EU Send Pipeline Active
EU Threads Occupancy
Global
GPU EU Array Usage
GPU Instruction Cache L3 Miss Ratio
GPU L3 Atomics
GPU L3 Bound
GPU L3 Miss Ratio
GPU L3 Misses
GPU L3 Misses, Misses/sec
GPU Load Store Cache Miss Ratio
GPU Load Store Cache L3 Miss Ratio
GPU LSC Atomics
GPU LSC Fences
GPU Media Read Requests
GPU Media Write Requests
GPU SLM Atomics
GPU SLM Fences
GPU Memory Read Bandwidth, GB/sec
GPU Memory Texture Read Bandwidth, GB/sec
GPU Memory Write Bandwidth, GB/sec
GPU Sampler L3 Miss Ratio
GPU Texel Quads Count, Count/sec
GPU Utilization
Graphics Security Controller Busy
Host-to-GPU Memory Read Bandwidth
Host-to-GPU Memory Write Bandwidth
Instance Count
Instruction Cache Miss Ratio
L3 Busy
L3 Input Available
L3 Instruction Cache Bandwidth
L3 Load Store Cache Read Bandwidth
L3 Load Store Cache Write Bandwidth
L3 Miss Ratio
L3 Output Ready
L3 Read Bandwidth
L3 SQ Full
L3 Stalled
L3 Write Bandwidth
L3 Sampler Bandwidth, GB/sec
L3 Shader Bandwidth, GB/sec
LLC Miss Rate due to GPU Lookups
LLC Miss Ratio due to GPU Lookups
LSC Input Available
LSC Output Ready
LSC Partial Writes
Local
Maximum GPU Utilization
Multiple Pipe Utilization
Occupancy
PS EU Active %
PS EU Stall %
Ratio to Max Bandwidth, %
Render/GPGPU Command Streamer Loaded
Sampler Input Available
Sampler Output Ready
Samples Blended
Samples Killed in PS, pixels
Samples Written
Sampler Busy
Sampler Is Bottleneck
Shared Local Memory Read Bandwidth, GB/sec
Shared Local Memory Write Bandwidth, GB/sec
SIMD Width
SLM Bank Conflicts
Stack-to-stack Incoming Bandwidth
Stack-to-stack Outgoing Bandwidth
System Memory Read Bandwidth
System Memory Write Bandwidth
Size
Total, GB/sec
Thread Dispatcher Active
TLB Misses
Total Time
Typed Memory Read Bandwidth, GB/sec
Typed Memory Write Bandwidth, GB/sec
Typed Reads Coalescence
Typed Writes Coalescence
Untyped Memory Read Bandwidth, GB/sec
Untyped Memory Write Bandwidth, GB/sec
Untyped Reads Coalescence
Untyped Writes Coalescence
Video Codec Busy
Video Codec Read Requests
Video Codec Write Requests
Video Codec 2 Busy
Video Codec 2 Read Requests
Video Codec 2 Write Requests
Video Enhancement Busy
Video Enhancement Read Requests
Video Enhancement Write Requests
Video Enhancement 2 Busy
Video Enhancement 2 Read Requests
Video Enhancement 2 Write Requests
VS EU Active
VS EU Stall
XVE Barrier Stall
XVE Bit Manipulation Instructions
XVE Control Stall
XVE Dist or Acc Stall
XVE INT16\INT32\INT64\FP16\FP32\FP64 Instructions
XVE FP16\BF16\INT8\INT4\INT2 XMX Instructions
XVE Instruction Fetch Stall
XVE Pipe Stall
XVE Send Stall
XVE SBID Stall
XVE XMX Instructions
XVE XMX Pipeline Active