Untyped Memory Write Bandwidth, GB/sec
Metric Description
Bandwidth of memory writes to untyped buffers (for example, buffers created with clCreateBuffer).
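As a rough illustration of how to read a GB/sec figure like this one, the sketch below divides the bytes a kernel writes by the time it takes. It is not the profiler's actual formula, which this page does not give. The function name, the kernel sizes, and the decimal convention (1 GB = 10^9 bytes) are all assumptions made for the example.

```python
def write_bandwidth_gb_per_sec(bytes_written: int, elapsed_sec: float) -> float:
    """Return write bandwidth in GB/sec, assuming 1 GB = 10**9 bytes."""
    if elapsed_sec <= 0:
        raise ValueError("elapsed_sec must be positive")
    return bytes_written / elapsed_sec / 1e9


# Hypothetical kernel: each work item writes one 4-byte float to an untyped buffer.
work_items = 256 * 1024 * 1024
bytes_written = work_items * 4   # 1,073,741,824 bytes in total
elapsed_sec = 0.05               # assumed kernel duration in seconds

print(f"{write_bandwidth_gb_per_sec(bytes_written, elapsed_sec):.2f} GB/sec")
```

Comparing an estimate like this against the value reported for a kernel can help judge whether the observed write traffic matches what the code is expected to produce.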
Parent topic: GPU Metrics Reference