Intel® VTune™ Profiler

Find and Fix Performance Bottlenecks Quickly and Realize All the Value of Your Hardware

Performance Analysis for Applications & Systems

Intel® VTune™ Profiler helps you optimize application performance, system performance, and system configuration for AI, HPC, cloud, IoT, media, storage, and more.

CPU, GPU, and NPU: Tune the entire application's performance, not just the accelerated portion.
Multilingual: Profile SYCL*, C, C++, C#, Fortran, OpenCL™ code, Python*, Google Go* programming language, Java*, .NET, assembly, or any combination of these languages.
System or Application: Get coarse-grained system data for an extended period or detailed results mapped to source code.
Power: Optimize performance while avoiding power- and thermal-related throttling.

Download as Part of the Toolkit

Intel VTune Profiler is included in the Intel® oneAPI Base Toolkit, which is a core set of tools and libraries for developing high-performance, data-centric applications across diverse architectures.

Download the Stand-Alone Version

A stand-alone download of Intel VTune Profiler is available. You can download binaries from Intel or choose your preferred repository.

Features

Algorithm Optimization

Locate hot spots, the most time-consuming parts of your code.
Use Flame Graph to visualize hot code paths and the time spent in each function and its callees.

Analyze Hot Code Paths

Analyze Hot Spots
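
A hot spot collection can also be started from the command-line interface. The following is a minimal sketch, assuming the `vtune` command-line tool is installed and on the PATH; it only assembles and prints the command and does not run it. The helper name `build_vtune_command`, the result directory `r001hs`, and the application `./my_app` are hypothetical placeholders, and the set of analysis types shown is an illustrative subset rather than a complete list.

```python
import shlex

# Analysis type names passed to `vtune -collect`. This subset is an
# assumption chosen for illustration; it is not an exhaustive list.
ANALYSIS_TYPES = {"hotspots", "memory-access", "uarch-exploration", "threading"}


def build_vtune_command(analysis, result_dir, app_command):
    """Return the argument list for a VTune command-line collection.

    `build_vtune_command` is a hypothetical helper for this sketch,
    not part of any Intel tool or library.
    """
    if analysis not in ANALYSIS_TYPES:
        raise ValueError(f"unsupported analysis type: {analysis}")
    if not app_command:
        raise ValueError("app_command must name the program to profile")
    # Everything after `--` is the target application and its arguments.
    return ["vtune", "-collect", analysis, "-result-dir", result_dir,
            "--", *app_command]


# `./my_app` and `r001hs` are hypothetical placeholders.
cmd = build_vtune_command("hotspots", "r001hs", ["./my_app", "--size", "1024"])
print(shlex.join(cmd))
```

Returning an argument list rather than a single string keeps the application's own arguments separate from the profiler's options, so the result can be passed to `subprocess.run` without shell quoting issues.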

Microarchitecture and Memory Bottlenecks

Identify the most significant hardware issues that affect the performance of your application with microarchitecture exploration analysis.
Pinpoint memory-access-related issues such as cache misses and high-bandwidth problems.

Code-Tuning Methods for Intel CPU Microarchitecture

Profile a Memory-Bound Application

Accelerators and XPUs

Optimize GPU offload schema and data transfers for SYCL, OpenCL code, Microsoft DirectX*, or OpenMP* offload code. Identify the most time-consuming GPU kernels for further optimization.
Analyze GPU-bound code for performance bottlenecks caused by microarchitectural constraints or inefficient kernel algorithms.
Understand how much data is transferred between a neural processing unit (NPU) and DDR memory and identify the most time-consuming tasks running on the NPU.

Optimize Software for Intel GPUs

Profile OpenMP Offload Code on a GPU

Parallelism

Examine how efficiently the code is threaded. Identify threading issues that impact performance.
Evaluate compute-intensive or throughput HPC applications for efficient CPU use, vectorization, and memory use.

Method for OpenMP Code Analysis

Schedule Overhead in Intel® oneAPI Threading Building Blocks (oneTBB) Applications

Platform and I/O

Locate performance bottlenecks in I/O-intensive applications. Explore how effectively the hardware processes I/O traffic generated by external PCIe* devices or integrated accelerators.
Get a fine-grained overview for short-running workloads with System Overview.

Effective Use of Intel® Data Direct I/O Technology

Multi-Node

Characterize performance aspects of large-scale Message Passing Interface (MPI) and OpenMP workloads.
Identify scalability issues and get recommendations for in-depth analysis.

Profile MPI Applications

What's New in 2025.1

Identify performance bottlenecks of AI workloads that are calling DirectML or Windows* Machine Learning (WinML) APIs.
Understand the overall accelerator performance by seeing GPU and NPU offload bottlenecks in one view.
Pinpoint the most time-consuming code sections and critical code paths for Python 3.12.

For a more complete and up-to-date list, see the release notes.

Get Started

What Customers Are Saying

Case Studies

Specifications

Processor:

Intel® Xeon® processor family (formerly code named Ice Lake)

3rd generation Intel® Xeon® Scalable processor family (or later)

10th generation Intel® Core™ processor (or later)

GPUs:

Intel® UHD Graphics for 11th generation Intel processors or newer

Intel® Iris® Xᵉ graphics

Intel® Arc™ graphics

Intel® Server GPU

Intel® Data Center GPU Flex Series

Intel® Data Center GPU Max Series

Languages:

SYCL

C and C++

Fortran

OpenCL code

Google Go programming language

Java

Python

.NET

Development environments:

Windows*: Microsoft Visual Studio*, Visual Studio Code

Linux*: Eclipse*

Virtual machine support: Kernel-based Virtual Machine (KVM), Hyper-V*, VMware*

Container support: Docker*, Singularity*, LXC, Apache Mesos*

Interface: Desktop or web GUI, command line

For more information, see the system requirements.

Host operating systems:

Windows

Linux

Target operating systems:

Windows

Linux

FreeBSD*

Compilers:

Intel® compilers

Microsoft* compilers

GNU Compiler Collection (GCC)*

Threading analysis:

OpenMP

Intel® oneAPI Threading Building Blocks

Native threads

Distributed environments:

MPI (MPICH-based, Open MPI)

Get Help

Your success is our success. Access these support resources when you need assistance.

Related Tools

Intel® Advisor

This design and analysis tool helps you achieve high application performance through efficient threading, vectorization, memory use, and GPU offload on current and future Intel hardware. It supports C, C++, Fortran, DPC++, OpenMP, and Python.

Offload Advisor: Get your code ready for efficient GPU offload even before you have the hardware
Automated Roofline Analysis: See performance headroom against hardware limitations and get insights for an effective optimization roadmap
Vectorization Advisor: Enable more vector parallelism and get guidance to improve its efficiency
Threading Advisor: Model, tune, and test threading design options
