Intel® VTune™ Profiler

Find and Fix Performance Bottlenecks Quickly and Realize All the Value of Your Hardware

Performance Analysis for Applications & Systems

Intel® VTune™ Profiler helps you optimize application performance, system performance, and system configuration for HPC, cloud, IoT, media, storage, and more.

  • CPU, GPU, and FPGA: Tune the entire application’s performance, not just the accelerated portion.
  • Multilingual: Profile SYCL*, C, C++, C#, Fortran, OpenCL™ code, Python*, Google Go* programming language, Java*, .NET, Assembly, or any combination of languages.
  • System or Application: Get coarse-grained system data for an extended period or detailed results mapped to source code.
  • Power: Optimize performance while avoiding power- and thermal-related throttling.

Download as Part of the Toolkit

Intel VTune Profiler is included in the Intel® oneAPI Base Toolkit, which is a core set of tools and libraries for developing high-performance, data-centric applications across diverse architectures.

Get It Now

Download the Stand-Alone Version

A stand-alone download of Intel VTune Profiler is available. You can download binaries from Intel or choose your preferred repository.

Download

Develop in the Cloud

Build and optimize oneAPI multiarchitecture applications using the latest optimized Intel® oneAPI and AI tools, and test your workloads across Intel® CPUs and GPUs. No hardware installations, software downloads, or configuration necessary. Free for 120 days with extensions possible.

Get Access

Features

Algorithm Optimization

  • Locate hot spots—the most time-consuming parts of your code.
  • Visualize hot code paths and the time spent in each function and its callees with Flame Graph.
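
As a rough illustration of what a hot spot is, the sketch below times two functions with Python's standard `time.perf_counter` and reports the one that consumed the most time. This is not how Intel VTune Profiler collects data; manual timing merely stands in for a profiler here, and the names `cheap_step`, `expensive_step`, `time_call`, and `find_hot_spot` are hypothetical.

```python
import time


def cheap_step(n):
    """Linear-time work: sum of the first n integers."""
    return sum(range(n))


def expensive_step(n):
    """Quadratic-time work: deliberately the hot spot in this sketch."""
    total = 0
    for i in range(n):
        for j in range(n):
            total += i ^ j
    return total


def time_call(func, *args):
    """Return the wall-clock seconds spent in one call to func."""
    start = time.perf_counter()
    func(*args)
    return time.perf_counter() - start


def find_hot_spot(n=300):
    """Time each step and return (name of slowest step, timings dict)."""
    timings = {
        "cheap_step": time_call(cheap_step, n),
        "expensive_step": time_call(expensive_step, n),
    }
    hot_spot = max(timings, key=timings.get)
    return hot_spot, timings


if __name__ == "__main__":
    name, timings = find_hot_spot()
    for step, seconds in timings.items():
        print(f"{step}: {seconds * 1e3:.3f} ms")
    print("hot spot:", name)
```

In a real application the functions are not known in advance, which is why a profiler that attributes time to every function and call path is more practical than hand-placed timers.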

Analyze Hot Code Paths

Analyze Hot Spots

Microarchitecture and Memory Bottlenecks

  • Identify the most significant hardware issues that affect the performance of your application with microarchitecture exploration analysis.
  • Pinpoint memory-access-related issues such as cache misses and high-bandwidth problems.
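
To make the idea of a cache-miss problem concrete, here is a minimal sketch that counts misses in a toy direct-mapped cache for two traversal orders of the same matrix. The cache model, its sizes (`num_lines`, `line_elems`), and all function names are assumptions made for illustration; a real profiler reads hardware performance counters rather than simulating a cache.

```python
def count_misses(addresses, num_lines=16, line_elems=8):
    """Count misses in a toy direct-mapped cache for a sequence of element addresses."""
    tags = [None] * num_lines          # which memory line each cache slot holds
    misses = 0
    for addr in addresses:
        line = addr // line_elems      # memory line containing this element
        slot = line % num_lines        # direct-mapped: one possible slot per line
        if tags[slot] != line:
            misses += 1
            tags[slot] = line          # evict whatever was there before
    return misses


def row_major_order(n):
    """Addresses visited when walking an n x n row-major matrix row by row."""
    return [row * n + col for row in range(n) for col in range(n)]


def column_major_order(n):
    """Addresses visited when walking the same matrix column by column."""
    return [row * n + col for col in range(n) for row in range(n)]


if __name__ == "__main__":
    n = 64
    print("row-by-row misses:      ", count_misses(row_major_order(n)))
    print("column-by-column misses:", count_misses(column_major_order(n)))
```

Both traversals touch exactly the same elements, yet the column-by-column walk misses far more often in this model because consecutive accesses land on different memory lines. This is the kind of access-pattern issue a memory analysis is meant to surface.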

Code-Tuning Methods for Intel® CPU Microarchitecture

Profile a Memory-Bound Application

Accelerators and XPUs

  • Optimize GPU offload schema and data transfers for SYCL, OpenCL code, Microsoft DirectX*, or OpenMP* offload code. Identify the most time-consuming GPU kernels for further optimization.
  • Analyze GPU-bound code for performance bottlenecks caused by microarchitectural constraints or inefficient kernel algorithms.
  • Explore CPU and FPGA interactions, and FPGA use.

Optimize Software for Intel® GPUs

Profile OpenMP Offload Code on a GPU

Parallelism

  • Examine how efficiently the code is threaded. Identify threading issues that impact performance.
  • Evaluate compute-intensive or throughput HPC applications for efficient CPU use, vectorization, and memory use.

Method for OpenMP Code Analysis

Schedule Overhead in Intel® oneAPI Threading Building Blocks Applications

Platform and I/O

  • Locate performance bottlenecks in I/O-intensive applications. Explore how effectively the hardware processes I/O traffic generated by external PCIe* devices or integrated accelerators.
  • See a holistic view of system behavior for long-running workloads with Platform Profiler.
  • Get a fine-grained overview for short-running workloads with System Overview.

Effective Use of Intel® Data Direct I/O Technology

Multi-Node

  • Characterize performance aspects of large-scale message passing interface (MPI) and OpenMP workloads.
  • Identify scalability issues and get recommendations for in-depth analysis.

Profile MPI Applications

What's New in 2023

  • Pinpoint the specific line of code in GPU compute kernels that causes hot spots and stalls for the Intel® Data Center GPU Max Series (formerly code named Ponte Vecchio).
  • Support added for 4th generation Intel® Xeon® Scalable processors (formerly code named Sapphire Rapids), 13th generation Intel® Core™ processors (formerly code named Raptor Lake), and the Intel® Data Center GPU Max Series.

For a more complete and up-to-date list, see the release notes.

Get Started

Download

Get Intel VTune Profiler as a stand-alone tool or as part of the Intel oneAPI Base Toolkit.

Get Intel VTune Profiler Only

Get the Intel oneAPI Base Toolkit

System Requirements

Try It Out

Get started with Intel VTune Profiler and use an introductory code sample to see how it works.

Get Started Guide

Learn Analysis Techniques

Use these learning tools and workflows to understand and analyze performance bottlenecks in your application.

Tutorials and Videos

Intel VTune Profiler Cookbook

Profiling GPUs

GPU Optimization Workflow

Documentation & Code Samples

Documentation

  • Installation Guide (All Operating Systems)
  • User Guide
  • Processor Tuning Guides
  • Release Notes
  • System Requirements

View All Documentation

Code Samples

Get Started with Profiling

Matrix Multiply

Learn how to profile SYCL-compliant code for CPU and GPU using Intel VTune Profiler. The sample contains three implementations of matrix multiplication using different SYCL features.

Application Profiling Tutorials

Analyze Hot Code Paths Using Flame Graphs

Understand how you can use flame graphs to detect hot spots and hot code paths in Java workloads using a sample application.

Profile MPI Applications

Identify imbalances and communications issues in MPI-enabled applications.

GPU Profiling Tutorials

Profile an OpenMP* Offload Application That Runs on a GPU

Build and compile an OpenMP application offloaded onto an Intel GPU. Use Intel VTune Profiler to run analyses with GPU capabilities (HPC performance characterization, GPU offload, and GPU compute and media hot spots) on the OpenMP application, and then examine the results.

Profile a SYCL* Application Running on a GPU

Learn how to use Intel VTune Profiler to run a GPU analysis on the SYCL application and examine the results.

View Intel VTune Profiler Samples

View the Intel VTune Profiler Cookbook

View the oneAPI Samples Catalog

Learn how to access oneAPI code samples in a tool command line or IDE.

Training

Basics

Boost CPU Performance [2:00]

Seven Steps to GPU Application Performance

Analyze Common Performance Bottlenecks: Linux* | Windows*

Profile Heterogeneous Computing Performance [25:33]

Configuration

Profile without Drivers

Profile Docker* Containers

Use Intel VTune Profiler Server with Microsoft Visual Studio* Code and Intel Developer Cloud

Tuning

Profile .NET Core Applications

Profile MPI Applications

Profile OpenMP Applications

Profile SYCL Applications Running on a GPU

How NUMA Affects Your Workloads [58:39]

Profile Your Game Performance

End-to-End Case Studies

How to Profile Application Performance [47:53]

Analyze Hybrid OpenMP and MPI Code

Profile Your Production Java Workloads in the Cloud [1:00:00]

View All Resources

Training & Events Calendar

What Customers Are Saying

"Ensuring the best possible performance of systems for our users is a top priority for us. Intel VTune Amplifier helps us do that with effective workload management."

— Dennis O’Connell, senior director of performance engineering, Verizon*

Optimize Application Performance with Powerful Profiling

"Intel VTune Profiler is an invaluable tool for identifying hotspots when optimizing code. Its user interface is easy to use and informative, quickening the pace of development. Without access to Intel VTune's line-by-line performance counters, we would never have been able to identify the reasons why our mixed-precision code was running slower than our original double-precision code."

— Dr. Perri Needham, postdoctoral researcher, Walker Molecular Dynamic Laboratory

"We recommend using Intel® MPI for best performance, and tools such as VTune Profiler and Advisor to help better understand performance optimizations and how to best migrate your workloads to the cloud."

— Ilias Katsardis, HPC solution lead, Google Cloud*

"Intel’s VTune Profiler [helped us] to analyze code performance and further enhance it to run optimally on our products."

— Won-Chul Bang, PhD, vice president and head of product strategy, Samsung Medison*

"The Application Performance Snapshot feature of Intel VTune Profiler helped us analyze HemeLB running at 96K MPI ranks on SuperMUC-NG of the Leibniz Supercomputing Centre. It was straightforward and effective in its operation and analysis output."

— Dr. Jon McCullough, University College London

"We are always looking for new methods to accelerate workloads in our data center. Our teams used Intel VTune Profiler’s flame graph feature and found it intuitive to use and practical for interpreting performance data. This tool [part of the Intel® oneAPI Base Toolkit] has become essential to optimizing code and workflows, and its ability to work across Intel CPUs and GPUs adds to our productivity and performance optimization efforts."

— Dr. Markus Rampp, head of HPC Applications Division and deputy director, Max Planck Computing & Data Facility

"We rely super heavily on Intel VTune Profiler and some of the other Intel products that are our primary way to understand performance at very large scale."

— Dan Stanzione, executive director, Texas Advanced Computing Center (TACC)

Specifications

Processor:
  • 3rd generation Intel® Xeon® processor family v3 (or later)
  • 4th generation (or later) Intel® Core™ processor
GPUs:
  • Intel® UHD Graphics for 11th generation Intel processors or newer
  • Intel® Iris® Xe graphics
  • Intel® Arc™ graphics
  • Intel® Server GPU
  • Intel® Data Center GPU Flex Series
  • Intel® Data Center GPU Max Series
FPGAs:
  • Intel® Arria® 10 FPGA and Intel® Stratix® FPGA
Languages:
  • SYCL
  • C and C++
  • C#
  • Fortran
  • OpenCL code
  • Google Go programming language
  • Java
  • Python
  • .NET
Development environments:
  • Windows: Microsoft Visual Studio*
  • Linux: Eclipse*
  • Virtual machine support: Kernel-based virtual machine (KVM), Hyper-V*, VMware*
  • Container support: Docker*, Singularity*, LXC, Apache Mesos*
  • Interface: Desktop or web GUI, command line

For more information, see the system requirements.

Host operating systems:
  • Windows
  • Linux
  • macOS*
Target operating systems:
  • Windows
  • Linux
  • FreeBSD*
  • Android*
  • Wind River Linux*
  • Yocto Project*
Compilers:
  • Intel® compilers
  • Microsoft* compilers
  • GNU Compiler Collection (GCC)*
Threading analysis:
  • OpenMP
  • Intel® oneAPI Threading Building Blocks
  • Native threads
Distributed environments:
  • MPI (MPICH-based, OpenMPI)

Get Help

Your success is our success. Access these support resources when you need assistance.

  • Intel VTune Profiler Forum
  • General oneAPI Support

Related Tools

Intel® Advisor

This design and analysis tool helps you achieve high application performance through efficient threading, vectorization, memory use, and GPU offload on current and future Intel hardware. It supports C, C++, Fortran, DPC++, OpenMP, and Python.

  • Offload Advisor: Get your code ready for efficient GPU offload even before you have the hardware
  • Automated Roofline Analysis: See performance headroom against hardware limitations and get insights for an effective optimization roadmap
  • Vectorization Advisor: Enable more vector parallelism and get guidance to improve its efficiency
  • Threading Advisor: Model, tune, and test threading design options
  • Flow Graph Analyzer: Create, visualize, and analyze task and dependency computation graphs
