Top-down Microarchitecture Analysis Method
OpenMP* Code Analysis Method
Custom Data Collection for Performance Analysis (NEW)
Software Optimization for Intel® GPUs (NEW)
Core Utilization in DPDK Apps
PCIe Traffic in DPDK Apps
DPDK Event Device Profiling
Effective Utilization of Intel® Data Direct I/O Technology
Compile a Portable Optimized Binary with the Latest Instruction Set
Profiling High Bandwidth Memory Performance on Intel® Xeon® CPU Max Series (NEW)
Profiling Windows* Applications for Hybrid CPU Platforms (NEW)
Viewing Analysis Results on a Web Browser (NEW)
Profiling Machine Learning Applications (NEW)
Profiling Single-Node Kubernetes* Applications (NEW)
Analyzing Hot Code Paths Using Flame Graphs (NEW)
Improving Hotspot Observability in a C++ Application Using Flame Graphs
Measuring Performance Impact of NUMA in Multi-Processor Systems
Profiling Games built with Unity* (NEW)
Profiling Games built with Unreal Engine* (NEW)
Profiling Java Applications as a Remote User (NEW)
Profiling JavaScript* Code in Node.js*
Analyzing CPU and FPGA (Intel® Arria® 10 GX) Interaction
Profiling a .NET* Core Application
Profiling Applications in Amazon Web Services* (AWS) EC2 Instances
Enabling Performance Profiling in GitLab* CI
Configuring a Hyper-V* Virtual Machine for Hardware-Based Hotspots Analysis
Profiling an Application for Performance Anomalies (NEW)
Profiling an OpenMP* Offload Application running on a GPU
Profiling a SYCL* Application running on a GPU
Profiling an FPGA-driven SYCL* Application
Profiling Hardware Without Intel Sampling Drivers
Profiling MPI Applications
Profiling Docker* Containers
Profiling a Remote Target Through a Proxy Server (NEW)
Profiling in an Apptainer* Container
Profiling Linux*, Android*, and QNX* System Boot Time
Using Intel® VTune™ Profiler Server with Visual Studio Code and Intel® DevCloud for oneAPI (NEW)
Using Intel® VTune™ Profiler Server in HPC Clusters
Using the Command-Line Interface to Analyze the Performance of a SYCL* Application running on a GPU (NEW)
Cache-Related Latency Issues in Segmented Cache Environment
False Sharing
Frequent DRAM Accesses
Poor Port Utilization
Page Faults
Instruction Cache Misses
Inefficient TCP/IP Synchronization
OS Thread Migration
OpenMP* Imbalance and Scheduling Overhead
Processor Cores Underutilization: OpenMP* Serial Time
Scheduling Overhead in an Intel® oneAPI Threading Building Blocks Application
PMDK Application Overhead