collect
Run the specified analysis type and collect data into a
result.
GUI Equivalent
Configure Analysis window > HOW pane
Syntax
-collect <analysis_type>
-c <analysis_type>
Arguments
- analysis_type
- Type of performance analysis. The following analysis types and configurable knobs are supported:
- Identify performance anomalies in frequently recurring intervals of code like loop iterations. Perform fine-grained analysis at the microsecond level.
- -knob ipt-regions-to-load to specify the maximum number (10-5000) of code regions to load for detailed analysis. To load details efficiently, keep this number at or below 1000.
- -knob max-region-duration to specify the maximum duration (0.001-1000 ms) of analysis per code region.
Collection type: user-mode sampling and tracing collection or hardware event-based sampling.
- Identify your most time-consuming source code using one of the available collection modes:
- -knob sampling-mode=sw (formerly Basic Hotspots) to collect hotspots and stack information based on user-mode sampling and tracing, which does not require sampling drivers but incurs higher collection overhead. This mode cannot be used to profile a system; it must either launch an application/process or attach to one.
- -knob sampling-mode=hw (formerly Advanced Hotspots) to sample all processes on the system and identify hotspots.
Collection type: user-mode sampling and tracing collection or hardware event-based sampling. Knobs: enable-characterization-insights, enable-stack-collection, sampling-interval, sampling-mode.
- Analyze how your application is using available logical CPU cores, discover where parallelism is incurring synchronization overhead, find how waits affect your application's performance, and identify potential candidates for parallelization. Collection type: user-mode sampling and tracing collection. Knobs: sampling-interval.
- Analyze memory consumption by your Linux application, its distinct memory objects, and their allocation stacks. Collection type: user-mode sampling and tracing collection. Knobs: mem-object-size-min-thres.
- Identify opportunities to optimize CPU, memory, and FPU utilization for compute-intensive or throughput applications. Collection type: hardware event-based sampling collection. Knobs: enable-stack-collection, collect-memory-bandwidth, sampling-interval, dram-bandwidth-limits.
- uarch-exploration (formerly known as general-exploration)
- Identify and locate the most significant hardware issues that affect the performance of your application. Use this analysis type as a starting point for microarchitecture analysis. Collection type: hardware event-based sampling collection. Knobs: enable-stack-collection, collect-memory-bandwidth, enable-user-tasks.
- Measure a set of metrics to identify memory access related issues (for example, issues specific to NUMA architectures). Collection type: hardware event-based sampling collection. Knobs: sampling-interval, dram-bandwidth-limits, analyze-openmp; Linux only: analyze-mem-objects, mem-object-size-min-thres.
- sgx-hotspots (deprecated)
- Analyze hotspots inside security enclaves for systems with the Intel® Software Guard Extensions (Intel® SGX) feature enabled. Collection type: hardware event-based sampling collection. Knobs: enable-stack-collection, enable-user-tasks.
- tsx-exploration (deprecated)
- Analyze Intel® Transactional Synchronization Extensions (Intel® TSX) usage. Collection type: hardware event-based sampling collection. Knobs: enable-user-tasks, analysis-step.
- tsx-hotspots (deprecated)
- Analyze hotspots inside transactions. Knobs: enable-user-tasks, enable-stack-collection.
- cpugpu-concurrency (deprecated)
- Enable the CPU/GPU Concurrency analysis and explore code execution on the various CPU and GPU cores in your system, correlate CPU and GPU activity, and identify whether your application is GPU or CPU bound. Knobs: sampling-interval, enable-user-tasks, enable-user-sync, enable-gpu-usage, gpu-counters-mode, enable-gpu-runtimes.
- Identify GPU tasks with high GPU utilization and estimate the effectiveness of this utilization. Collection type: hardware event-based sampling collection. Knobs: gpu-sampling-interval, enable-gpu-usage, gpu-counters-mode, enable-gpu-runtimes, enable-stack-collection.
- gpu-profiling (deprecated)
- Analyze GPU kernel execution per code line and identify performance issues caused by memory latency or inefficient kernel algorithms. Collection type: hardware event-based sampling collection. Knobs: gpu-profiling-mode, kernels-to-profile.
- graphics-rendering (preview)
- Analyze the CPU/GPU utilization of your code running on the Xen virtualization platform. Explore GPU usage per GPU engine and GPU hardware metrics that help you understand where performance improvements are possible. If applicable, this analysis also detects OpenGL-ES API calls and displays them on the timeline. Collection type: hardware event-based sampling collection. Knobs: gpu-sampling-interval, gpu-counters-mode.
- Analyze CPU/FPGA interaction issues by exploring OpenCL kernels running on an FPGA, and identify the most time-consuming FPGA kernels. Collection type: hardware event-based sampling collection. Knobs: sampling-interval, enable-stack-collection.
- Monitor utilization of the IO subsystems, CPU, and processor buses. Collection type: hardware event-based sampling collection. Knobs: collect-pcie-bandwidth, mmio, iommu, collect-memory-bandwidth, dram-bandwidth-limits, dpdk, spdk, kernel-stack.
- Evaluate general behavior of Linux* or Android* target systems and correlate power and performance metrics with IRQ handling. Collection type: hardware event-based sampling collection. Knobs: collection-detail.
For Android* systems, VTune Profiler provides GPU analysis only on processors with Intel® HD Graphics and Intel® Iris® Graphics. You cannot view the collected results in the CLI report. To view the results, open the result file in the GUI.
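As a rough illustration of how knobs attach to an analysis type, the sketch below assembles a collect command line and prints it without running it. The helper name build_knobs is hypothetical, the knob value true for enable-stack-collection is an assumption, and the target path is borrowed from the Examples section.

```shell
# Dry-run sketch: turn name=value pairs into repeated "-knob" arguments.
# build_knobs is a hypothetical helper, not part of the vtune CLI.
build_knobs() {
    out=""
    for kv in "$@"; do
        out="$out -knob $kv"
    done
    # Strip the single leading space added by the loop.
    printf '%s' "${out# }"
}

# Assumed knob values; check the help output for your analysis type.
knobs=$(build_knobs sampling-mode=hw enable-stack-collection=true)
knob_cmd="vtune -collect hotspots $knobs -- /home/test/sample"

# Print the command instead of executing it.
echo "$knob_cmd"
```

Printing the command first makes it easy to review the knob list before starting a collection.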
Default
- OFF
- The vtune command runs no data collection unless the collect action is specified.
Modifiers
[no-]allow-multiple-runs, [no-]analyze-system, data-limit, discard-raw-data, duration, finalization-mode, [no-]follow-child, knob, mrte-mode, quiet, resume-after, return-app-exitcode, ring-buffer, search-dir, start-paused, strategy, [no-]summary, target-duration-type, target-pid, target-process, target-system, trace-mpi, no-unplugged-mode, user-data-dir, verbose
Description
Use the collect action to perform analysis and collect data. By default, this process performs the specified type of analysis, collects and finalizes data into a result file, and outputs a Summary report to stdout. In most cases you will want to use the search-dir action-option to specify the search directory. Some analysis types support the knob option, which allows you to specify additional settings.
There are many options that you can use to customize the behavior of the collect action to suit your purposes. For example, you can choose whether to analyze a child process only, whether to start collection after a certain amount of time has elapsed, or whether to perform collection without finalizing the result. There are a few examples included in this topic. For more information, use one of the help commands described below, or browse or search this documentation for information on the type of analysis you wish to perform.
To access the most current command line documentation for an action, enter vtune -help <action>, where <action> is one of the available actions. To see all available actions, enter vtune -help.
To view a list of analysis types supported for your processor:
vtune -help collect
To view detailed information on the supported analysis type:
vtune -help collect <analysis_type>
This command displays a description for the specified analysis type and its configuration options (knobs).
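The help lookups above go from general to specific, which the sketch below prints in order. The function name help_commands is hypothetical, and hotspots is only an illustrative analysis type; the commands are printed rather than executed so the snippet works on a machine without VTune Profiler installed.

```shell
# Print the three help lookups, from all actions down to one analysis type.
# help_commands is a hypothetical helper; $1 is the analysis type of interest.
help_commands() {
    echo "vtune -help"               # all available actions
    echo "vtune -help collect"       # analysis types for this processor
    echo "vtune -help collect $1"    # description and knobs of one type
}

help_commands hotspots
```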
Alternate Options
- collect-with
- The collect-with action performs the same basic functions as the collect action, but provides additional knob settings for custom configuration.
Examples
This command runs the hotspots analysis in the hardware event-based sampling mode for the Linux sample application, writes the result to the default directory, and outputs a summary report by default.
vtune -collect hotspots -knob sampling-mode=hw -- /home/test/sample
For best results, specify the search directories. This example collects a default-named hotspots result, searching for symbol files in the /home/import/system_modules high-priority search directory.
vtune -collect hotspots -search-dir /home/import/system_modules -- /home/test/sample
You can use the target-pid or target-process options to attach a Hotspots collection to a running process. In this example, target-pid is used to attach the collection to a running process whose ID is 1234.
vtune -collect hotspots -target-pid 1234
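The choice between target-pid and target-process can be sketched as a small wrapper that picks the option from the form of its argument. The function name attach_cmd and the process name sample are hypothetical; the sketch only prints the resulting command line.

```shell
# Pick -target-pid for a numeric argument and -target-process for a name.
# attach_cmd is a hypothetical helper, not part of the vtune CLI.
attach_cmd() {
    case "$1" in
        '')        return 1 ;;                        # nothing to attach to
        *[!0-9]*)  target="-target-process $1" ;;     # contains a non-digit: treat as a name
        *)         target="-target-pid $1" ;;         # digits only: treat as a process ID
    esac
    echo "vtune -collect hotspots $target"
}

attach_cmd 1234      # attach by process ID
attach_cmd sample    # attach by process name (illustrative)
```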
This example uses the no-auto-finalize action-option to start a Threading analysis, collect performance data, and exit without finalizing the result.
vtune -collect threading -no-auto-finalize -- /home/test/sample
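A result collected without finalization still has to be finalized before it can be viewed. The sketch below shows one possible two-step flow and only prints the commands. It assumes the finalize action and the result-dir option, neither of which is covered in this topic, and the directory name threading_result is hypothetical.

```shell
# Two-step flow: collect without finalizing, then finalize later.
# threading_result is a hypothetical result directory name.
result_dir="threading_result"

# Step 1: collect into a known directory and skip finalization.
collect_cmd="vtune -collect threading -no-auto-finalize -result-dir $result_dir -- /home/test/sample"

# Step 2: finalize the same result later (assumed finalize action).
finalize_cmd="vtune -finalize -result-dir $result_dir"

echo "$collect_cmd"
echo "$finalize_cmd"
```

Naming the result directory explicitly avoids having to guess the default result name in the second step.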