Run Vectorization and Code Insights Perspective from Command Line

The Vectorization and Code Insights perspective includes several analyses that you can run depending on the desired result. The main analysis is the Survey, which collects performance data for loops and functions in your application and identifies under-vectorized and non-vectorized loops/functions. The Survey analysis alone is enough to get basic insights into your application's performance.

Prerequisites

Set Intel Advisor environment variables with an automated script to enable the command line interface (CLI).

Run Vectorization and Code Insights Perspective

Info: In the commands below, replace myApplication with your application executable path and name before executing a command. If your application requires additional command line options, add them after the executable name.
  1. Run the Survey analysis.
    advisor --collect=survey --project-dir=./advi_results -- ./myApplication
  2. Run the Characterization analysis to collect trip counts and FLOP data:
    advisor --collect=tripcounts --flop --stacks --project-dir=./advi_results -- ./myApplication
  3. Optional: Run the Memory Access Patterns analysis for loops/functions with the Possible Inefficient Memory Access Pattern issue:
    advisor --collect=map --select=has-issue --project-dir=./advi_results -- ./myApplication
  4. Optional: Run the Dependencies analysis to check for loop-carried dependencies in loops/functions with the Assumed dependency present issue:
    advisor --collect=dependencies --project-dir=./advi_results --select=has-issue -- ./myApplication
You can view the results in the Intel Advisor graphical user interface (GUI), print a summary to a command prompt/terminal, or save to a file. See View the Results below for details.
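The four steps above can be collected into one small script. The following is a minimal sketch under stated assumptions: the helper name advisor_cmds and the variables APP and PROJECT_DIR are made up for illustration and are not part of Intel Advisor. The script only prints the commands so that you can review them; it does not run any analysis.

```shell
#!/bin/sh
# Sketch: print the perspective's commands in order (nothing is executed).
# APP, PROJECT_DIR, and advisor_cmds are illustrative names, not advisor features.
APP=./myApplication
PROJECT_DIR=./advi_results

advisor_cmds() {
    # Pass "full" as $1 to include the optional, high-overhead analyses.
    echo "advisor --collect=survey --project-dir=$PROJECT_DIR -- $APP"
    echo "advisor --collect=tripcounts --flop --stacks --project-dir=$PROJECT_DIR -- $APP"
    if [ "${1:-}" = "full" ]; then
        echo "advisor --collect=map --select=has-issue --project-dir=$PROJECT_DIR -- $APP"
        echo "advisor --collect=dependencies --project-dir=$PROJECT_DIR --select=has-issue -- $APP"
    fi
}

advisor_cmds full
```

Calling advisor_cmds without an argument prints only the Survey and Characterization commands. Once the printed commands look right, they can be copied into a terminal or a job script as they are.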
Analysis Details
The Vectorization and Code Insights workflow includes the following analyses:
  1. Survey to collect initial performance data.
  2. Characterization with trip counts and FLOP data to collect additional performance details.
  3. Memory Access Patterns (optional) to identify memory traffic data and memory usage issues.
  4. Dependencies (optional) to identify loop-carried dependencies.
Each analysis has a set of additional options that modify its behavior and collect additional performance data. The more analyses you run and the more options you use, the more useful data about your application you get.
Consider the following options:
Characterization Options
To run the Characterization analysis, use the following command line action: --collect=tripcounts.
Recommended action options:
  • --flop
    Collect data about floating-point and integer operations, memory traffic, and mask utilization metrics for AVX-512 platforms.
  • --stacks
    Enable advanced collection of call stack data.
  • --enable-cache-simulation
    Model CPU cache behavior on your target application.
  • --cache-config=<config>
    Set the cache hierarchy to collect modeling data for CPU cache behavior. Use with --enable-cache-simulation. The value should follow the template [<num_of_caches>]:[<num_of_ways_caches_connected>]:[<cache_size>]:[<cacheline_size>] for each of three cache levels, separated with a /.
  • --cachesim-associativity=<num>
    Set the cache associativity for modeling CPU cache behavior: 1 | 2 | 4 | 8 (default) | 16. Use with --enable-cache-simulation.
  • --cachesim-mode=<mode>
    Set the focus for modeling CPU cache behavior: cache-misses | footprint | utilization. Use with --enable-cache-simulation.
See advisor Command Option Reference for more options.
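As an illustration of how the cache simulation options combine, the following sketch composes a Characterization command and rejects values outside the lists given above. The helper name tripcounts_cmd is hypothetical, and the project directory and application name are the placeholders used throughout this page. The command is printed, not executed.

```shell
#!/bin/sh
# Sketch: compose a Characterization command with cache simulation enabled.
# tripcounts_cmd is a hypothetical helper, not an advisor feature.
tripcounts_cmd() {
    mode="$1"    # cache-misses | footprint | utilization
    assoc="$2"   # 1 | 2 | 4 | 8 | 16
    case "$mode" in
        cache-misses|footprint|utilization) ;;
        *) echo "unsupported --cachesim-mode: $mode" >&2; return 1 ;;
    esac
    case "$assoc" in
        1|2|4|8|16) ;;
        *) echo "unsupported --cachesim-associativity: $assoc" >&2; return 1 ;;
    esac
    # The --cachesim-* options are only meaningful with --enable-cache-simulation.
    echo "advisor --collect=tripcounts --flop --stacks" \
         "--enable-cache-simulation --cachesim-mode=$mode" \
         "--cachesim-associativity=$assoc" \
         "--project-dir=./advi_results -- ./myApplication"
}

tripcounts_cmd footprint 8
```

Validating the values before the run is a design choice of this sketch: a Characterization run can take a while, so catching a mistyped mode up front is cheaper than discovering it afterwards.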
Memory Access Patterns Options
The Memory Access Patterns analysis is optional because it adds high overhead. To run the Memory Access Patterns analysis, use the following command line action: --collect=map.
Recommended action options:
  • --select=<string>
    Select loops for the analysis by loop IDs, source locations, or criteria such as scalar, has-issue, or markup=<markup-mode>. This option is required. See select for more selection options.
  • --enable-cache-simulation
    Model CPU cache behavior on your target application.
  • --cachesim-cacheline-size=<num>
    Set the cache line size (in bytes) for modeling CPU cache behavior: 4 | 8 | 16 | 32 | 64 (default) | 128 | 256 | 512 | 1024 | 2048 | 4096 | 8192 | 16384 | 32768 | 65536. Use with --enable-cache-simulation.
  • --cachesim-sets=<num>
    Set the cache set size (in bytes) for modeling CPU cache behavior: 256 | 512 | 1024 | 2048 | 4096 (default) | 8192. Use with --enable-cache-simulation.
See advisor Command Option Reference for more options.
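Every accepted value for --cachesim-cacheline-size and --cachesim-sets listed above is a power of two within a fixed range, so a wrapper script can check the values with simple arithmetic. The sketch below assumes two hypothetical helpers, in_pow2_range and map_cmd; it prints the resulting command and does not run the analysis.

```shell
#!/bin/sh
# Sketch: check cache-simulation values before composing a Memory Access
# Patterns command. in_pow2_range and map_cmd are hypothetical helper names.

# Succeeds if $1 is a power of two between $2 and $3 inclusive.
in_pow2_range() {
    [ "$1" -ge "$2" ] && [ "$1" -le "$3" ] && [ $(( $1 & ($1 - 1) )) -eq 0 ]
}

map_cmd() {
    line_size="${1:-64}"   # documented default for --cachesim-cacheline-size
    sets="${2:-4096}"      # documented default for --cachesim-sets
    in_pow2_range "$line_size" 4 65536 || { echo "bad cache line size: $line_size" >&2; return 1; }
    in_pow2_range "$sets" 256 8192 || { echo "bad cache sets value: $sets" >&2; return 1; }
    # --select is required for this analysis; has-issue is one documented criterion.
    echo "advisor --collect=map --select=has-issue --enable-cache-simulation" \
         "--cachesim-cacheline-size=$line_size --cachesim-sets=$sets" \
         "--project-dir=./advi_results -- ./myApplication"
}

map_cmd 128 1024
```

Calling map_cmd with no arguments falls back to the documented defaults (64 and 4096).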
Dependencies Options
The Dependencies analysis is optional because it adds high overhead and is mostly necessary if you have scalar loops/functions in your application. To run the Dependencies analysis, use the following command line action: --collect=dependencies.
Recommended action options:
  • --select=<string>
    Select loops for the analysis by loop IDs, source locations, or criteria such as scalar, has-issue, or markup=<markup-mode>. This option is required. See select for more selection options.
  • --filter-reductions
    Mark all potential reductions with a specific diagnostic.
See advisor Command Option Reference for more options.

View the Results

Intel Advisor provides several ways to view the Vectorization and Code Insights results.
View Result in CLI
You can print the results collected in the CLI and save them to a .txt, .csv, or .xml file.
For example, to generate the Survey report:
advisor --report=survey --project-dir=./advi_results
You should see a similar result:
ID Function Call Sites Total Self Type Why No Vectorization Vector ISA Compiler Average Min Max Call Count Transformations Source Location Module and Loops Time Time Estimated Gain Trip Count Trip Count Trip Count __________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________ 14 [loop in main at mmult_serial.cpp:79] 0.495s 0.495s Vectorized Versions 1 vectorization possible but seems inefficient... SSE2 <2.42x 127; 127; 1; 7 127; 127; 1; 7 128; 128; 1; 7 524252; 524324; 530432; 530432 Interchanged; Unrolled mmult_serial.cpp:79 1_mmult_serial.exe 6 -[loop in main at mmult_serial.cpp:79] 0.275s 0.275s Vectorized (Body) SSE2 2.42x 127 127 128 524252 Unrolled; Interchanged mmult_serial.cpp:79 1_mmult_serial.exe 3 -[loop in main at mmult_serial.cpp:79] 0.205s 0.205s Vectorized (Body) SSE2 2.42x 127 127 128 524324 Unrolled; Interchanged mmult_serial.cpp:79 1_mmult_serial.exe 7 -[loop in main at mmult_serial.cpp:79] 0.015s 0.015s Peeled 1 1 1 530432 Interchanged mmult_serial.cpp:79 1_mmult_serial.exe 11 -[loop in main at mmult_serial.cpp:79] 0s 0s Remainder vectorization possible but seems inefficient... 7 7 7 530432 Interchanged mmult_serial.cpp:79 1_mmult_serial.exe 4 [loop in main at mmult_serial.cpp:79] 0.510s 0.015s Scalar inner loop was already vectorized 1024 1024 1024 1024 Interchanged mmult_serial.cpp:79 1_mmult_serial.exe 12 [loop in main at mmult_serial.cpp:79] 0.510s 0s Scalar Versions 1 inner loop was already vectorized 1024 1024 1024 1 mmult_serial.cpp:79 1_mmult_serial.exe 5 -[loop in main at mmult_serial.cpp:79] 0.510s 0s Scalar inner loop was already vectorized 1024 1024 1024 1 mmult_serial.cpp:79 1_mmult_serial.exe
The result is also saved into a text file advisor-survey.txt located at ./advi_results/eNNN/hsNNN.
You can generate a report for any analysis you run. The generic report command looks as follows:
advisor --report=<analysis-type> --project-dir=<project-dir> --format=<format>
where:
  • <analysis-type>
    is the analysis you want to generate the results for. For example, survey for the Survey report, top-down for the Survey report in a top-down view, map for the Memory Access Patterns report, or dependencies for the Dependencies report.
  • --format=<format>
    is a file format to save the results to. <format> is text (default), csv, or xml.
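The generic report command above can be filled in mechanically. As a minimal sketch, the hypothetical helper report_cmd below substitutes the three placeholders and rejects formats other than the three listed; it prints the commands and leaves running them to you.

```shell
#!/bin/sh
# Sketch: fill in the generic report command for several analysis types.
# report_cmd is a hypothetical helper, not an advisor feature.
report_cmd() {
    report_type="$1"      # e.g. survey, top-down, map, dependencies
    project_dir="$2"
    format="${3:-text}"   # text is the documented default
    case "$format" in
        text|csv|xml) ;;
        *) echo "unsupported format: $format" >&2; return 1 ;;
    esac
    echo "advisor --report=$report_type --project-dir=$project_dir --format=$format"
}

# Print one CSV report command per analysis type named in this section.
for t in survey top-down map dependencies; do
    report_cmd "$t" ./advi_results csv
done
```

A report is only meaningful for an analysis that has already been collected into the project directory, so in practice the loop would list just the analyses you ran.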
You can also generate a report with the data from all analyses run and save it to a CSV file with the --report=joined action as follows:
advisor --report=joined --report-output=<path-to-csv>
where --report-output=<path-to-csv> is a path and a name for a .csv file to save the report to. For example, /home/report.csv. This option is required to generate a joined report.
View Result in GUI
When you run the Intel Advisor CLI, a project is created automatically in the directory specified with --project-dir. All the collected results and analysis configurations are stored in the .advixeproj project, which you can view in Intel Advisor.
To open the project in the GUI, you can run the following command:
advisor-gui <project-dir>
If the report does not open, click Show Result on the Welcome pane.
You first see a Vectorization Summary report that includes the overall information about vectorized and not vectorized loops/functions in your code and the vectorization efficiency, including:
  • Performance metrics of your program and the speedup for the vectorized loops/functions
  • Top five time-consuming loops and top five optimization recommendations with the highest confidence
Figure: Vectorization Summary report
Save a Read-only Snapshot
A snapshot is a read-only copy of a project result, which you can view at any time using the Intel Advisor GUI. To save an active project result as a read-only snapshot:
advisor --snapshot --project-dir=<project-dir> [--cache-sources] [--cache-binaries] -- <snapshot-path>
where:
  • --cache-sources
    is an option to add application source code to the snapshot.
  • --cache-binaries
    is an option to add application binaries to the snapshot.
  • <snapshot-path>
    is a path and a name for the snapshot. For example, if you specify /tmp/new_snapshot, a snapshot is saved in the /tmp directory as new_snapshot.advixeexpz. You can skip this and save the snapshot to the current directory as snapshotXXX.advixeexpz.
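To show how the optional parts of the snapshot command fit together, the following sketch builds the command line from a project directory, a snapshot path, and a choice of what to pack. The helper name snapshot_cmd and its third argument (none, sources, binaries, or all) are assumptions made for this example. The command is printed, not executed.

```shell
#!/bin/sh
# Sketch: compose the snapshot command with or without the optional flags.
# snapshot_cmd and its "pack" argument are hypothetical, not advisor features.
snapshot_cmd() {
    project_dir="$1"
    snapshot_path="$2"
    pack="${3:-none}"   # none | sources | binaries | all
    opts=""
    case "$pack" in
        none) ;;
        sources)  opts=" --cache-sources" ;;
        binaries) opts=" --cache-binaries" ;;
        all)      opts=" --cache-sources --cache-binaries" ;;
        *) echo "unknown pack mode: $pack" >&2; return 1 ;;
    esac
    echo "advisor --snapshot --project-dir=$project_dir$opts -- $snapshot_path"
}

snapshot_cmd ./advi_results /tmp/new_snapshot all
```

Packing sources and binaries makes the snapshot larger but lets it be opened on a machine that does not have the application checked out or built.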
To open the result snapshot in the Intel Advisor GUI, you can run the following command:
advisor-gui <snapshot-path>
You can visually compare the saved snapshot against the current active result or other snapshot results.

Next Steps

Continue to Find Loops that Benefit from Better Vectorization to understand the results. For details about the metrics reported, see CPU and Memory Metrics.

Product and Performance Information

Performance varies by use, configuration and other factors. Learn more at www.Intel.com/PerformanceIndex.