Loop Markup to Minimize Analysis Overhead
Issue
Running your target application with the Intel® Advisor can take substantially longer than running your target application without the Intel® Advisor.
Depending on the accuracy level and the analyses you choose for a perspective, a different amount of overhead is added to your application execution time.
For example:
| Runtime Overhead / Analysis | Survey | Characterization | Dependencies | MAP |
|---|---|---|---|---|
| Target application runtime with Intel® Advisor compared to runtime without Intel® Advisor | 1.1x longer | 2 - 55x longer | 5 - 100x longer | 5 - 20x longer |
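To put these factors in perspective, the following sketch multiplies a baseline runtime by the low and high ends of each range. The 600-second baseline and the helper name estimate_runtime are assumptions made for illustration only; actual overhead depends on your application and settings.

```shell
#!/bin/sh
# estimate_runtime BASELINE_SECONDS FACTOR_TENTHS
# Multiplies a baseline runtime by an overhead factor given in tenths
# (11 means 1.1x), using integer arithmetic only.
estimate_runtime() {
    echo $(( $1 * $2 / 10 ))
}

baseline=600  # hypothetical: a 10-minute run without Intel Advisor

echo "Survey:           $(estimate_runtime "$baseline" 11) s"
echo "Characterization: $(estimate_runtime "$baseline" 20) - $(estimate_runtime "$baseline" 550) s"
echo "Dependencies:     $(estimate_runtime "$baseline" 50) - $(estimate_runtime "$baseline" 1000) s"
echo "MAP:              $(estimate_runtime "$baseline" 50) - $(estimate_runtime "$baseline" 200) s"
```

With such a baseline, the upper end of the Dependencies range is on the order of many hours, which is why limiting the deeper analyses to a few loops matters.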
Solutions
Use the following techniques to skip uninteresting loops and analyze only interesting loops.
Select Loops by ID
Goal: Minimize collection overhead.
Applicable analyses: Characterization with Trip Counts and FLOP collection enabled, Dependencies, Memory Access Patterns.
Use when...
- You want to perform a deeper analysis on only a few loops.
- CLI environment: You cannot identify source file/line numbers, such as when you are analyzing a target application for which you do not have access to source code.
Note: In the commands below, make sure to replace myApplication with your application executable path and name before executing a command. If your application requires additional command line options, add them after the executable name.
Prerequisites:
- Run a Survey analysis.
- advisor CLI environment: Identify the loop IDs for the loops of interest:
    advisor --report=survey --project-dir=./advi_results -- ./myApplication
  In the Intel® Advisor report, the first column is the loop IDs.
  - Set your console width appropriately to avoid line wrapping.
  - Pipe your report using the appropriate truncation command if you care only about the first few report columns.
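As a minimal sketch of the truncation tip, the standard cut utility can keep only the leading characters of each report line. The helper name first_columns, the width of 30, and the sample line are assumptions for illustration; the real report layout may differ.

```shell
#!/bin/sh
# first_columns WIDTH
# Keeps only the first WIDTH characters of every input line.
first_columns() {
    cut -c "1-$1"
}

# Hypothetical usage (requires Intel Advisor and an existing Survey result):
#   advisor --report=survey --project-dir=./advi_results -- ./myApplication | first_columns 100

# Self-contained demonstration on a made-up wide line:
printf '%s\n' '12  loop in main at foo.cpp:34  ...many more columns...' | first_columns 30
```

Truncating by character count is the simplest option; pick a width that still shows the loop ID column and the source location.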
After performing the prerequisites, do one of the following:
- For Vectorization and CPU Roofline: Mark the loop(s) of interest by enabling the associated checkbox on the Survey Report. Then run a Characterization with Trip Counts and FLOP collection enabled, Dependencies, or Memory Access Patterns analysis.
- For Offload Modeling: Enter the CLI action option --select=<string> in the Other parameters field. For example, --select=5,10,12.
- Mark the loop(s) of interest using the CLI action option --select=<string> (recommended) or --mark-up-list=<string> when running a Characterization with Trip Counts and FLOP collection enabled, Dependencies, or Memory Access Patterns analysis. For example, with the --select option:
    advisor --collect=tripcounts --flop --project-dir=./advi_results --select=5,10,12 -- ./myApplication
  Then run a Characterization with Trip Counts and FLOP collection enabled, Dependencies, or Memory Access Patterns analysis.
There are different ways to select loops in the advisor CLI environment:
- The advisor CLI action options --mark-up-list=<string> and --select=<string> merely simulate enabling a GUI checkbox when used with the --collect action. They are active only for the duration of the --collect command.
- The same options used with the advisor CLI action --mark-up-loops actually enable a GUI checkbox. They are active beyond the duration of the --mark-up-loops command and apply to all downstream analyses, such as Characterization with Trip Counts and FLOP collection enabled, Dependencies, and Memory Access Patterns.
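When loop IDs come from a script rather than from typing, the comma-separated --select value can be assembled from a list. The sketch below assumes the IDs 5, 10, and 12 from the example above; build_select is a hypothetical helper name, and the script only prints the command line rather than running advisor.

```shell
#!/bin/sh
# build_select ID...
# Joins its arguments with commas and prints a --select option.
# The body runs in a subshell, so changing IFS does not leak to the caller.
build_select() (
    IFS=,
    printf '%s\n' "--select=$*"
)

# Print (not run) the resulting command line; myApplication is a placeholder.
echo "advisor --collect=tripcounts --flop --project-dir=./advi_results $(build_select 5 10 12) -- ./myApplication"
```

Because the option is used with --collect here, the selection applies to that one collection only.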
Select Loops by Source File/Line Number
Goal: Minimize collection overhead.
Applicable analyses: Characterization with Trip Counts and FLOP collection enabled, Dependencies, Memory Access Patterns.
Use when...
- You want to perform a deeper analysis on only a few loops.
- CLI environment: You are analyzing a target application for which you have access to source code and can identify source file/line numbers.
Note: In the commands below, make sure to replace myApplication with your application executable path and name before executing a command. If your application requires additional command line options, add them after the executable name.
Prerequisites:
- Run a Survey analysis.
- advisor CLI environment: If necessary, identify the source file and line number for the loops of interest:
    advisor --report=survey --project-dir=./advi_results -- ./myApplication
After performing the prerequisites, do one of the following:
- For Vectorization and CPU Roofline: Mark the loop(s) of interest by enabling the associated checkbox on the Survey Report. Then run a Characterization with Trip Counts and FLOP collection enabled, Dependencies, or Memory Access Patterns analysis.
- For Offload Modeling: Enter the CLI action option --select=<string> in the Other parameters field. For example, --select=foo.cpp:34,bar.cpp:192.
- Mark the loop(s) of interest using the CLI action option --select=<string> (recommended) or --mark-up-list=<string> for a Characterization with Trip Counts and FLOP collection enabled, Dependencies, or Memory Access Patterns analysis. For example, with the --select option:
    advisor --collect=tripcounts --flop --project-dir=./advi_results --select=foo.cpp:34,bar.cpp:192 -- ./myApplication
- Mark the loop(s) of interest using the advisor CLI action --mark-up-loops and action option --select=<string>. For example:
    advisor --mark-up-loops --select=foo.cpp:34,bar.cpp:192 --project-dir=./advi_results -- ./myApplication
  Then run a Characterization with Trip Counts and FLOP collection enabled, Dependencies, or Memory Access Patterns analysis.
There is essentially no difference between selecting loops by ID and selecting loops by source file/line in the GUI environment. The difference is in the advisor CLI environment:
- The advisor CLI action option --mark-up-list=<string> merely simulates enabling a GUI checkbox; therefore it persists only for the duration of the --collect command.
- The advisor CLI action --mark-up-loops with the action option --select=<string> actually enables a GUI checkbox; therefore it persists beyond the duration of the --mark-up-loops command and applies to downstream analyses, such as Characterization with Trip Counts and FLOP collection enabled, Dependencies, and Memory Access Patterns.
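The difference between the two CLI styles is easier to see side by side. The following sketch only prints the command lines for each style, so it runs without Intel Advisor installed; the function names are hypothetical, and the source locations and myApplication are the placeholders used in this section.

```shell
#!/bin/sh
# Sketch only: prints the advisor command lines for both selection styles
# instead of running them.
PROJECT_DIR=./advi_results        # project directory used in this section
APP=./myApplication               # placeholder: replace with your executable
LOOPS=foo.cpp:34,bar.cpp:192      # placeholder source locations

# One-shot: the selection applies to this --collect command only.
one_shot_selection() {
    echo "advisor --collect=tripcounts --flop --project-dir=$PROJECT_DIR --select=$LOOPS -- $APP"
}

# Persistent: the selection made with --mark-up-loops outlives that command,
# so the later analyses carry no --select option of their own.
persistent_selection() {
    echo "advisor --mark-up-loops --select=$LOOPS --project-dir=$PROJECT_DIR -- $APP"
    echo "advisor --collect=tripcounts --flop --project-dir=$PROJECT_DIR -- $APP"
    echo "advisor --collect=dependencies --project-dir=$PROJECT_DIR -- $APP"
}

one_shot_selection
persistent_selection
```

The persistent style suits a pipeline that runs several deeper analyses on the same loops; the one-shot style keeps the project's markup untouched.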
Select Loops by Criteria
Goal: Minimize collection overhead.
Applicable analyses: Dependencies, Memory Access Patterns.
Use when you want to perform a deeper analysis on loops chosen by criteria instead of by human input, such as when you are running the Intel® Advisor with a collection preset or using automated scripts.
To implement this in the advisor CLI environment, either run commands similar to the following one by one from the command line, or put them in a script and run the script to execute them automatically. Use the --select (recommended) or --loops option to select loops by criteria.
Note: In the commands below, make sure to replace myApplication with your application executable path and name before executing a command. If your application requires additional command line options, add them after the executable name.
For example, to analyze loop-carried dependencies in loops/functions that have the Assumes dependency present issue, use one of the following:
- Example 1:
    advisor --collect=survey --project-dir=./advi_results -- ./myApplication
    advisor --collect=dependencies --project-dir=./advi_results -- ./myApplication
- Example 2:
    advisor --collect=survey --project-dir=./advi_results -- ./myApplication
    advisor --collect=dependencies --select="scalar,has-issue" --project-dir=./advi_results -- ./myApplication
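For the scripted route, a minimal sketch might look like the following. It wraps the Survey and Dependencies commands with the --select="scalar,has-issue" criteria in a small script. The run helper, the DRY_RUN switch, and the function name are assumptions added for illustration; with DRY_RUN=1 (the default here) the script only prints the commands, so it can be tried without Intel Advisor installed.

```shell
#!/bin/sh
# DRY_RUN=1 (default in this sketch) prints each command instead of executing it.
# Set DRY_RUN=0 to actually run advisor.
DRY_RUN=${DRY_RUN:-1}
PROJECT_DIR=./advi_results   # project directory used in this section
APP=./myApplication          # placeholder: replace with your executable

# run CMD ARGS...
# Prints the command prefixed with "+ " in dry-run mode, otherwise executes it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Survey first, then Dependencies restricted to loops matching the criteria.
# The second command runs only if the first one succeeds.
analyze_by_criteria() {
    run advisor --collect=survey --project-dir="$PROJECT_DIR" -- "$APP" &&
    run advisor --collect=dependencies --select="scalar,has-issue" \
        --project-dir="$PROJECT_DIR" -- "$APP"
}

analyze_by_criteria
```

Chaining the two steps with && stops the script before the expensive Dependencies collection if the Survey step fails.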
Select Loops by Markup Algorithm
Goal: Minimize collection overhead.
Applicable analyses: Characterization with Trip Counts and FLOP collection enabled, Dependencies, Memory Access Patterns.
This is only applicable to the Offload Modeling perspective.
Use --select=r:markup=<algorithm> when you want to perform a deeper analysis on loops chosen by a pre-defined markup algorithm based on a programming model used and/or estimated offload profitability.
If you analyze an application that runs on a CPU, use the gpu_generic algorithm. This algorithm selects all potentially profitable loops/functions for additional analyses to collect more data and make sure they can be safely offloaded.
If you analyze code regions that are already offloaded and use a specific programming model, use one of the following algorithms:
- omp - Select OpenMP* loops.
- icpx -fsycl - Select SYCL loops.
- ocl - Select OpenCL™ loops.
- daal - Select Intel® oneAPI Data Analytics Library loops.
- tbb - Select Intel® oneAPI Threading Building Blocks loops.
Note: In the commands below, make sure to replace myApplication with your application executable path and name before executing a command. If your application requires additional command line options, add them after the executable name.
For example, to run the Offload Modeling perspective and analyze potentially profitable code regions in detail:
- Example 1. Use the --select=r:markup=<algorithm> option with the --collect action option to select loops only for the specific analysis.
    advisor --collect=survey --project-dir=./advi_results --static-instruction-mix -- ./myApplication
    advisor --collect=tripcounts --project-dir=./advi_results --flop --cache-simulation=single --target-device=xehpg_512xve --stacks --data-transfer=light -- ./myApplication
    advisor --collect=dependencies --filter-reductions --loop-call-count-limit=16 --select markup=gpu_generic --project-dir=./advi_results -- ./myApplication
    advisor --collect=projection --project-dir=./advi_results
- Example 2. Use the --select=r:markup=<algorithm> option with the --mark-up-loops action option in a separate step to select loops for all analyses executed after this command.
    advisor --collect=survey --project-dir=./advi_results --static-instruction-mix -- ./myApplication
    advisor --collect=tripcounts --project-dir=./advi_results --flop --cache-simulation=single --target-device=xehpg_512xve --stacks --data-transfer=light -- ./myApplication
    advisor --mark-up-loops --project-dir=./advi_results --select markup=gpu_generic -- ./myApplication
    advisor --collect=dependencies --filter-reductions --loop-call-count-limit=16 --project-dir=./advi_results -- ./myApplication
    advisor --collect=projection --project-dir=./advi_results
Currently, there is no GUI equivalent of the markup strategies. The gpu_generic strategy is used by default.