Use Partially Parallel Programs with Intel® Advisor
Intel® Advisor tools are designed to analyze serial programs. If you have a partially parallel program, before you use the Intel® Advisor tools, set it up to run as a serial program.
Run Your Program as a Serial Program
To run the current version of your program as a serial program, you need to limit the number of threads to 1. To run your program with a single thread:
- With Intel® oneAPI Threading Building Blocks (oneTBB), in the main thread create a tbb::task_scheduler_init init(1); object for the lifetime of the program and run the executable again. For example:

      int main() {
          tbb::task_scheduler_init init(1);
          // ...rest of program...
          return 0;
      }

  The effect of task_scheduler_init applies separately to each user-created thread. So if the program creates threads elsewhere, you need to create a tbb::task_scheduler_init init(1); object for that thread's lifetime as well. Note that task_scheduler_init comes from the earlier TBB interface; in oneTBB releases that no longer provide it, tbb::global_control with max_allowed_parallelism set to 1 limits the thread count instead. Use of certain oneTBB features can prevent the program from running serially. For more information, see the oneTBB documentation.
- With OpenMP*, do one of the following:
  - Set the OpenMP* environment variable OMP_NUM_THREADS to 1 before you run the program.
  - Omit the compiler option that enables recognition of OpenMP pragmas and directives. On Windows* OS, omit /Qopenmp, and on Linux* OS, omit -openmp (-qopenmp in newer Intel compiler versions).
For more information, see your compiler documentation.
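To confirm that an OpenMP* build really runs serially before you collect data, a small check along the following lines may help. This is only a sketch: the names max_parallel_threads, warn_if_not_serial, and sum_range are hypothetical and are not part of Intel® Advisor or OpenMP; only omp_get_max_threads() and the _OPENMP macro come from OpenMP. The code is written to compile with or without the OpenMP compiler option, so it applies to both approaches listed above.

```cpp
#include <cassert>
#include <cstdio>
#ifdef _OPENMP
#include <omp.h>  // present only when OpenMP recognition is enabled
#endif

// Upper bound on the number of threads an OpenMP parallel region may use.
// Returns 1 when the program was built without OpenMP support.
int max_parallel_threads() {
#ifdef _OPENMP
    return omp_get_max_threads();
#else
    return 1;
#endif
}

// Hypothetical workload: sums 0..n-1 in a loop that OpenMP may parallelize.
// Without the OpenMP compiler option the pragma is ignored and the loop is serial.
long sum_range(int n) {
    assert(n >= 0);
    long total = 0;
#pragma omp parallel for reduction(+ : total)
    for (int i = 0; i < n; ++i) {
        total += i;
    }
    return total;
}

// Returns true if parallel regions are limited to one thread;
// otherwise prints a warning to stderr and returns false.
bool warn_if_not_serial() {
    const int threads = max_parallel_threads();
    if (threads != 1) {
        std::fprintf(stderr,
                     "warning: up to %d OpenMP threads may run; "
                     "set OMP_NUM_THREADS=1 before collecting data\n",
                     threads);
        return false;
    }
    return true;
}
```

Calling warn_if_not_serial() at the start of main() and running with OMP_NUM_THREADS set to 1, or building without the OpenMP option, should produce no warning.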
Add or Remove Intel® Advisor Annotations
The following example from nqueens_cilk.cpp shows Intel® Advisor annotations around parallel code:
    ...
    ANNOTATE_SITE_BEGIN(solve);
    cilk_for(int i=0; i<size; i++) {
        // try all positions in first row using separate array for each recursion
        ANNOTATE_ITERATION_TASK(setQueen);
        int * queens = new int[size];
        setQueen(queens, 0, i);
    }
    ANNOTATE_SITE_END();
If needed, you can comment out annotations, or add preprocessor directives by using conditional compilation. For example, use the #ifdef, #ifndef, and #endif preprocessor directives:
    ...
    // Comment out the next line to hide the annotations.
    #define ANNOTATE_ON
    . . .
    #ifdef ANNOTATE_ON
    ANNOTATE_SITE_BEGIN(solve);
    #endif
    #ifndef ANNOTATE_ON
    // add parallel code here
    . . .
    #endif
    #ifdef ANNOTATE_ON
    ANNOTATE_SITE_END();
    #endif
    ...
After you add the parallel framework code and test it, you can remove the annotations.
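One way to avoid repeating #ifdef blocks around every annotation is to wrap the annotation macros once and switch them all with a single definition. The following is a sketch under stated assumptions, not the documented Intel® Advisor method: the wrapper names SITE_BEGIN, SITE_END, and ITERATION_TASK, the site and task labels, and the sum_of_squares function are hypothetical, and the sketch assumes that the annotation header advisor-annotate.h is on the include path whenever ANNOTATE_ON is defined.

```cpp
#include <cassert>

// Define ANNOTATE_ON (here or on the compiler command line) to enable the
// annotations. It is left undefined so this sketch builds without the
// Intel Advisor headers.
// #define ANNOTATE_ON

#ifdef ANNOTATE_ON
#include "advisor-annotate.h"  // assumption: header is on the include path
#define SITE_BEGIN(name)     ANNOTATE_SITE_BEGIN(name)
#define SITE_END()           ANNOTATE_SITE_END()
#define ITERATION_TASK(name) ANNOTATE_ITERATION_TASK(name)
#else
// Annotations disabled: each wrapper expands to a no-op statement.
#define SITE_BEGIN(name)     ((void)0)
#define SITE_END()           ((void)0)
#define ITERATION_TASK(name) ((void)0)
#endif

// Hypothetical annotated loop: each iteration is marked as one task
// inside a single site. Computes 0*0 + 1*1 + ... + (size-1)*(size-1).
long sum_of_squares(int size) {
    assert(size >= 0);
    long total = 0;
    SITE_BEGIN(sum_site);
    for (int i = 0; i < size; ++i) {
        ITERATION_TASK(square_task);
        total += static_cast<long>(i) * i;
    }
    SITE_END();
    return total;
}
```

With this arrangement the loop body stays identical in both configurations, and removing the annotations later only requires deleting the wrapper calls.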
Effect of Parallel Code on Intel® Advisor Tools' Reports
Intel® Advisor tools are designed to collect data and analyze serial program targets.
Parallel code that creates one or more threads within any annotated parallel site usually causes the Suitability or Dependencies tool reports to contain unreliable data. To use these two tools, there must be only a single thread within each parallel site. Also, when you use parallel frameworks that rely on dynamic scheduling or work stealing at run time, execution times can be assigned to the wrong source code.
If you use the Survey tool to profile your program, some of the time used by parallel code may be added to the wrong places, because Intel® Advisor's purpose is to analyze serial code. For example, Self Time in the Survey Report shows the sum of the CPU time for all threads. However, Self Time may be added to the parallel framework run-time system entry points instead of the caller(s) in the thread that entered the parallel region. Also in the Survey Report, when examining parallel code, some entry points may be parallel framework run-time system entry points instead of the expected functions or loops. Similarly, in the Survey Source window, for a parallel code region the Total Time (and Loop Time) shows the sum of the CPU time for all threads.
Because Intel® Advisor's purpose is to analyze serial code, in the Suitability Report:
- Intel® Advisor assumes there is only a single thread (no parallelism) within any annotated parallel site, including its task(s) and lock(s). When only a single thread executes within a parallel site (as expected), the results for that site may be correct. If the application has multiple parallel sites, and one or more sites were executed by multiple threads, the next two items apply.
- If multiple threads execute within any parallel site, the reported Maximum Program Gain and that site's Impact on Program Gain values are not reliable. To obtain correct values, ensure that only a single thread executes for all parallel sites (see Run Your Program as a Serial Program above).
- If multiple threads execute within a parallel site, the results for that site will be unpredictable and its values will not be reliable. Also, if one thread executes the parallel site annotations and a second thread executes the task annotation(s), the site may appear to not have any tasks and the tasks may appear to not execute within a site. To obtain correct values, ensure that only a single thread executes within each parallel site (see Run Your Program as a Serial Program above).
- Any work-stealing constructs within the site will cause extra time to be added to the suspended site and/or task. All Suitability Report times are approximate.
Similarly, in the Dependencies Report, if any parallel site uses multiple threads, the Dependencies tool may fail to detect and report certain problems. To obtain correct results, ensure that only a single thread executes within each parallel site (see Run Your Program as a Serial Program above).