User Guide

  • 2022.2
  • 08/08/2022
  • Public Content

Tracing MPI Load Imbalance

Tracing all MPI events normally produces a large trace file, even for relatively small applications. To reduce the trace file size while still getting an impression of the application bottlenecks, you can trace only the MPI functions that cause application load imbalance. That is, an MPI function is traced only if it was idle at some point during the application run, which causes the imbalance. This functionality is implemented in a dedicated tracing library.
You can enable source code location tracing to identify the regions of the source code that caused the imbalance (see Recording Source Location Information).
To generate an imbalance trace file, link your application with the imbalance tracing library, using the -trace-imbalance option of mpirun, or one of the methods described here. For example:
$ mpirun -n 2 -trace-imbalance ./myApp
Open the generated trace file to view the results. Intel® Trace Analyzer displays only the regions of MPI idle time. As a consequence, the time values shown for MPI functions are equal to their idle time.
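
The statement that time values equal idle time can be illustrated with a small conceptual model. The sketch below is not part of Intel Trace Collector and does not reflect how it measures time. It assumes a hypothetical run in which every rank computes for a given number of seconds and then enters a barrier, so each rank's idle time is its wait for the slowest rank. The function names and the sample compute times are illustrative assumptions.

```python
# Conceptual model only: NOT how Intel Trace Collector measures time.
# Assumption: every rank computes, then all ranks meet at one barrier.

def barrier_idle_times(compute_times):
    """Return each rank's idle time while waiting at the barrier."""
    slowest = max(compute_times)
    return [slowest - t for t in compute_times]


def imbalance_events(compute_times):
    """Keep only (rank, idle_time) pairs with nonzero idle time,
    mimicking a trace that records only calls that had to wait."""
    idle = barrier_idle_times(compute_times)
    return [(rank, t) for rank, t in enumerate(idle) if t > 0]


if __name__ == "__main__":
    # Hypothetical compute times in seconds for four ranks.
    times = [2, 5, 3, 5]
    print(barrier_idle_times(times))  # [3, 0, 2, 0]
    print(imbalance_events(times))    # [(0, 3), (2, 2)]
```

In this model, ranks 1 and 3 never wait, so their barrier calls would contribute nothing to an imbalance-only trace, which is why such a trace stays small.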

Known Limitations

  • This feature is currently available on Linux* OS only.
  • Point-to-point communication patterns displayed by Intel Trace Collector may be unreliable, because the imbalance tracing library skips tracing of certain functions.
  • The library traces only those MPI functions that can potentially generate load imbalance. Therefore, non-blocking operations are not traced.
  • The library does not trace user defined events (see Tracing User Defined Events), OpenMP* regions (see Recording OpenMP* Regions Information), or system calls (see Tracing System Calls).
  • Intel Trace Collector cannot run idealization for trace files generated by the imbalance tracing library.
