Intel® Trace Analyzer and Collector User and Reference Guide

ID 767272
Date 3/31/2023
Public

Configuring Error Checks

You can choose which errors are checked. Every error has a unique name, and the names are organized in a hierarchy similar to the one used for functions. For example, LOCAL:MEMORY:OVERLAP is a local check that ensures the same memory is not used by two concurrent MPI operations. Disabling a check suppresses its reports and reduces the checking overhead.

Use the configuration options listed below. For instructions on how to set them, see Configuring Intel® Trace Collector.

CHECK

Use the CHECK configuration option to match a pattern against the names of supported errors and turn the matching checks on or off, as in the example below. See Correctness Checking Errors for the list of all errors.

# Turn all checking off:
# ** matches colons
# * does not
CHECK ** OFF
# Selectively turn on specific checks:
# - All local checks
CHECK LOCAL:** ON
# - Only one global check
CHECK GLOBAL:MSG:DATATYPE:MISMATCH ON
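
The opposite approach can be sketched with the error names mentioned on this page: leave all checks enabled and disable a single one. This fragment assumes that a later CHECK line overrides an earlier one for the names it matches, as the example above suggests.

# Keep all checks on, then disable one local check:
CHECK ** ON
CHECK LOCAL:MEMORY:OVERLAP OFF
# A single * does not match colons, so the pattern below would match
# LOCAL:MEMORY:OVERLAP but no name with further levels after it:
# CHECK LOCAL:MEMORY:* OFF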

PCTRACE

By default, Intel Trace Collector checks for all errors and tries to provide as much information about them as possible. In particular, it unwinds the stack and reports source code information for each level of the call hierarchy. The PCTRACE configuration option controls this behavior. It is off by default for performance analysis, but on by default for correctness checking with libVTmc.
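
If the overhead of stack unwinding is not wanted during a correctness-checking run, it could be switched off in the configuration file. This is a minimal sketch that assumes PCTRACE accepts the same ON and OFF values as CHECK.

# Skip stack unwinding and source code lookup to reduce overhead
PCTRACE OFF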

DEADLOCK-TIMEOUT

This option controls the deadlock detection mechanism, which is the same one used in libVTfs. For interactive use, set it to a small value such as 10s so that deadlocks are reported quickly instead of after a long timeout.
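
In a configuration file, the interactive setting suggested above might look like this. The option name and the 10s value come from the text. The one-option-per-line layout is assumed to follow the CHECK example.

# Use a short timeout for interactive runs
DEADLOCK-TIMEOUT 10s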

DEADLOCK-WARNING

Displays a GLOBAL:DEADLOCK:NO_PROGRESS warning if the time spent by MPI processes in their last MPI call exceeds the threshold specified with this option. This warning indicates a load imbalance or a deadlock that cannot be detected, which may occur when at least one process polls for progress instead of blocking inside an MPI call.
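
A possible setting is sketched below. The 30s threshold is an arbitrary illustrative value, and the time format is assumed to be the same as for DEADLOCK-TIMEOUT.

# Warn when processes have spent more than 30 seconds in their last MPI call
DEADLOCK-WARNING 30s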

VERBOSE

Different levels of verbosity specified with this option have the following effects:

Level  Effect
0      All extra output is disabled; only the error summary at the end is printed.
1      Adds a summary of configuration options as the application starts (default).
2      Adds a one-line info message from each process at startup with the host name, process ID, and the normal rank prefix. This can be useful if output is redirected into one file per process, because it identifies which process of the parallel application the output belongs to.
3      Adds internal progress messages and a dump of MPI call entry/exit with their parameters and results.
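
For example, when the output of each process is redirected into its own file, level 2 could be selected as follows. This is a sketch that assumes the level is given as a plain number after the option name.

# Have each process print a one-line identification message at startup
VERBOSE 2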