Configuring Error Checks
You can manually configure which errors are checked. Every error has a unique name, and the names are organized in a hierarchy similar to functions. For example, LOCAL:MEMORY:OVERLAP is a local check which ensures that memory is not used twice in concurrent MPI operations. Disabling an error suppresses its reports and reduces the checking overhead.
Use the configuration options listed below. For instructions on how to set them, see Configuring Intel® Trace Collector.
Use the CHECK configuration option to match against the names of supported errors and turn them on or off, as in the example below. See Correctness Checking Errors for the list of all errors.
# Turn all checking off:
# ** matches colons
# * does not
CHECK ** OFF
# Selectively turn on specific checks:
# - All local checks
CHECK LOCAL:** ON
# - Only one global check
CHECK GLOBAL:MSG:DATATYPE:MISMATCH ON
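The wildcard behavior described in the comments above can be illustrated with a small Python sketch. It is not Intel Trace Collector code. It assumes that `**` matches any characters including colons, that `*` matches only within one level of the hierarchy, and that a later CHECK line overrides an earlier one when both match (which the example above suggests). The helper names `pattern_to_regex` and `enabled_checks` are hypothetical.

```python
import re

def pattern_to_regex(pattern):
    """Translate a CHECK-style pattern into a regular expression.

    Assumption: '**' may span colons, '*' stays within one hierarchy level.
    """
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            parts.append(".*")       # crosses ':' separators
            i += 2
        elif pattern[i] == "*":
            parts.append("[^:]*")    # stops at the next ':'
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts))

def enabled_checks(rules, names, default=True):
    """Apply (pattern, on) rules in order; a later match overrides an earlier one (assumption)."""
    state = {name: default for name in names}
    for pattern, on in rules:
        rx = pattern_to_regex(pattern)
        for name in names:
            if rx.fullmatch(name):
                state[name] = on
    return state

# The three CHECK lines from the example, in order.
rules = [
    ("**", False),
    ("LOCAL:**", True),
    ("GLOBAL:MSG:DATATYPE:MISMATCH", True),
]
# Error names that appear on this page.
names = [
    "LOCAL:MEMORY:OVERLAP",
    "GLOBAL:MSG:DATATYPE:MISMATCH",
    "GLOBAL:DEADLOCK:NO_PROGRESS",
]
state = enabled_checks(rules, names)
for name in names:
    print(name, "ON" if state[name] else "OFF")
```

Under these assumptions, the two checks that are explicitly re-enabled end up on, while every other check stays off. A pattern such as `LOCAL:*` would not match `LOCAL:MEMORY:OVERLAP`, because a single `*` cannot cross the second colon.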
By default, Intel Trace Collector checks for all errors and tries to provide as much information about them as possible. In particular, it unwinds the stack and reports source code information for each level in the call hierarchy. This behavior is controlled by the PCTRACE configuration option, which is off by default for performance analysis but enabled for correctness checking with libVTmc.
This option controls the same mechanism to detect deadlocks as in libVTfs. For interactive use it is recommended to set it to a small value like 10s to detect deadlocks quickly without having to wait long for the timeout.
Displays a GLOBAL:DEADLOCK:NO_PROGRESS warning if the time spent by MPI processes in their last MPI call exceeds the threshold specified with this option. This warning indicates a load imbalance or a deadlock that cannot be detected, which may occur when at least one process polls for progress instead of blocking inside an MPI call.
Different levels of verbosity specified with this option have the following effects:
|All extra output is disabled; only the error summary at the end is printed.|
|Adds a summary of configuration options as the application starts (default).|
|Adds a one-line info message, printed by each process at startup, with the host name, process ID, and the normal rank prefix. This is useful when output is redirected into one file per process, because it identifies which process of the parallel application the output belongs to.|
|Adds internal progress messages and a dump of MPI call entry/exit with their parameters and results.|