Intel® Trace Analyzer and Collector User and Reference Guide

ID 767272
Date 10/31/2024
Public

Tracing Failing MPI Applications

Normally, if an MPI application fails or is aborted, all the trace data collected is lost, because libVT needs a working MPI to write the trace file. However, the user might want to use the data collected up to that point. To solve this problem, Intel® Trace Collector provides the libVTfs library that enables tracing of failing MPI applications.

Usage Instructions

To trace failing MPI applications, do the following:

Linux* OS

Set the LD_PRELOAD environment variable to point to the libVTfs library and run the application. For example:

$ export LD_PRELOAD=libVTfs.so
$ mpirun -n 4 ./myApp

Alternatively, rebuild your application with the static version of the library. For example:

$ mpiicc -profile=vtfs myApp.c -o myApp

Windows* OS

Relink your application so that the libVTfs library comes before the Intel® MPI Library, then run it as usual. To do this, create an Intel® MPI Library configuration file that points to the libVTfs library, as follows (administrator privileges may be required):

> echo SET PROFILE_PRELIB=%VT_ROOT%\lib\VTfs.lib > %I_MPI_ROOT%\lib\VTfs.conf
> mpiicc -profile=VTfs myApp.c
> mpiexec -n 4 myApp.exe

How it Works

Under normal circumstances, tracing works as it does with libVT, except that communication during trace file writing goes through TCP sockets, so it may take longer than over MPI. To establish communication, libVTfs needs the IP addresses of all the hosts involved. It finds them by looking up the hostname locally on each machine or, if that yields only the 127.0.0.1 local host IP address, falls back to broadcasting hostnames. In the latter case, hostname lookup must work consistently across the cluster. If a failure occurs, libVTfs freezes all MPI processes and then writes a trace file with all trace data.
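The local lookup step described above can be sketched in C as follows. This is only an illustration using standard POSIX calls, not the libVTfs implementation: the helper names is_loopback_ipv4 and find_usable_ipv4 are hypothetical, and treating the whole 127.0.0.0/8 range (rather than only 127.0.0.1) as loopback is an assumption.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns 1 if the dotted-quad string lies in 127.0.0.0/8, 0 otherwise
 * (including when the string is not a valid IPv4 address). */
int is_loopback_ipv4(const char *text)
{
    struct in_addr addr;
    assert(text != NULL);
    if (inet_pton(AF_INET, text, &addr) != 1)
        return 0;
    return (ntohl(addr.s_addr) >> 24) == 127;
}

/* Hypothetical helper: resolve this host's own name locally and copy the
 * first non-loopback IPv4 address into out. Returns 1 on success, 0 if the
 * lookup failed or produced only loopback addresses, in which case the
 * caller needs a fallback. */
int find_usable_ipv4(char *out, size_t out_len)
{
    char host[256];
    struct addrinfo hints, *res, *p;
    int found = 0;

    assert(out != NULL && out_len >= INET_ADDRSTRLEN);
    if (gethostname(host, sizeof host) != 0)
        return 0;
    host[sizeof host - 1] = '\0';

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_INET;       /* IPv4 only, to keep the sketch short */
    hints.ai_socktype = SOCK_STREAM; /* TCP, as used for trace writing */
    if (getaddrinfo(host, NULL, &hints, &res) != 0)
        return 0;

    for (p = res; p != NULL && !found; p = p->ai_next) {
        struct sockaddr_in *sin = (struct sockaddr_in *)p->ai_addr;
        char text[INET_ADDRSTRLEN];
        if (inet_ntop(AF_INET, &sin->sin_addr, text, sizeof text) == NULL)
            continue;
        if (!is_loopback_ipv4(text)) {
            strncpy(out, text, out_len - 1);
            out[out_len - 1] = '\0';
            found = 1;
        }
    }
    freeaddrinfo(res);
    return found;
}
```

If find_usable_ipv4 returns 0 on a host, that host is in the situation the paragraph above describes, where a local lookup is not enough and another mechanism (such as broadcasting hostnames) is required.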

Possible Failures

Failure Description
Signals

Includes events inside the application, such as segmentation faults and floating-point errors, as well as abort signals sent from outside, such as SIGINT or SIGTERM.

Only SIGKILL will abort the application without writing a trace because it cannot be caught.
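The difference between catchable signals and SIGKILL can be checked with a few lines of POSIX C. The sketch below is not taken from Intel® Trace Collector; the names on_signal and can_catch are hypothetical. It relies only on the documented POSIX behavior that sigaction() refuses to install a handler for SIGKILL.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Set by the handler; a tracing tool could use such a flag to trigger an
 * emergency flush of its buffers. */
static volatile sig_atomic_t caught_signal = 0;

static void on_signal(int signo)
{
    caught_signal = signo; /* only async-signal-safe work belongs here */
}

/* Tries to install on_signal for signo and then restores the previous
 * disposition. Returns 1 if the signal can be caught, 0 if it cannot. */
int can_catch(int signo)
{
    struct sigaction sa, old;

    assert(signo > 0);
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    if (sigaction(signo, &sa, &old) != 0)
        return 0; /* sigaction fails with EINVAL for SIGKILL */
    sigaction(signo, &old, NULL);
    return 1;
}
```

Calling can_catch(SIGSEGV), can_catch(SIGINT), or can_catch(SIGTERM) returns 1, while can_catch(SIGKILL) returns 0, which is why no library can write a trace when a process is killed that way.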

Premature Exit

One or more processes exit without calling MPI_Finalize().

MPI Errors

MPI detects certain errors itself, like communication problems or invalid parameters for MPI functions.

Deadlocks

If Intel® Trace Collector observes no progress for a certain amount of time in any process, it assumes that a deadlock has occurred, stops the application and writes a trace file.

You can configure the timeout with DEADLOCK-TIMEOUT. "No progress" is defined as "inside the same MPI call". This is only a heuristic and may lead to both false positives and false negatives.
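To make the heuristic concrete, here is a minimal sketch in C of a "same MPI call for longer than the timeout" check. It is an illustration under stated assumptions, not the Intel® Trace Collector implementation: the proc_state record and all function names are hypothetical, times are plain seconds supplied by the caller, and "no progress in any process" is read as "no process has made progress".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-process record kept by a tracing layer. */
typedef struct {
    int in_mpi_call;   /* 1 while the process is inside an MPI call */
    double entered_at; /* time in seconds when the current call was entered */
} proc_state;

/* Record entry into a new MPI call at time now. */
void enter_call(proc_state *p, double now)
{
    p->in_mpi_call = 1;
    p->entered_at = now;
}

/* Record the return from the current MPI call. */
void leave_call(proc_state *p)
{
    p->in_mpi_call = 0;
}

/* A process shows "no progress" if it has been inside the same MPI call
 * for at least timeout seconds. */
int no_progress(const proc_state *p, double now, double timeout)
{
    return p->in_mpi_call && (now - p->entered_at) >= timeout;
}

/* Suspect a deadlock only if none of the n processes has made progress. */
int deadlock_suspected(const proc_state *procs, size_t n,
                       double now, double timeout)
{
    size_t i;

    assert(procs != NULL && n > 0 && timeout > 0.0);
    for (i = 0; i < n; i++)
        if (!no_progress(&procs[i], now, timeout))
            return 0;
    return 1;
}
```

The sketch shows both weaknesses of such a heuristic. A legitimately long blocking call trips the check once the timeout passes (a false positive), and a process that keeps entering and leaving short calls resets entered_at each time, so the check never fires even if the application is stuck (a false negative).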

Undetected Deadlock

If the application uses MPI_Test() or a similar non-blocking function to poll for a message that cannot arrive, Intel® Trace Collector still assumes that progress is being made and does not stop the application.

To avoid this, use blocking MPI calls in the application, which is also better for performance.