Intel® Trace Analyzer and Collector User and Reference Guide

ID 767272
Date 10/31/2024
Public

Running with Valgrind* (Linux* OS)

For distributed memory checking (LOCAL:MEMORY:INITIALIZATION) and for detecting illegal accesses to memory owned by MPI (LOCAL:MEMORY:ILLEGAL_ACCESS), all MPI processes must run under the control of the Valgrind* memory checker (version 3.2.0 or higher). See http://www.valgrind.org/ for more information.

To run Valgrind, invoke it directly on the main MPI process and add the mpirun -l option. With -l, every line printed by Valgrind is automatically prefixed with the MPI process rank. Intel® Trace Collector detects that -l is in effect and leaves it to mpirun to add the rank prefix to Intel Trace Collector output as well.
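
Because each line carries the rank, the combined output can be split per process afterwards. The following is a minimal sketch, assuming the prefix has the form "[<rank>] " at the start of each line; check the actual format on your installation. The function name lines_for_rank is hypothetical and is not part of Intel Trace Collector, Intel MPI Library, or Valgrind.

```shell
#!/bin/sh
# Hypothetical helper: print only the lines that belong to one MPI rank.
# Assumption: mpirun -l prefixes each line with "[<rank>] ". Adjust the
# prefix below if your mpirun formats it differently.
lines_for_rank() {
    # index() does a literal (non-regex) match, so the brackets need no
    # escaping; a result of 1 means the line starts with the prefix.
    awk -v prefix="[$1] " 'index($0, prefix) == 1'
}
```

Valgrind writes its reports to stderr by default, so redirect stderr into the pipe (2>&1) before filtering, for example by piping the mpirun output through lines_for_rank 0.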

The LOCAL:MEMORY:ILLEGAL_ACCESS check causes Valgrind to report not only illegal accesses by the application (as desired) but also the Intel MPI Library's own accesses to the locked memory (not desired, because MPI currently owns that memory and must read or write it). These reports are expected. The Valgrind suppression file in the Intel Trace Collector lib folder tells Valgrind not to print them, but Valgrind must be pointed to that file through its --suppressions option.
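
Since the suppression file has to be passed explicitly, it can help to confirm that it is readable before starting a long run. The following is a minimal sketch, assuming the file is named impi.supp and lives in the directory that $VT_LIB_DIR points to, as in the command shown in this section. The function name impi_supp_path is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the path of the suppression file, or
# return non-zero with a message on stderr if it cannot be used.
# Assumption: the file is $VT_LIB_DIR/impi.supp.
impi_supp_path() {
    if [ -z "${VT_LIB_DIR:-}" ]; then
        echo "VT_LIB_DIR is not set" >&2
        return 1
    fi
    supp="$VT_LIB_DIR/impi.supp"
    if [ ! -r "$supp" ]; then
        echo "suppression file not readable: $supp" >&2
        return 1
    fi
    printf '%s\n' "$supp"
}
```

The printed path can then be used as the value of the --suppressions option.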

When the MPI executable is given on the command line, an MPI application can be started under Valgrind like this:

$ mpirun -check_mpi -l -n <num procs> valgrind --suppressions=$VT_LIB_DIR/impi.supp <application> ...

When a wrapper script is used, it might be possible to trace through the script by adding the --trace-children=yes option. However, that can lead to reports about the script interpreter and other programs, so adding Valgrind to the actual invocation of the MPI binary is easier.
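
As an illustration of the second approach, a wrapper script can do its setup in the shell and put Valgrind only in front of the MPI binary. The following is a minimal sketch, not a script shipped with Intel Trace Collector. The function name run_mpi_binary and the VALGRIND override variable are hypothetical; the suppression file path follows the command shown earlier in this section.

```shell
#!/bin/sh
# Hypothetical wrapper fragment: any setup the script needs runs outside
# Valgrind, and only the real MPI binary is started under Valgrind.
run_mpi_binary() {
    # VALGRIND is an assumed override hook for this sketch; setting it to
    # "echo" prints the command line instead of running it.
    "${VALGRIND:-valgrind}" --suppressions="$VT_LIB_DIR/impi.supp" "$@"
}
```

A real wrapper would end by calling run_mpi_binary with the MPI binary and its arguments, and mpirun would start the wrapper instead of the binary. Because the script interpreter itself does not run under Valgrind, --trace-children=yes is not needed in this arrangement.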