Intel® MPI Library Developer Reference for Linux* OS

ID 768732
Date 6/24/2024

Other Environment Variables

I_MPI_DEBUG

Print out debugging information when an MPI program starts running.

Syntax

I_MPI_DEBUG=<level>[,<flags>]

Arguments

Argument Description
<level> Indicate the level of debug information provided.
0 Output no debugging information. This is the default value.
1 Output libfabric* version and provider.
2 Output information about the tuning file used.
3 Output effective MPI rank, pid and node mapping table.
4 Output process pinning information.
5 Output environment variables specific to the Intel® MPI Library.
> 5 Add extra levels of debug information.
Argument Description
<flags> Comma-separated list of debug flags
pid Show process id for each debug message.
tid Show thread id for each debug message for multithreaded library.
time Show time for each debug message.
datetime Show time and date for each debug message.
host Show host name for each debug message.
level Show level for each debug message.
scope Show scope for each debug message.
line Show source line number for each debug message.
file Show source file name for each debug message.
nofunc Do not show routine name.
norank Do not show rank.
nousrwarn Suppress warnings for improper use case (for example, incompatible combination of controls).
flock Synchronize debug output from different processes or threads.
nobuf Do not use buffered I/O for debug output.

Description

Set this environment variable to print debugging information about the application.

NOTE:
Set the same <level> value for all ranks.

You can specify the output file name for debug information by setting the I_MPI_DEBUG_OUTPUT environment variable.
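
The <level>[,<flags>] syntax can be assembled and checked before a launch. The following is a minimal sketch, not part of the Intel MPI Library: make_debug_value is a hypothetical helper name, and it accepts only the flags listed in the Arguments table above.

```shell
# make_debug_value is a hypothetical helper, not part of the Intel MPI Library.
# Usage: make_debug_value <level> [flag...]
make_debug_value() {
    level=$1
    shift
    value=$level
    for flag in "$@"; do
        case $flag in
            pid|tid|time|datetime|host|level|scope|line|file|nofunc|norank|nousrwarn|flock|nobuf)
                value="$value,$flag" ;;
            *)
                echo "unknown debug flag: $flag" >&2
                return 1 ;;
        esac
    done
    printf '%s\n' "$value"
}

# Example: level 4 with the process ID and host name on every message.
I_MPI_DEBUG=$(make_debug_value 4 pid host)
export I_MPI_DEBUG
echo "$I_MPI_DEBUG"
```

Exporting the value in the launching shell is one way to give every rank the same <level>, as the note above requires.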

Each printed line has the following format:

[<identifier>] <message>

where:

  • <identifier> is the MPI process rank, by default. If you add the '+' sign in front of the <level> number, the <identifier> assumes the following format: rank#pid@hostname. Here, rank is the MPI process rank, pid is the UNIX* process ID, and hostname is the host name. If you add the '-' sign, <identifier> is not printed at all.
  • <message> contains the debugging output.

The following examples demonstrate possible command lines with the corresponding output:

$ mpirun -n 1 -env I_MPI_DEBUG=2 ./a.out
...
[0] MPI startup(): shared memory data transfer mode

The following commands are equivalent and produce the same output:

$ mpirun -n 1 -env I_MPI_DEBUG=+2 ./a.out
$ mpirun -n 1 -env I_MPI_DEBUG=2,pid,host ./a.out
...
[0#1986@mpicluster001] MPI startup(): shared memory data transfer mode

NOTE:
Compiling with the -g option adds a considerable amount of printed debug information.
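
When post-processing debug logs, the rank#pid@hostname identifier described above can be split with plain shell parameter expansion. This is a sketch only: parse_debug_id is a hypothetical helper, and it assumes every line carries the full rank#pid@hostname form (for example, output produced with a '+' in front of the level).

```shell
# parse_debug_id is a hypothetical helper for post-processing debug logs.
# It assumes the identifier has the rank#pid@hostname form.
parse_debug_id() {
    line=$1
    id=${line#"["}          # drop the leading "["
    id=${id%%"]"*}          # keep everything before the first "]"
    rank=${id%%"#"*}        # text before "#"
    rest=${id#*"#"}         # text after "#"
    pid=${rest%%"@"*}       # text before "@"
    host=${rest#*"@"}       # text after "@"
    printf 'rank=%s pid=%s host=%s\n' "$rank" "$pid" "$host"
}

parse_debug_id '[0#1986@mpicluster001] MPI startup(): shared memory data transfer mode'
# prints: rank=0 pid=1986 host=mpicluster001
```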

I_MPI_DEBUG_OUTPUT

Set output file name for debug information.

Syntax

I_MPI_DEBUG_OUTPUT=<arg>

Arguments

Argument Description
stdout Output to stdout. This is the default value.
stderr Output to stderr.
<file_name> Specify the output file name for debug information. The maximum file name length is 256 symbols.

Description

Set this environment variable to separate debug output from the output produced by the application. If the file name contains the format specifiers %r, %p, or %h, the rank, process ID, or host name is added to the file name accordingly.
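
To preview which file names a pattern would yield, the substitution can be imitated in the shell. This sketch assumes that %r, %p, and %h are replaced in place by the rank, process ID, and host name. The library performs the real substitution itself, and expand_debug_name is a hypothetical helper.

```shell
# expand_debug_name is a hypothetical helper that imitates the %r/%p/%h
# substitution so that file names can be previewed before a run.
# Usage: expand_debug_name <pattern> <rank> <pid> <host>
expand_debug_name() {
    printf '%s\n' "$1" | sed -e "s/%r/$2/g" -e "s/%p/$3/g" -e "s/%h/$4/g"
}

# Pattern that would give one debug file per host and rank.
expand_debug_name 'debug_%h_%r.log' 0 1986 mpicluster001
# prints: debug_mpicluster001_0.log
```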

I_MPI_DEBUG_COREDUMP

Control core dump file generation when a failure occurs during MPI application execution.

Syntax

I_MPI_DEBUG_COREDUMP=<arg>

Arguments

Argument Description
enable|yes|on|1 Enable core dump file generation.
disable|no|off|0 Do not generate core dump files. This is the default value.

Description

Set this environment variable to enable core dump file generation when the application terminates because of a segmentation fault. Available for both release and debug builds.

I_MPI_STATS

Collect MPI statistics from your application using Application Performance Snapshot.

Syntax

I_MPI_STATS=<level>

Arguments

Argument Description
<level> Indicate the level of statistics collected.
1,2,3,4,5 Specify the level to indicate the amount of MPI statistics to be collected by Application Performance Snapshot (APS). The full description of levels is available in the official APS documentation.

Description

Set this variable to collect MPI-related statistics from your MPI application using Application Performance Snapshot. The variable creates a new folder aps_result_<date>-<time> containing statistics data. To analyze the collected data, use the aps utility. For example:

$ export I_MPI_STATS=5
$ mpirun -n 2 ./myApp
$ aps-report aps_result_20171231_235959
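
After several runs, the most recent result folder can be located by name. This is a sketch under the assumption that folder names start with aps_result_ followed by a date stamp that sorts chronologically, as in the example above. latest_aps_result is a hypothetical helper.

```shell
# latest_aps_result is a hypothetical helper. It assumes result folders are
# named aps_result_<stamp> and that the stamp sorts chronologically.
# Usage: latest_aps_result [directory]   (defaults to the current directory)
latest_aps_result() {
    dir=${1:-.}
    for d in "$dir"/aps_result_*; do
        if [ -d "$d" ]; then
            printf '%s\n' "$d"
        fi
    done | sort | tail -n 1
}

# Prints the newest result folder, or nothing if none exists.
latest_aps_result
```

The printed path can then be passed to aps-report as in the example above.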

I_MPI_STARTUP_MODE

Select a mode for the Intel® MPI Library process startup algorithm.

Syntax

I_MPI_STARTUP_MODE=<arg>

Arguments

Argument Description
pmi_shm Use shared memory to reduce the number of PMI calls.
pmi_shm_netmod Use the netmod infrastructure for address exchange logic in addition to PMI and shared memory. This is the default value.

Description

The pmi_shm_netmod mode reduces the application startup time. The benefit of the modes is more pronounced at higher -ppn values, and there is no improvement at all with -ppn 1.

I_MPI_PMI_LIBRARY

Specify the name of a third-party implementation of the PMI library.

Syntax

I_MPI_PMI_LIBRARY=<name>

Arguments

Argument Description
<name> Full name of the third-party PMI library

Description

Set I_MPI_PMI_LIBRARY to specify the name of a third-party PMI library. When you set this environment variable, provide the full name of the library, including the full path to it.

Currently supported PMI versions: PMI1, PMI2, and PMIx.

Example

To launch an application using Intel MPI and PMIx, you can use Cray's PALS*.

For that, you need the following environment variables:

  • I_MPI_OFI_LIBRARY=<path-to-crays>/libfabric.so.1
  • I_MPI_OFI_PROVIDER=cxi
  • I_MPI_PMI_LIBRARY=<path-to>/libpmix.so
  • I_MPI_PMI=pmix

The following example shows how to launch an application using Intel MPI and PMIx with Cray's PALS*, CXI*, and PBS*:

I_MPI_OFI_LIBRARY=<path-to-crays>/libfabric.so.1 \
I_MPI_OFI_PROVIDER=cxi \
I_MPI_PMI_LIBRARY=<path-to>/libpmix.so \
I_MPI_PMI=pmix \
<path-to-crays-pals>/<version>/bin/mpirun --pmi=pmix -n 2 -ppn 1 --hostfile $PBS_NODEFILE ./myprog

I_MPI_PMI_VALUE_LENGTH_MAX

Control the length of the value buffer in PMI on the client side.

Syntax

I_MPI_PMI_VALUE_LENGTH_MAX=<length>

Arguments

Argument Description
<length> Define the value of the buffer length in bytes.
<n> > 0  The default value is -1, which means do not override the value received from the PMI_KVS_Get_value_length_max() function.

Description

Set this environment variable to control the length of the value buffer in PMI on the client side. The length of the buffer will be the lesser of I_MPI_PMI_VALUE_LENGTH_MAX and PMI_KVS_Get_value_length_max().
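
The "lesser of the two" rule can be illustrated with a few lines of shell arithmetic. This is only a sketch of the rule as stated above: effective_pmi_len is a hypothetical helper, and treating every non-positive value like the default of -1 (no override) is an assumption.

```shell
# effective_pmi_len is a hypothetical helper illustrating the rule above.
# Usage: effective_pmi_len <I_MPI_PMI_VALUE_LENGTH_MAX> <length reported by PMI>
# Assumption: any non-positive override is treated like the default of -1.
effective_pmi_len() {
    override=$1
    pmi_max=$2
    if [ "$override" -gt 0 ] && [ "$override" -lt "$pmi_max" ]; then
        echo "$override"
    else
        echo "$pmi_max"
    fi
}

effective_pmi_len -1 1024     # default: the PMI-reported length is used
effective_pmi_len 256 1024    # the smaller override wins
effective_pmi_len 4096 1024   # an override cannot exceed the PMI-reported length
```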

I_MPI_OUTPUT_CHUNK_SIZE

Set the size of the stdout/stderr output buffer.

Syntax

I_MPI_OUTPUT_CHUNK_SIZE=<size>

Arguments

Argument Description
<size> Define output chunk size in kilobytes.
<n> > 0 The default chunk size value is 1 KB.

Description

Set this environment variable to increase the size of the buffer used to intercept the standard output and standard error streams from the processes. If the <size> value is not greater than zero, the environment variable setting is ignored and a warning message is displayed.

Use this setting for applications that create a significant amount of output from different processes. With the -ordered-output option of mpiexec.hydra, this setting helps prevent garbled output.

NOTE:
Set the I_MPI_OUTPUT_CHUNK_SIZE environment variable in the shell environment before executing the mpiexec.hydra/mpirun command. Do not use the -genv or -env options for setting the <size> value. Those options are used only for passing environment variables to the MPI process environment.
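
A minimal sketch of setting the variable in the shell environment, as the note requires. It rejects values that would be ignored (not greater than zero). set_output_chunk_size is a hypothetical wrapper, and the launch line is shown as a comment for context only.

```shell
# set_output_chunk_size is a hypothetical wrapper. It rejects values that the
# library would ignore (not greater than zero) and exports the rest.
set_output_chunk_size() {
    case $1 in
        ''|*[!0-9]*)
            echo "chunk size must be a positive integer (KB)" >&2
            return 1 ;;
    esac
    if [ "$1" -le 0 ]; then
        echo "chunk size must be greater than zero" >&2
        return 1
    fi
    export I_MPI_OUTPUT_CHUNK_SIZE="$1"
}

# Set a 64 KB buffer in the shell before launching, not through -genv or -env.
set_output_chunk_size 64
# mpiexec.hydra -ordered-output -n 4 ./a.out
```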

I_MPI_REMOVED_VAR_WARNING

Print out a warning if a removed environment variable is set.

Syntax

I_MPI_REMOVED_VAR_WARNING=<arg>

Arguments

Argument Description
enable | yes | on | 1 Print out the warning. This is the default value
disable | no | off | 0 Do not print the warning

Description

Use this environment variable to print out a warning if a removed environment variable is set. Warnings are printed regardless of whether I_MPI_DEBUG is set.

I_MPI_VAR_CHECK_SPELLING

Print out a warning if an unknown environment variable is set.

Syntax

I_MPI_VAR_CHECK_SPELLING=<arg>

Arguments

Argument Description
enable | yes | on | 1 Print out the warning. This is the default value
disable | no | off | 0 Do not print the warning

Description

Use this environment variable to print out a warning if an unsupported environment variable is set. Warnings are printed for removed or misspelled environment variables.

I_MPI_LIBRARY_KIND

Specify the Intel® MPI Library configuration.

Syntax

I_MPI_LIBRARY_KIND=<value>

Arguments

Value Description
release Multi-threaded optimized library (with the global lock). This is the default value
debug Multi-threaded debug library (with the global lock)

Description

Use this variable to set an argument for the vars.[c]sh script. This script establishes the Intel® MPI Library environment and enables you to specify the appropriate library configuration. To ensure that the desired configuration is set, check the LD_LIBRARY_PATH variable.

Example

$ export I_MPI_LIBRARY_KIND=debug

Setting this variable is equivalent to passing an argument directly to the vars.[c]sh script:

Example

$ . <installdir>/bin/vars.sh release
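
To check LD_LIBRARY_PATH after sourcing the script, it helps to print one entry per line. This is a generic sketch: list_library_path is a hypothetical helper, and which directory names correspond to the release or debug configuration depends on the installation layout, which is not assumed here.

```shell
# list_library_path is a hypothetical helper. It prints each entry of a
# colon-separated search path on its own line.
# Usage: list_library_path [path]   (defaults to $LD_LIBRARY_PATH)
list_library_path() {
    printf '%s\n' "${1:-${LD_LIBRARY_PATH:-}}" | tr ':' '\n'
}

# Show the current library search path, one entry per line.
list_library_path
```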

I_MPI_PLATFORM

Select the intended optimization platform.

Syntax

I_MPI_PLATFORM=<platform>

Arguments

Argument Description
<platform> Intended optimization platform (string value)
auto Use only with heterogeneous runs to determine the appropriate platform across all nodes. May slow down MPI initialization time due to collective operation across all nodes.
ivb Optimize for the Intel® Xeon® Processors E3, E5, and E7 V2 series and other Intel® Architecture processors formerly code named Ivy Bridge.
hsw Optimize for the Intel Xeon Processors E3, E5, and E7 V3 series and other Intel® Architecture processors formerly code named Haswell.
bdw Optimize for the Intel Xeon Processors E3, E5, and E7 V4 series and other Intel Architecture processors formerly code named Broadwell.
knl Optimize for the Intel® Xeon Phi™ processor and coprocessor formerly code named Knights Landing.
skx Optimize for the Intel Xeon Processors E3 V5 and Intel Xeon Scalable Family series, and other Intel Architecture processors formerly code named Skylake.
clx Optimize for the 2nd Generation Intel Xeon Scalable Processors, and other Intel® Architecture processors formerly code named Cascade Lake.
clx-ap Optimize for the 2nd Generation Intel Xeon Scalable Processors, and other Intel Architecture processors formerly code named Cascade Lake AP. Note: The explicit clx-ap setting is ignored if the actual platform is not Intel.

Description

Set this environment variable to use the predefined platform settings. The default value is a local platform for each node.

The variable is available for both Intel and non-Intel microprocessors, but it may apply more optimizations for Intel microprocessors than for non-Intel microprocessors.

NOTE:
The values auto[:min], auto:max, and auto:most may increase the MPI job startup time.

I_MPI_MALLOC

Control the Intel® MPI Library custom allocator of private memory.

Syntax

I_MPI_MALLOC=<arg>

Argument

Argument Description
1 Enable the Intel MPI Library custom allocator of private memory. Use the Intel MPI custom allocator of private memory for MPI_Alloc_mem/MPI_Free_mem.
0 Disable the Intel MPI Library custom allocator of private memory. Use the system-provided memory allocator for MPI_Alloc_mem/MPI_Free_mem.

Description

Use this environment variable to enable or disable the Intel MPI Library custom allocator of private memory for MPI_Alloc_mem/MPI_Free_mem.

By default, I_MPI_MALLOC is enabled if I_MPI_ASYNC_PROGRESS and I_MPI_THREAD_SPLIT are disabled.

NOTE:
If the platform is not supported by the Intel MPI Library custom allocator of private memory, a system-provided memory allocator is used and the I_MPI_MALLOC variable is ignored.

I_MPI_SHM_HEAP

Control the Intel® MPI Library custom allocator of shared memory.

Syntax

I_MPI_SHM_HEAP=<arg>

Argument

Argument Description
1 Use the Intel MPI custom allocator of shared memory for MPI_Alloc_mem/MPI_Free_mem.
0 Do not use the Intel MPI custom allocator of shared memory for MPI_Alloc_mem/MPI_Free_mem.

Description

Use this environment variable to enable or disable the Intel MPI Library custom allocator of shared memory for MPI_Alloc_mem/MPI_Free_mem.

By default, I_MPI_SHM_HEAP is disabled. If enabled, it can improve performance of the shared memory transport because a single memory copy operation can replace the two copy-in/copy-out operations. If both I_MPI_SHM_HEAP and I_MPI_MALLOC are enabled, the shared memory allocator is used first. The private memory allocator is used only when the required amount of shared memory is not available.

Details

By default, the shared memory segment is allocated on tmpfs file system on the /dev/shm/ mount point. Starting from Linux kernel 4.7, it is possible to enable transparent huge pages on the shared memory. If Intel MPI Library shared memory heap is used, it is recommended to enable transparent huge pages on your system. To enable transparent huge pages on /dev/shm, please contact your system administrator or execute the following command:

sudo mount -o remount,huge=advise /dev/shm

To use another tmpfs mount point instead of /dev/shm/, use I_MPI_SHM_FILE_PREFIX_4K, I_MPI_SHM_FILE_PREFIX_2M, and I_MPI_SHM_FILE_PREFIX_1G.

NOTE:
If your application does not use MPI_Alloc_mem/MPI_Free_mem directly, you can override standard malloc/calloc/realloc/free procedures by preloading the libmpi_shm_heap_proxy.so library:
export LD_PRELOAD=$I_MPI_ROOT/lib/libmpi_shm_heap_proxy.so
export I_MPI_SHM_HEAP=1

In this case, malloc/calloc/realloc act as proxies for MPI_Alloc_mem, and free acts as a proxy for MPI_Free_mem.

NOTE:

If the platform is not supported by the Intel MPI Library custom allocator of shared memory, the I_MPI_SHM_HEAP variable is ignored.

I_MPI_SHM_HEAP_VSIZE

Change the size (per rank) of virtual shared memory available for the Intel MPI Library custom allocator of shared memory.

Syntax

I_MPI_SHM_HEAP_VSIZE=<size>

Argument

Argument Description
<size> The size (per rank) of shared memory used in shared memory heap (in megabytes).
>0 If shared memory heap is enabled for MPI_Alloc_mem/MPI_Free_mem, the default value is 4096.

Description

The Intel MPI Library custom allocator of shared memory works with a fixed-size virtual shared memory segment. The segment is allocated during MPI_Init and cannot be enlarged later.

Setting I_MPI_SHM_HEAP_VSIZE=0 completely disables the Intel MPI Library shared memory allocator.
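
Because the size is specified per rank, a rough per-node budget is the per-rank size multiplied by the number of ranks on the node. The arithmetic below is a sketch under that assumption and does not describe library internals. node_shm_heap_mb is a hypothetical helper.

```shell
# node_shm_heap_mb is a hypothetical helper: ranks per node multiplied by the
# per-rank virtual size in megabytes.
# Usage: node_shm_heap_mb <ranks_per_node> [vsize_mb]
node_shm_heap_mb() {
    ranks_per_node=$1
    vsize_mb=${2:-4096}   # 4096 MB is the documented default
    echo $((ranks_per_node * vsize_mb))
}

node_shm_heap_mb 32        # 32 ranks with the default size: 131072 MB
node_shm_heap_mb 32 1024   # 32 ranks with I_MPI_SHM_HEAP_VSIZE=1024: 32768 MB
```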

I_MPI_SHM_HEAP_CSIZE

Change the size (per rank) of shared memory cached in the Intel MPI Library custom allocator of shared memory.

Syntax

I_MPI_SHM_HEAP_CSIZE=<size>

Argument

Argument Description
<size> The size (per rank) of shared memory used in Intel MPI Library shared memory allocator (in megabytes).
>0 It depends on the available shared memory size and number of ranks. Normally, the size is less than 256.

Description

Small values of I_MPI_SHM_HEAP_CSIZE may reduce overall shared memory consumption. Larger values of this variable may speed up MPI_Alloc_mem/MPI_Free_mem.

I_MPI_SHM_HEAP_OPT

Change the optimization mode of Intel MPI Library custom allocator of shared memory.

Syntax

I_MPI_SHM_HEAP_OPT=<mode>

Argument

Mode Optimization Mode
rank In this mode, each rank has its own dedicated amount of shared memory. This is the default value when I_MPI_SHM_HEAP=1
numa In this mode, all ranks from the same NUMA node use the same amount of shared memory.

Description

It is recommended to use I_MPI_SHM_HEAP_OPT=rank when each rank uses the same amount of memory, and I_MPI_SHM_HEAP_OPT=numa when ranks use significantly different amounts of memory.

Usually, I_MPI_SHM_HEAP_OPT=rank works faster than I_MPI_SHM_HEAP_OPT=numa, but the numa optimization mode may consume less shared memory.

I_MPI_WAIT_MODE

Control the Intel® MPI Library optimization for oversubscription mode.

Syntax

I_MPI_WAIT_MODE=<arg>

Arguments

Argument Description
0 Optimize MPI application to work in the normal mode (1 rank on 1 CPU). This is the default value if the number of processes on a computation node is less than or equal to the number of CPUs on the node.
1 Optimize MPI application to work in the oversubscription mode (multiple ranks on 1 CPU). This is the default value if the number of processes on a computation node is greater than the number of CPUs on the node.

Description

It is recommended to use this variable in the oversubscription mode. The mode is available for the intra and internode paths.

Additionally, for the internode case, I_MPI_OFI_WAIT_MODE enables the OFI wait object for the psm3 provider in the I_MPI_FABRICS=ofi scenario. In that case, the following psm3 environment variables are also set:

  • PSM3_NIC_LOOPBACK=1
  • PSM3_DEVICES=self,nic
  • FI_PSM3_YIELD_MODE=1
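
The default selection rule from the Arguments table can be written as a short check. This is a sketch of the documented rule only: default_wait_mode is a hypothetical helper, and how the library counts CPUs on a node is not specified here.

```shell
# default_wait_mode is a hypothetical helper that mirrors the documented
# default: 0 when processes per node <= CPUs per node, otherwise 1.
# Usage: default_wait_mode <processes_on_node> <cpus_on_node>
default_wait_mode() {
    nprocs=$1
    ncpus=$2
    if [ "$nprocs" -le "$ncpus" ]; then
        echo 0
    else
        echo 1
    fi
}

default_wait_mode 4 8    # normal mode: prints 0
default_wait_mode 16 8   # oversubscription mode: prints 1
```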

I_MPI_THREAD_YIELD

Control the Intel® MPI Library thread yield customization during MPI busy wait time.

Syntax

I_MPI_THREAD_YIELD=<arg>

Arguments

Argument Description
0 Do nothing for thread yield during the busy wait (spin wait). This is the default value when I_MPI_WAIT_MODE=0
1 Do the pause processor instruction for I_MPI_PAUSE_COUNT during the busy wait.
2 Do the sched_yield() system call for thread yield during the busy wait. This is the default value when I_MPI_WAIT_MODE=1
3 Do the usleep() system call for I_MPI_THREAD_SLEEP number of microseconds for thread yield during the busy wait.

Description

It is recommended to use I_MPI_THREAD_YIELD=0 or I_MPI_THREAD_YIELD=1 in the normal mode and I_MPI_THREAD_YIELD=2 or I_MPI_THREAD_YIELD=3 in the oversubscription mode.

I_MPI_PAUSE_COUNT

Control the Intel® MPI Library pause count for the thread yield customization during MPI busy wait time.

Syntax

I_MPI_PAUSE_COUNT=<arg>

Argument

Argument Description
>=0 Pause count for thread yield customization during MPI busy wait time. The default value is 0. Normally, the value is less than 100.

Description

This variable is applicable when I_MPI_THREAD_YIELD=1. Small values of I_MPI_PAUSE_COUNT may increase performance, while larger values may reduce energy consumption.

I_MPI_SPIN_COUNT

Control the spin count value.

Syntax

I_MPI_SPIN_COUNT=<scount>

Argument

Argument Description
<scount> Define the loop spin count when polling fabric(s).
>=0 The default <scount> value is equal to 1 when more than one process runs per processor/core. Otherwise the value equals 2000. The maximum value is equal to 2147483647.

Description

Set the spin count limit. The loop for polling the fabric(s) spins <scount> times before the library releases the processes if no incoming messages are received for processing. Smaller values for <scount> cause the Intel® MPI Library to release the processor more frequently.

Use the I_MPI_SPIN_COUNT environment variable for tuning application performance. The best value for <scount> can be chosen on an experimental basis. It depends on the particular computational environment and application.
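
Choosing <scount> experimentally amounts to running the same job with several candidate values. The sketch below shows one way to structure such a sweep. sweep_spin_count is a hypothetical helper, the candidate values are arbitrary, and the stand-in command only prints the value it sees. Replace it with a timed launch of the real application.

```shell
# sweep_spin_count is a hypothetical helper. It runs the given command once
# per candidate value, with I_MPI_SPIN_COUNT set only for that command.
# Usage: sweep_spin_count "<space-separated values>" <command> [args...]
sweep_spin_count() {
    values=$1
    shift
    for scount in $values; do
        I_MPI_SPIN_COUNT=$scount "$@"
    done
}

# Stand-in command for illustration; a real sweep would launch the
# application instead, for example: mpirun -n 4 ./a.out
sweep_spin_count "1 100 2000" sh -c 'echo "child sees I_MPI_SPIN_COUNT=$I_MPI_SPIN_COUNT"'
```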

I_MPI_THREAD_SLEEP

Control the Intel® MPI Library thread sleep timeout, in microseconds, for thread yield customization during MPI busy wait.

Syntax

I_MPI_THREAD_SLEEP=<arg>

Argument

Argument Description
>=0 Thread sleep microseconds timeout. The default value is 0. Normally, the value is less than 100.

Description

This variable is applicable when I_MPI_THREAD_YIELD=3. Small values of I_MPI_THREAD_SLEEP may increase performance in the normal mode, while larger values may increase performance in the oversubscription mode.

I_MPI_EXTRA_FILESYSTEM

Control native support for parallel file systems.

Syntax

I_MPI_EXTRA_FILESYSTEM=<arg>

Argument

Argument Description
enable | yes | on | 1 Enable native support for parallel file systems.
disable | no | off | 0 Disable native support for parallel file systems. This is the default value.

Description

Use this environment variable to enable or disable native support for parallel file systems. This environment variable is deprecated.

I_MPI_EXTRA_FILESYSTEM_FORCE

Syntax

I_MPI_EXTRA_FILESYSTEM_FORCE=<ufs|nfs|gpfs|panfs|lustre|daos>

Description

Force filesystem recognition logic. Setting this variable is equivalent to prefixing all paths in MPI-IO calls with the selected filesystem plus colon. This environment variable is deprecated.
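
The "file system plus colon" prefix mentioned above has the following shape. The sketch only builds the prefixed path string for illustration. with_fs_prefix is a hypothetical helper and /scratch/out.dat is a made-up path.

```shell
# with_fs_prefix is a hypothetical helper that builds the prefixed path form
# described above: <filesystem>:<path>.
# Usage: with_fs_prefix <ufs|nfs|gpfs|panfs|lustre|daos> <path>
with_fs_prefix() {
    case $1 in
        ufs|nfs|gpfs|panfs|lustre|daos)
            printf '%s:%s\n' "$1" "$2" ;;
        *)
            echo "unsupported file system: $1" >&2
            return 1 ;;
    esac
}

with_fs_prefix lustre /scratch/out.dat
# prints: lustre:/scratch/out.dat
```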

I_MPI_EXTRA_FILESYSTEM_NFS_DIRECT

Syntax

I_MPI_EXTRA_FILESYSTEM_NFS_DIRECT=<arg>

Argument

Argument Description
enable | yes | on | 1 Enable native support for parallel file systems. This is the default value.
disable | no | off | 0 Disable native support for parallel file systems.

Description

Turn on NFS bypassing cache to achieve sequential consistency among all accesses using a single file handle. This environment variable is deprecated.

I_MPI_FILESYSTEM

Turn on/off native parallel file systems support. If set, I_MPI_EXTRA_FILESYSTEM is ignored.

Syntax

I_MPI_FILESYSTEM=<arg>

Argument

Argument Description
disable | no | off | 0 Disable native support for parallel file systems. This is the default value.
enable | yes | on | 1 Enable native support for parallel file systems.

I_MPI_FILESYSTEM_FORCE

Force Intel MPI to use a specific driver for a file system. If set, I_MPI_EXTRA_FILESYSTEM_FORCE is ignored.

Syntax

I_MPI_FILESYSTEM_FORCE=<ufs|nfs|gpfs|panfs|lustre|daos>

I_MPI_FILESYSTEM_CB_NODES

Explicitly set the MPI-IO hint cb_nodes for all MPI-IO file handles, overriding user info set at runtime. Non-positive values are ignored.

Syntax

I_MPI_FILESYSTEM_CB_NODES=<arg>

Argument

Argument Description
Any positive integer Maximum number of collective I/O aggregators for all collective I/O operations.
Non-positive integer Ignored. The default value is -1.

I_MPI_FILESYSTEM_CB_CONFIG_LIST

Explicitly set the MPI-IO hint cb_config_list for all MPI-IO file handles, which overrides user information set at runtime.

Syntax

I_MPI_FILESYSTEM_CB_CONFIG_LIST=<arg>

Argument

Argument Description
"*:<proc>" Place <proc> number of I/O aggregators per node. <proc> should be a positive integer.
"" Ignored. This is the default value.

I_MPI_FILESYSTEM_NFS_DIRECT

Enable NFS bypassing cache to achieve sequential consistency among all accesses using a single file handle. If set, I_MPI_EXTRA_FILESYSTEM_NFS_DIRECT is ignored.

Syntax

I_MPI_FILESYSTEM_NFS_DIRECT=<arg>

Argument

Argument Description
disable | no | off | 0 Disable native support for parallel file systems.
enable | yes | on | 1 Enable native support for parallel file systems. This is the default value.

I_MPI_FILESYSTEM_GPFS_DIRECT

Enable GPFS bypassing cache to achieve sequential consistency among all accesses using a single file handle.

Syntax

I_MPI_FILESYSTEM_GPFS_DIRECT=<arg>

Argument

Argument Description
disable | no | off | 0 Disable native support for GPFS sequential consistency. This is the default value.
enable | yes | on | 1 Enable native support for GPFS sequential consistency.

I_MPI_MULTIRAIL

Syntax

I_MPI_MULTIRAIL=<arg>

Argument

Argument Description
1 Enable multi-rail capability.
0 Disable multi-rail capability. This is the default value.

Description

Set this variable to enable multi-rail capability and identify NICs serviced by the provider. Pick this variable on the same NUMA.

I_MPI_SPAWN

Syntax

I_MPI_SPAWN=<arg>

Argument

Argument Description
enable | yes | on | 1 Enable support of dynamic processes.
disable | no | off | 0 Disable support of dynamic processes. This is the default value.

Description

Use this environment variable to enable or disable dynamic processes and MPI-port support.

The dynamic processes infrastructure can conflict with optimizations or require extra communication during bootstrap, so this feature is disabled by default. This control is mandatory for applications that use dynamic processes.

Note

Due to limitations, the MLX provider does not support MPI-port operations (for example, MPI_Open_port and MPI_Comm_connect) out of the box with I_MPI_SPAWN enabled.

To support these operations, set FI_MLX_NS_ENABLE=1.
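
The two settings can be combined in a small setup step. This is a sketch under one assumption: the provider is selected explicitly through I_MPI_OFI_PROVIDER. If the provider is chosen automatically, the check below will not detect it. enable_spawn is a hypothetical helper.

```shell
# enable_spawn is a hypothetical helper. It enables dynamic process support
# and, when the MLX provider is selected explicitly, the MLX name service.
# Assumption: the provider is chosen through I_MPI_OFI_PROVIDER.
enable_spawn() {
    export I_MPI_SPAWN=1
    if [ "${I_MPI_OFI_PROVIDER:-}" = "mlx" ]; then
        export FI_MLX_NS_ENABLE=1
    fi
}

enable_spawn
echo "I_MPI_SPAWN=$I_MPI_SPAWN"
```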