Intel® MPI Library provides the following tuning utilities: Autotuner and mpitune.
Autotuner is the recommended utility for application-specific tuning. If an application spends significant time in MPI collective operations, autotuning might improve its performance. Autotuner is easy to use, and its overhead is close to zero.
The autotuner's tuning scope is the I_MPI_ADJUST_<opname> family of environment variables, which select the algorithms used for MPI collective operations. The autotuner limits tuning to the current cluster configuration (fabric, number of ranks, and number of ranks per node). It works while the application is running, so performance can potentially improve just by enabling it. It can also generate a new tuning file with MPI collective operations adjusted to the application's needs; you can then pass this file to the I_MPI_TUNING_BIN variable.
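The two autotuner workflows described here (tune while running, then reuse a generated tuning file) can be sketched as a small Python helper that prepares the environment for a launch. This is a minimal sketch, not a definitive recipe: the text above names only I_MPI_TUNING_BIN, so the variables I_MPI_TUNING_MODE and I_MPI_TUNING_BIN_DUMP, the file name `tuning.dat`, and the application path `./my_app` are assumptions to check against the Developer Reference for your library version.

```python
import os


def autotuner_env(dump_file=None, tuning_bin=None, base_env=None):
    """Return a copy of the environment with autotuner variables set.

    tuning_bin: path to an existing tuning file to reuse (I_MPI_TUNING_BIN).
    dump_file:  path where a new tuning file should be written
                (assumed variable: I_MPI_TUNING_BIN_DUMP).
    """
    env = dict(os.environ if base_env is None else base_env)
    if tuning_bin is not None:
        # Reuse a previously generated tuning file; no new tuning is requested.
        env["I_MPI_TUNING_BIN"] = tuning_bin
    else:
        # Assumed variable and value for enabling the autotuner at run time.
        env["I_MPI_TUNING_MODE"] = "auto"
        if dump_file is not None:
            env["I_MPI_TUNING_BIN_DUMP"] = dump_file
    return env


def launch_command(num_ranks, ranks_per_node, app):
    """Build an mpirun command line; the application path is hypothetical."""
    return ["mpirun", "-n", str(num_ranks), "-ppn", str(ranks_per_node), app]


if __name__ == "__main__":
    # Step 1: run with the autotuner enabled and dump a tuning file.
    tune_env = autotuner_env(dump_file="tuning.dat", base_env={})
    # Step 2: later runs pass the generated file back in.
    reuse_env = autotuner_env(tuning_bin="tuning.dat", base_env={})
    cmd = launch_command(4, 2, "./my_app")
    print(tune_env, reuse_env, cmd)
    # To actually launch: subprocess.run(cmd, env={**os.environ, **tune_env})
```

Because the autotuner limits tuning to the current cluster configuration, a dumped file is best reused with the same fabric, number of ranks, and ranks per node that produced it.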
mpitune is useful if the search space of the autotuner is not sufficient for your needs. It iteratively launches a benchmarking application with different configurations, measures performance, and stores the results of each launch. Based on these results, the tuner generates optimal values for the parameters being tuned. mpitune can search for optimal values of variables other than I_MPI_ADJUST_<opname>, and it can be used for both application-specific and cluster-wide tuning. For example, it can tune parameters of collective operations, such as radix.
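The launch, measure, store, and select loop described above can be illustrated with a short Python sketch. It is a conceptual illustration only and does not reproduce mpitune's implementation or command-line interface: the function names, the candidate radix values, and the stand-in cost model are all hypothetical.

```python
from typing import Callable, Dict, Iterable, Tuple


def search_best_setting(
    candidates: Iterable[int],
    measure: Callable[[int], float],
) -> Tuple[int, Dict[int, float]]:
    """Try every candidate value, record its measured time, return the fastest.

    Raises ValueError if no candidates are given.
    """
    results: Dict[int, float] = {}
    for value in candidates:
        # One "launch" per configuration; the result of each launch is stored.
        results[value] = measure(value)
    # The value with the smallest measured time is taken as optimal.
    best = min(results, key=lambda v: results[v])
    return best, results


def fake_benchmark(radix: int) -> float:
    """Hypothetical stand-in for a real benchmark run (lower is better).

    A real tuner would launch an MPI benchmark here and time it.
    """
    return abs(radix - 4) + 1.0


if __name__ == "__main__":
    best, results = search_best_setting([2, 4, 8], fake_benchmark)
    print("measurements:", results)
    print("best radix:", best)
```

An exhaustive search like this costs one full benchmark run per candidate, which is consistent with mpitune having a higher tuning overhead than the autotuner.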
Differences between the tuning utilities:
| Feature | Autotuner | mpitune |
|---|---|---|
| Low tuning overhead | + | - |
| Ease of use | + | - |
| Tuning beyond collective operations | - | + |