Intel® MPI Library Release Notes for Linux* OS

ID 763883
Updated 11/26/2024
Version 2021.14.1
Public


Overview

Intel® MPI Library for Linux* OS is a high-performance, interconnect-independent, multi-fabric library implementation of the industry-standard Message Passing Interface, v4.0 (MPI-4.0).

To receive technical support and updates, you need to register your product copy. See Technical Support below.

Key Features

This release of the Intel® MPI Library supports the following major features:

  • MPI-1, MPI-2.2, MPI-3.1, and MPI-4.0
  • Interconnect independence
  • C, C++, Fortran 77, Fortran 90, and Fortran 2008 language bindings
  • Amazon* AWS/EFA, Google* GCP support
  • Intel GPU pinning support
  • Intel and Nvidia GPU buffers support
  • PMIx Support

Product Contents

  • The Intel® MPI Library Runtime Environment (RTO) contains the tools you need to run programs, including the scalable process management system (Hydra), supporting utilities, and shared (.so) libraries.
  • The Intel® MPI Library Development Kit (SDK) includes all of the Runtime Environment components and compilation tools: compiler wrapper scripts, include files and modules, static (.a) libraries, debug libraries, and test codes (see the example below).

You can redistribute the library under conditions specified in the License.
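
The SDK's compiler wrappers and the Hydra process manager are typically exercised with a small test program. Below is a minimal sketch; the wrapper name (mpiicx) and the setvars.sh step are taken from elsewhere in these notes, and the install path is a placeholder to adjust for your environment:

    /* hello_mpi.c - minimal program to verify the compile-and-run toolchain.
     *
     * Typical build and run steps (paths are illustrative):
     *   source <install-dir>/setvars.sh
     *   mpiicx hello_mpi.c -o hello_mpi
     *   mpirun -n 4 ./hello_mpi
     */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank = 0, size = 0, name_len = 0;
        char name[MPI_MAX_PROCESSOR_NAME];

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        MPI_Get_processor_name(name, &name_len);

        printf("Hello from rank %d of %d on %s\n", rank, size, name);

        MPI_Finalize();
        return 0;
    }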

What's New

Intel® MPI Library 2021 Update 14.1

  • Fixed a GPU cached-buffer deallocation issue during cache invalidation.
  • Bug fixes.

Intel® MPI Library 2021 Update 14.0

  • Intel® Xeon® 6 tuning and optimizations for both scale-out and scale-up. Improved CPU pinning library for better balancing on asymmetric CPU topologies.
  • MPI-4.0 compliance: support for partitioned communication (see the sketch after this list), improved error handling, and Fortran 2008 support
  • MPI_Allreduce and MPI_Reduce scale-up and scale-out optimizations for Intel GPUs
  • OFI and provider updates to the latest open-source versions
  • Bug fixes
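
As a point of reference for the partitioned-communication item above, here is a minimal sketch using the standard MPI-4.0 partitioned point-to-point API (MPI_Psend_init, MPI_Precv_init, MPI_Pready). This is a generic standard-level example, not an Intel-specific interface, and it assumes a run with at least two ranks:

    /* partitioned.c - sketch of MPI-4.0 partitioned send/receive, rank 0 -> rank 1. */
    #include <mpi.h>
    #include <stdio.h>

    #define PARTITIONS 4
    #define COUNT      1024   /* elements per partition */

    int main(int argc, char **argv)
    {
        int rank;
        static double buf[PARTITIONS * COUNT];
        MPI_Request req;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {
            /* One persistent partitioned send; each partition is marked ready
               independently, for example as worker threads finish filling it. */
            MPI_Psend_init(buf, PARTITIONS, COUNT, MPI_DOUBLE, 1, 0,
                           MPI_COMM_WORLD, MPI_INFO_NULL, &req);
            MPI_Start(&req);
            for (int p = 0; p < PARTITIONS; p++) {
                for (int i = 0; i < COUNT; i++)
                    buf[p * COUNT + i] = p + i;   /* fill partition p */
                MPI_Pready(p, req);               /* partition p may now be sent */
            }
            MPI_Wait(&req, MPI_STATUS_IGNORE);
            MPI_Request_free(&req);
        } else if (rank == 1) {
            MPI_Precv_init(buf, PARTITIONS, COUNT, MPI_DOUBLE, 0, 0,
                           MPI_COMM_WORLD, MPI_INFO_NULL, &req);
            MPI_Start(&req);
            MPI_Wait(&req, MPI_STATUS_IGNORE);
            MPI_Request_free(&req);
            printf("rank 1 received %d partitions\n", PARTITIONS);
        }

        MPI_Finalize();
        return 0;
    }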

Intel® MPI Library 2021 Update 13.1

  • Bug fixes

Intel® MPI Library 2021 Update 13

  • NIC pinning support for threads
  • MPI-4.0: support for MPI I/O large counts (see the I/O sketch after this list)
  • Intel GPU-aware MPI_Bcast optimizations
  • MPI-3 RMA for Intel GPUs: performance optimizations for peer-to-peer, device-initiated communication
  • 5th Gen Intel® Xeon® Scalable Processors platform tuning on InfiniBand for the OFI/MLX provider
  • Tuning for cloud providers: GCP on C3 instances
  • Intel MPI thread split intranode optimizations
  • Xeon 6E supported out of the box
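
To illustrate the large-count MPI I/O item above, the following sketch uses the standard MPI-4.0 large-count ("_c") bindings, in which counts are passed as MPI_Count rather than int. The file name and sizes are illustrative only, and the count is kept small so the sketch stays cheap to run:

    /* largecount_io.c - sketch of an MPI-4.0 large-count collective file write. */
    #include <mpi.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        int rank;
        MPI_File fh;
        MPI_Count count = 1 << 20;   /* per-rank element count as MPI_Count; the
                                        "_c" bindings also accept counts > 2^31-1 */
        double *buf;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        buf = malloc((size_t)count * sizeof(double));
        for (MPI_Count i = 0; i < count; i++)
            buf[i] = (double)rank;

        MPI_File_open(MPI_COMM_WORLD, "largecount.dat",
                      MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

        /* Each rank writes its block at a rank-dependent byte offset. */
        MPI_Offset offset = (MPI_Offset)rank * count * (MPI_Offset)sizeof(double);
        MPI_File_write_at_all_c(fh, offset, buf, count, MPI_DOUBLE,
                                MPI_STATUS_IGNORE);

        MPI_File_close(&fh);
        free(buf);
        MPI_Finalize();
        return 0;
    }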

Intel® MPI Library 2021 Update 12.1

  • Bug fix for missing symbols (MPI_Status_c2f, MPI_Status_f2c)

Intel® MPI Library 2021 Update 12

  • MPI-4.0: persistent collectives; large counts for RMA and derived datatypes
  • GPU RMA: performance optimizations for device-initiated communication
  • Optimizations for GPU collectives
  • Improved performance on Amazon Web Services* (OFI/efa) on ICX
  • New control variables for NIC pinning/assignment: I_MPI_OFI_NIC_LIST and I_MPI_OFI_NIC_AFFINITY
  • For the 2021.12 release, the Third Party Programs file has been included as a section in this product’s release notes rather than as a separate text file.
  • Bug fixes

Intel® MPI Library 2021 Update 11

  • MPI-4.0 Sessions support (see the sketch after this list)
  • Nvidia* GPU support (I_MPI_OFFLOAD, I_MPI_OFFLOAD_MODE, I_MPI_OFFLOAD_CUDA_LIBRARY)
  • Performance optimizations for GPU collectives and point-to-point operations with small message sizes
  • MPI GPU RMA (host- and GPU-initiated modes)
  • New NIC assignment infrastructure.
  • PMIx Support
  • Directory Layout
    Directory layout is improved across all products to streamline installation and setup.
    The Unified Directory Layout is implemented in 2024.0. If you have multiple toolkit versions installed, the Unified layout ensures that your development environment contains the correct component versions for each installed version of the toolkit.
    The directory layout used before 2024.0, the Component Directory Layout, is still supported on new and existing installations.
    For detailed information about the Unified layout, including how to initialize the environment and advantages with the Unified layout, refer to Use the setvars and oneapi-vars Scripts with Linux.

  • Bug fixes
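
For the Sessions item above, here is a minimal sketch of the standard MPI-4.0 Sessions model. The "mpi://WORLD" process set name is defined by the MPI standard; the string tag is an arbitrary illustrative value:

    /* sessions.c - sketch of creating a communicator via MPI-4.0 Sessions. */
    #include <mpi.h>
    #include <stdio.h>

    int main(void)
    {
        MPI_Session session;
        MPI_Group group;
        MPI_Comm comm;
        int rank, size;

        MPI_Session_init(MPI_INFO_NULL, MPI_ERRORS_RETURN, &session);

        /* Build a communicator from the standard "mpi://WORLD" process set. */
        MPI_Group_from_session_pset(session, "mpi://WORLD", &group);
        MPI_Comm_create_from_group(group, "example.release-notes.tag",
                                   MPI_INFO_NULL, MPI_ERRORS_RETURN, &comm);
        MPI_Group_free(&group);

        MPI_Comm_rank(comm, &rank);
        MPI_Comm_size(comm, &size);
        printf("session rank %d of %d\n", rank, size);

        MPI_Comm_free(&comm);
        MPI_Session_finalize(&session);
        return 0;
    }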

Intel® MPI Library 2021 Update 10

  • Intel MPI performance optimizations for new Intel® Xeon® platforms
  • Intel MPI performance optimizations with Intel GPUs and InfiniBand*
  • New control variable for GPU pinning: I_MPI_OFFLOAD_PIN must be used instead of I_MPI_OFFLOAD_TOPOLIB
  • New wrapper scripts for LLVM-based compilers (mpiicx, mpiicpx, mpiifx)
  • Large counts support for ILP64 (point-to-point, collectives)
  • Wait mode support (technical preview)
  • Bug fixes.

Intel® MPI Library 2021 Update 9

  • Optimizations for GPU collectives with small message sizes
  • Optimized pinning for hybrid CPUs with P-cores and E-cores
  • MPI-4.0 large counts (technical preview; C interface, collectives only)
  • Bug fixes.

Intel® MPI Library 2021 Update 8

  • Intel® MPI Library provides initial support for the Intel® Data Center GPU Max Series (formerly code-named Ponte Vecchio), utilizing Xe Link for direct GPU-to-GPU communication
  • Intel® MPI Library speeds up cluster application performance by utilizing the new embedded Intel® Data Streaming Accelerator in 4th Gen Intel® Xeon® Scalable Processors (formerly code-named Sapphire Rapids).
  • Intel® MPI Library 2021.8 has implemented performance optimizations for Intel GPUs and Intel® Xeon® CPU Max Series.
  • Bug fixes.

Intel® MPI Library 2021 Update 7.1

  • Intel® MPI Library 2021.7.1 has been updated to include functional and security updates. Users should update to the latest version as it becomes available.

Intel® MPI Library 2021 Update 7

  • Bug fixes.

Intel® MPI Library 2021 Update 6

  • Get better resource planning and control at the application level with GPU pinning, which is important for controlling multiple ranks offloading to the GPU simultaneously.
  • Improve your application's internode communication bandwidth and/or reduce latency between processors and nodes with multi-rail support.
  • Bug fixes.

Intel® MPI Library 2021 Update 5

  • Improved performance on Google Cloud Platform*(OFI/tcp) and Amazon Web Services*(OFI/efa)
  • Converged release and release_mt libraries. All features previously available only in release_mt are now available in the release library
  • Bug fixes

Intel® MPI Library 2021 Update 4

  • Improved performance on Google Cloud Platform*
  • Improved startup time
  • Bug fixes

Intel® MPI Library 2021 Update 3.1

  • Improved stability and performance for Amazon Web Services* (OFI/efa) and Google Cloud Platform* (OFI/tcp)
  • Bug fixes

Intel® MPI Library 2021 Update 3

  • Added 3rd Generation Intel® Xeon® Scalable Processors support
  • Performance improvements for Mellanox* ConnectX®-6 (HDR) and Intel® Ethernet 800 Series
  • Added OFI/psm3 integration
  • Bug fixes

Intel® MPI Library 2021 Update 2

  • Performance tuning for Intel® Ethernet 800 Series Network Adapters
  • Performance and stability improvements for the OFI/tcp provider
  • Spawn stability improvements
  • Mellanox* OFED 5.2 support
  • Technology preview: extended support for Singularity containers for IBM* Spectrum* LSF* and SLURM
  • Bug fixes

Intel® MPI Library 2021 Update 1

  • Amazon* AWS/EFA, Google* GCP support enhancements
  • Intel GPU pinning support (I_MPI_OFFLOAD_TOPOLIB, I_MPI_OFFLOAD_DOMAIN_SIZE, I_MPI_OFFLOAD_CELL, I_MPI_OFFLOAD_DEVICES, I_MPI_OFFLOAD_DEVICE_LIST, I_MPI_OFFLOAD_DOMAIN)
  • Intel GPU buffers support (I_MPI_OFFLOAD)
  • Optimizations for Intel® Xeon® Platinum 9282/9242/9222/9221 family
  • Mellanox* ConnectX*-3/4/5/6 (FDR/EDR/HDR) support enhancements
  • Distributed Asynchronous Object Storage (DAOS) file system support
  • mpitune_fast functionality improvements
  • PMI2 spawn support
  • Bug fixes

Intel® MPI Library 2019 Update 12

  • Bug fixes

Intel® MPI Library 2019 Update 11

  • Added Mellanox* OFED 5.2 support
  • Bug fixes

Intel® MPI Library 2019 Update 10

  • Performance optimizations for Intel® Ethernet 800 Series
  • Enabled the Message Queue Support API (TotalView* HPC Debugging Software message queue support)
  • Bug fixes

Intel® MPI Library 2019 Update 9

  • MPI_Comm_accept/connect/join support for Mellanox* provider
  • mpitune_fast functionality improvements
  • Intel® Ethernet 800 Series support
  • Intel GPU buffers support enhancements (I_MPI_OFFLOAD) (technical preview)
  • I_MPI_ADJUST_SENDRECV_REPLACE optimization
  • oneAPI compiler support in mpicc/mpif90/mpif77 wrappers
  • Fixed MPI-IO operations on LUSTRE filesystem for files larger than 2 GB
  • Bug fixes

Intel® MPI Library 2019 Update 8

  • InfiniBand* support enhancements for all supported platforms
  • Amazon* AWS/EFA, Google* GCP support enhancements
  • Intel GPU pinning support (I_MPI_OFFLOAD_TOPOLIB, I_MPI_OFFLOAD_DOMAIN_SIZE, I_MPI_OFFLOAD_CELL, I_MPI_OFFLOAD_DEVICES, I_MPI_OFFLOAD_DEVICE_LIST, I_MPI_OFFLOAD_DOMAIN) (technical preview)
  • Distributed Asynchronous Object Storage (DAOS) file system support
  • Intel® Xeon® Platinum 9282/9242/9222/9221 family optimizations and platform recognition
  • ILP64 support improvements
  • PMI2 spawn support
  • impi_info tool extensions (-e|-expert option)
  • Bug fixes

Intel® MPI Library 2019 Update 7

  • Performance optimizations for Intel® Xeon® Platinum 9200 (formerly Cascade Lake-AP)
  • Implemented dynamic processes support in OFI/mlx provider
  • Added integrity checks for parameters of Fortran ILP64 interface in debug library
  • Added PMI2 support
  • Fixed issue with MPI_Allreduce at large scale
  • Fixed issue with MPI-IO operations on GPFS
  • Fixed issue with MPI-IO with 2+ GiB files on NFS
  • Bug fixes

Intel® MPI Library 2019 Update 6

  • Improved Mellanox* InfiniBand* EDR/HDR interconnect support
  • Improved Amazon* Elastic Fabric Adapter (EFA) support.
  • Added performance optimizations for Intel® Xeon® Platinum 9200 (formerly Cascade Lake-AP)
  • Added non-blocking collective operations support for Autotuner
  • Bug fixes

Intel® MPI Library 2019 Update 5

  • Added autotuner functionality (I_MPI_TUNING_MODE, I_MPI_ADJUST_<opname>_LIST)
  • Added basic “Wait Mode” support (I_MPI_WAIT_MODE)
  • Added AWS EFA (Elastic Fabric Adapter) support
  • Added OFI/mlx provider as a technical preview for Mellanox EDR/HDR (FI_PROVIDER=mlx)
  • Added Mellanox HCOLL support (I_MPI_COLL_EXTERNAL)
  • Added shared memory allocator (I_MPI_SHM_HEAP, I_MPI_SHM_HEAP_VSIZE, I_MPI_SHM_HEAP_CSIZE, I_MPI_SHM_HEAP_OPT)
  • Added transparent Singularity (3.0+) containers support
  • Added dynamic I_MPI_ROOT path for bash shell
  • Improved memory consumption of OFI/verbs path (FI_PROVIDER=verbs)
  • Improved single node startup time (I_MPI_FABRICS=shm)
  • Disabled environment variables spellchecker by default (I_MPI_VAR_CHECK_SPELLING, I_MPI_REMOVED_VAR_WARNING)
  • Bug fixes

Intel® MPI Library 2019 Update 4

  • Multiple Endpoints (Multi-EP) support for InfiniBand* and Ethernet
  • Implemented the NUMA-aware SHM-based Bcast algorithm (I_MPI_ADJUST_BCAST)
  • Added the application runtime autotuning (I_MPI_TUNING_AUTO)
  • Added the -hosts-group option to set node ranges using square brackets, commas, and dashes (for example, nodeA[01-05],nodeB)
  • Added the ability to terminate a job if it has not been started successfully during a specified time period in seconds (I_MPI_JOB_STARTUP_TIMEOUT)
  • Added support for trusting the IBM POE* process placement
  • Bug fixes

Intel® MPI Library 2019 Update 3

  • Performance improvements
  • Custom memory allocator is added and available by default in release and debug configurations (I_MPI_MALLOC)
  • MPI-IO enhancements (I_MPI_EXTRA_FILESYSTEM)
  • Bug fixes

Intel® MPI Library 2019 Update 2

  • Intel® MPI Library 2019 Update 2 includes functional and security updates. Users should update to the latest version.

Intel® MPI Library 2019 Update 1

  • Performance improvements
  • Conditional Numerical Reproducibility feature is added (I_MPI_CBWR variable)
  • Customized Libfabric 1.7.0 alpha sources and binaries are updated
  • Internal OFI distribution is now used by default (I_MPI_OFI_LIBRARY_INTERNAL=1)
  • OFI*-capable Network Fabrics Control is partially restored (I_MPI_OFI_MAX_MSG_SIZE, I_MPI_OFI_LIBRARY)
  • OFI/tcp provider is added as a technical preview feature
  • Platform recognition is restored (I_MPI_PLATFORM* variables)
  • Spellchecker is added for I_MPI_* variables (I_MPI_VAR_CHECK_SPELLING variable)
  • Multiple bug fixes

Intel® MPI Library 2019

  • Customized Libfabric 1.6.1 sources are included
  • Customized Libfabric 1.6.1 with sockets, psm2, and verbs providers binaries are included
  • PSM2 Multiple Endpoints (Multi-EP) support
  • Asynchronous progress is added as a technical preview feature
  • Multiple bug fixes

Intel® MPI Library 2018 Update 5

  • Bug fixes

Intel® MPI Library 2018 Update 4

  • Bug fixes

Intel® MPI Library 2018 Update 3

  • Performance improvements

Intel® MPI Library 2018 Update 2

  • Improved shm performance with collective operations (I_MPI_SCHED_YIELD, I_MPI_SCHED_YIELD_MT_OPTIMIZATION)
  • Intel® MPI Library is now available to install in YUM and APT repositories

Intel® MPI Library 2018 Update 1

  • Improved startup performance on many/multicore systems (I_MPI_STARTUP_MODE)
  • Bug fixes

Intel® MPI Library 2018

  • Improved startup times for Hydra when using shm:ofi or shm:tmi
  • Hard finalization is now the default
  • The default fabric list is changed when Cornelis* Omni-Path Architecture is detected
  • Added environment variables: I_MPI_OFI_ENABLE_LMT, I_MPI_OFI_MAX_MSG_SIZE, I_MPI_{C,CXX,FC,F}FLAGS, I_MPI_LDFLAGS, I_MPI_FORT_BIND
  • Removed support for the Intel® Xeon Phi™ coprocessor (code named Knights Corner)
  • I_MPI_DAPL_TRANSLATION_CACHE, I_MPI_DAPL_UD_TRANSLATION_CACHE and I_MPI_OFA_TRANSLATION_CACHE are now disabled by default
  • Deprecated support for the IPM statistics format
  • Documentation is now online

Intel® MPI Library 2017 Update 4

  • Performance tuning for processors based on Intel® microarchitecture codenamed Skylake and for Cornelis Omni-Path Architecture

Intel® MPI Library 2017 Update 3

  • Hydra startup improvements (I_MPI_JOB_FAST_STARTUP)
  • Default value change for I_MPI_FABRICS_LIST

Intel® MPI Library 2017 Update 2

  • Added environment variables I_MPI_HARD_FINALIZE and I_MPI_MEMORY_SWAP_LOCK

Intel® MPI Library 2017 Update 1

  • PMI-2 support for SLURM*, improved SLURM support by default
  • Improved mini help and diagnostic messages, man1 pages for mpiexec.hydra, hydra_persist, and hydra_nameserver
  • Deprecations:
    • Intel® Xeon Phi™ coprocessor (code named Knights Corner) support
    • Cross-OS launches support
    • DAPL, TMI, and OFA fabrics support

Intel® MPI Library 2017

  • Support for the MPI-3.1 standard
  • New topology-aware collective communication algorithms (I_MPI_ADJUST family)
  • Effective MCDRAM (NUMA memory) support. See the Developer Reference, section Tuning Reference > Memory Placement Policy Control for more information
  • Controls for asynchronous progress thread pinning (I_MPI_ASYNC_PROGRESS)
  • Direct receive functionality for the OFI* fabric (I_MPI_OFI_DRECV)
  • PMI2 protocol support (I_MPI_PMI2)
  • New process startup method (I_MPI_HYDRA_PREFORK)
  • Startup improvements for the SLURM* job manager (I_MPI_SLURM_EXT)
  • New algorithm for MPI-IO collective read operation on the Lustre* file system (I_MPI_LUSTRE_STRIPE_AWARE)
  • Debian Almquist (dash) shell support in compiler wrapper scripts and mpitune
  • Performance tuning for processors based on Intel® microarchitecture codenamed Broadwell and for Cornelis Omni-Path Architecture (Cornelis OPA)
  • Performance tuning for Intel® Xeon Phi™ Processor and Coprocessor (code named Knights Landing) and Cornelis OPA.
  • OFI latency and message rate improvements
  • OFI is now the default fabric for Cornelis OPA and Intel® True Scale Fabric
  • MPD process manager is removed
  • Dedicated pvfs2 ADIO driver is disabled
  • SSHM support is removed
  • Support for the Intel® microarchitectures older than the generation codenamed Sandy Bridge is deprecated
  • Documentation improvements

Known Issues and Limitations

  • Hang with 2021.6 and earlier on Red Hat Enterprise Linux*. See https://www.intel.com/content/www/us/en/developer/articles/troubleshooting/mpi-library-hang-with-rhel-8-6.html for details.
  • If vars.sh is sourced from another script with no explicit parameters, it inherits the parent script's options and may process matching ones.
  • stdout and stderr redirection may cause problems with LSF's blaunch.
    • The -verbose option may cause a crash with LSF's blaunch. Do not use the -verbose option, or set -bootstrap=ssh.
  • To use shared memory only and avoid network initialization on a single node, explicitly set I_MPI_FABRICS=shm.
  • An application may hang during finalization with the LSF job manager if the number of nodes is more than 16. The workaround is to set -bootstrap=ssh or -branch-count=-1.

  • The SLURM* option --cpus-per-task in combination with the Hydra option -bootstrap=slurm leads to incorrect pinning. Setting I_MPI_PIN_RESPECT_CPUSET=disable may fix this issue.

  • Incorrect process pinning may occur with I_MPI_PIN_ORDER=spread: some of the domains may share common sockets.

  • Nonblocking MPI-IO operations on NFS file systems may work incorrectly for files larger than 2 GB.
  • Some MPI-IO features may not work on NFS v3 mounted without the "lock" flag.
  • MPI-IO operations may work unreliably with NFSv3 on Red Hat* Enterprise Linux*/CentOS* 7.4 and 7.5 due to a bug in the OS kernel (versions 3.10.0-693.el7.x86_64 and 3.10.0-862.el7.x86_64, respectively).

  • MPI_Comm_spawn with the OFI/mlx provider does not work on non-IA platforms. As a workaround, the OFI/verbs provider may be used at small scale.

  • On hybrid processors with efficient E-cores and performance P-cores, such as 12th and 13th Gen Intel® Core™ processors, the Intel MPI Library might pin processes to E-cores, resulting in suboptimal overall performance. We recommend disabling pinning with I_MPI_PIN=no, or using I_MPI_PIN_PROCESSOR_LIST and/or I_MPI_PIN_PROCESSOR_EXCLUDE_LIST to explicitly set the process pinning.
  • HBW memory policies applied to window segments for RMA operations are not yet supported.
  • To use the cxi provider, set FI_PROVIDER=cxi, and set FI_PROVIDER_PATH and I_MPI_OFI_LIBRARY to point to a cxi-enabled libfabric. On machines with CXI older than 2.0, also set FI_UNIVERSE_SIZE=1024 to bypass a CXI bug that otherwise causes a crash. If you experience hangs when running with the CXI provider, or see messages about Cassini Event Queue overflow, try increasing FI_CXI_DEFAULT_CQ_SIZE to values ranging from 16384 to 131072. This is a known issue with the CXI provider.
  • When using 4th Generation Intel® Xeon® Scalable Processor nodes in SNC4 mode, the default CPU pinning (and in turn the NIC assignment) is not correct for multiples of 6 ranks, and the default GPU pinning is not correct for multiples of 8 ranks. In such cases, it is recommended to explicitly specify CPU, GPU, and NIC pinning using the corresponding control variables.
  • UCX 1.16.x contains a bug resulting in:
    Caught signal 8 (Floating point exception: floating-point invalid operation)
    The bug is fixed only by commit 1fdcd9f, which is not present in 1.16-rc1. Make sure to use a version of UCX that contains the bugfix.

Removals

Starting with Intel® MPI Library 2019, the deprecated, obsolete symbolic links and directory structure have been removed. If your application still depends on the old directory structure and file names, you can restore them using the script.

Intel® MPI Library 2021 Update 10

  • sockets provider

  • mpitune (replacement: mpitune_fast)

Intel® MPI Library 2021 Update 9

  • Intel® Xeon Phi™ 72xx processor support is removed
  • sockets provider will be removed starting with the 2021.10 release
  • mpitune will be removed starting with the 2021.10 release (use mpitune_fast instead)

Intel® MPI Library 2021 Update 5

  • Intel® Xeon Phi™ 72xx processor

Intel® MPI Library 2019 Update 7

  • Intel® Xeon Phi™ 72xx processor (formerly Knights Landing or KNL) support (since Intel® MPI Library 2019 Update 6)

Intel® MPI Library 2019 Update 5

  • Intel® Xeon Phi™ 72xx processor (formerly Knights Landing or KNL) support (since Intel® MPI Library 2019 Update 6)

Intel® MPI Library 2019 Update 4

  • The -binding command line option and a machine file parameter.
  • Red Hat* Enterprise Linux* 6 support.

Intel® MPI Library 2019 Update 1

  • SLURM startup improvement (I_MPI_SLURM_EXT variable).
  • I_MPI_OFI_ENABLE_LMT variable.

Intel® MPI Library 2019

  • Intel® True Scale Fabric Architecture support.
  • Removed the single-threaded library.
  • Parallel file systems (GPFS, Lustre, Panfs) are now supported natively; the bindings libraries were removed (removed I_MPI_EXTRA_FILESYSTEM*, I_MPI_LUSTRE* variables).
  • Llama support (removed I_MPI_YARN variable).
  • Wait Mode, Mellanox Multirail* support, and Checkpoint/Restart* features that depended on the substituted fabrics, and the related variables (I_MPI_CKPOINT*, I_MPI_RESTART, I_MPI_WAIT_MODE).
  • Hetero-OS support.
  • Support of platforms older than Sandy Bridge.
  • Multi-threaded memcpy support (removed I_MPI_MT* variables).
  • Statistics (I_MPI_STATS* variables).
  • Switch pinning method (removed I_MPI_PIN_MODE variable).
  • Process Management Interface (PMI) extensions (I_MPI_PMI_EXTENSIONS variables).

Legal Information

Intel technologies may require enabled hardware, software or service activation.

No product or component can be absolutely secure.

Your costs and results may vary.

© Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Other names and brands may be claimed as the property of others.

No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document.

The products described may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request.

Intel disclaims all express and implied warranties, including without limitation, the implied warranties of merchantability, fitness for a particular purpose, and non-infringement, as well as any warranty arising from course of performance, course of dealing, or usage in trade.

Technical Support

Every purchase of an Intel® Software Development Product includes a year of support services, which provides Priority Support at our Online Service Center website.

To get support, you need to register your product in the Intel® Registration Center. If your product is not registered, you will not receive Priority Support.

Additional Resources

Intel® MPI Library

Third Party Programs File