MPI_THREAD_SPLIT Programming Model
Communication patterns that comply with the thread-split model must not allow cross-thread access to MPI objects, so that no thread synchronization is needed, and must disambiguate message matching, so that each thread can be addressed separately and no more than one thread can match a given message at a time. When these conditions hold, you must notify the Intel MPI Library that the program complies with the thread-split model, that is, that it is safe to apply the high-performance optimization.
Each MPI_THREAD_SPLIT-compliant program can be executed correctly with a thread-compliant MPI implementation under MPI_THREAD_MULTIPLE, but not every MPI_THREAD_MULTIPLE-compliant program follows the MPI_THREAD_SPLIT model.
This model allows MPI to apply optimizations that would not be possible otherwise, such as binding specific hardware resources to concurrently communicating threads and providing lockless access to MPI objects.
Because MPI_THREAD_SPLIT is a non-standard programming model, it is disabled by default. To enable it, set the environment variable I_MPI_THREAD_SPLIT. When the model is enabled, threading runtime control must also be turned on for the programming model optimizations to take effect (see Threading Runtimes Support).
Setting the I_MPI_THREAD_SPLIT variable does not affect behavior at other threading levels, such as MPI_THREAD_SINGLE and MPI_THREAD_FUNNELED. For the extension to take effect, request the MPI_THREAD_MULTIPLE level of support when calling MPI_Init_thread().
NOTE: The thread-split model supports MPI point-to-point operations and blocking collectives.
MPI_THREAD_SPLIT Model Description
As mentioned above, an MPI_THREAD_SPLIT-compliant program must be at least a thread-compliant MPI program (supporting the MPI_THREAD_MULTIPLE threading level). In addition to that, the following rules apply:
- Different threads of a process must not use the same communicator concurrently.
- Any request created in a thread must not be accessed by other threads, that is, any non-blocking operation must be completed, checked for completion, or probed in the same thread that started it.
- Communication completion calls that imply operation progress, such as MPI_Wait() or MPI_Test(), do not guarantee progress in other threads when called from one thread.
The model implies that each process thread has a distinct logical thread number thread_id. thread_id must be set to a number in the range 0 to NT-1, where NT is the number of threads that can be run concurrently. thread_id can be set implicitly, or your application can assign it to a thread. Depending on the assignment method, there are two usage submodels:
- Implicit model: both you and the MPI implementation know the logical thread number in advance via a deterministic thread number query routine of the threading runtime. The implicit model is only supported for OpenMP* runtimes via omp_get_thread_num().
- Explicit model: you pass thread_id to MPI as an integer converted to a string, by setting an MPI Info object (referred to as the info key in this document) on a communicator. The key thread_id must be used. This model fits task-based parallelism, where a task can be scheduled on any process thread.
The I_MPI_THREAD_ID_KEY variable sets the MPI info object key that is used to explicitly define the thread_id for a thread (thread_id by default).
Within the model, only threads with the same thread_id can communicate. For example, the following communication pattern complies with the MPI_THREAD_SPLIT model: suppose Comm A and Comm B are two distinct communicators that aggregate the same ranks. The system of these two communicators fits the MPI_THREAD_SPLIT model only if all threads with thread_id #0 use Comm A, while all threads with thread_id #1 use Comm B.