This article describes a method to compile and run a distributed memory coarray program using Intel® Parallel Studio XE Cluster Edition. An example using Linux* is presented.
For shared memory applications: Intel® Fortran Compiler XE 15.0 or newer, including Intel® oneAPI HPC Toolkit (all releases).
For distributed memory applications: Intel® Parallel Studio XE 2015 Cluster Edition for Linux* or Windows*, or newer, including Intel® oneAPI HPC Toolkit (all releases).
To compile for distributed memory coarrays, use compiler option -coarray=distributed (Linux* OS) or /Qcoarray:distributed (Windows* OS).
To compile for shared memory coarrays, use compiler option -coarray=shared (Linux* OS) or /Qcoarray:shared (Windows* OS).
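For example, the two modes might be selected on Linux* as follows (a sketch; the source file name hello_caf.f90 is hypothetical):

```shell
# Shared memory coarrays: all images run on a single node.
ifort -coarray=shared hello_caf.f90 -o hello_caf_shared.x

# Distributed memory coarrays: images can span cluster nodes
# (requires the Intel MPI Library).
ifort -coarray=distributed hello_caf.f90 -o hello_caf_dist.x
```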
Obtaining Example Source Code
The example used here is the coarray sample provided in the Composer XE 'coarray_samples' directory.
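A minimal sketch of what such a "hello" coarray program might look like is shown below; the actual file in 'coarray_samples' may differ. The hostnm routine is an Intel Fortran portability extension from the IFPORT module:

```fortran
program coarray_hello
  use ifport, only: hostnm   ! Intel portability module providing hostnm
  implicit none
  character(len=64) :: hostname
  integer :: istat

  istat = hostnm(hostname)   ! returns 0 on success

  ! Each image reports its index, the total image count, and its host.
  print '(a,i0,a,i0,a,a)', 'Hello from image ', this_image(), &
        ' out of ', num_images(), &
        ' total images, and running on host: ', trim(hostname)
end program coarray_hello
```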
An Intel® Parallel Studio XE Cluster Edition or Intel® oneAPI HPC Toolkit is required for compilation, and the Intel® MPI Library must be installed on the cluster nodes.
Configuration Set Up
A key to running a distributed memory coarray program with process pinning on specific nodes is to build with the compiler option -coarray-config-file=filename (Linux*) or /Qcoarray-config-file:filename (Windows*). This lets you take full advantage of Intel® MPI Library features in the coarray environment, in the same way that 'mpiexec -configfile filename' tells mpiexec to take its command lines from 'filename'.
The contents of the configuration file for this example:
-host host1 -env I_MPI_PIN_PROCESSOR_LIST 0,2,4 -n 3 <path to executable>/coarry_dist_host.x : -host host2 -env I_MPI_PIN_PROCESSOR_LIST 1,3,5 -n 3 <path to executable>/coarry_dist_host.x
This says to execute six coarray images of executable 'coarry_dist_host.x' on nodes host1 and host2, using processors 0,2,4 on host1, and processors 1,3,5 on host2. The I_MPI_PIN_PROCESSOR_LIST environment variable is used to achieve the process pinning on the indicated nodes.
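For reference, the configuration file follows the same syntax that mpiexec accepts in a config file, so the equivalent launch could also be performed manually (a sketch; exact option names depend on the Intel MPI Library version installed):

```
mpiexec -configfile coarray_config.txt
```

With the -coarray-config-file compiler option, however, this step is embedded in the executable and no explicit mpiexec invocation is needed.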
Source Code Changes
See Verifying Correctness below.
Building the Application
Compile for distributed coarrays and specify the coarray configuration file. Note that when a configuration file is given, the image count it specifies takes precedence at run time over -coarray-num-images:
ifort -coarray=distributed -coarray-num-images=1 -coarray-config-file=coarray_config.txt coarry_dist_host.f90 -o coarry_dist_host.x
Running the Application
Simply specify the name of the executable:
> <path to executable>/coarry_dist_host.x
Hello from image 1 out of 6 total images, and running on host: host1
Hello from image 2 out of 6 total images, and running on host: host1
Hello from image 3 out of 6 total images, and running on host: host1
Hello from image 5 out of 6 total images, and running on host: host2
Hello from image 4 out of 6 total images, and running on host: host2
Hello from image 6 out of 6 total images, and running on host: host2
>
Verifying Correctness
Embed call hostnm(hostname) in your coarray program, then print 'hostname' to verify that the images execute on the intended nodes/processors.
This method enables pinning coarray images to specific nodes and node processors. It can yield better load balance across cluster nodes, or make it easy to partition off a subset of nodes.
Known Issues or Limitations
Distributed memory coarrays work only with the Intel® MPI Library; other MPI implementations are not supported.
Product and Performance Information
Performance varies by use, configuration and other factors. Learn more at www.Intel.com/PerformanceIndex.