This is an essential guide to using the Coarray Fortran (CAF) feature of the Intel® Fortran Compiler.
The shared memory, single-node version of Coarray Fortran is available in any edition of Intel® Parallel Studio XE 2015 or newer, including the Intel® Fortran Compiler Classic in the oneAPI HPC Toolkit. The distributed memory implementation of CAF is available on Linux only with Intel® Parallel Studio XE 2015 Cluster Edition for Linux or newer, or with the oneAPI HPC Toolkit for Linux.
CAF is also implemented for Windows using Intel Parallel Studio XE Cluster Edition for Windows or the oneAPI HPC Toolkit for Windows.
The CAF feature is currently not available under macOS.
Configuration Set Up
To run a distributed memory Coarray Fortran application, you must have an established Linux cluster and an installation of Intel's implementation of MPI (Message Passing Interface). Intel's Coarray Fortran relies on Intel MPI for cluster communication.
New to Intel MPI? Start with the Intel MPI Library getting-started documentation, particularly the Prerequisites section. When you have run the setup scripts, especially sshconnectivity.exp, make sure you can run a simple MPI 'hello world' program on your cluster across multiple nodes. Attached to this article is a small Fortran MPI version of a 'hello world' program.
Successful configuration and running of MPI jobs under Intel MPI is a prerequisite to using the Intel CAF feature in distributed memory mode.
When the 'hello world' program has run successfully, here are additional steps for CAF.
1) Set up a machine file
If your cluster hosts are fixed and you do not run under a batch system like PBS or Slurm, set up a hosts file. In your home directory, create a file with the hostnames of your cluster, one host per line and, optionally, with the number of processors on each host, or the number of CAF images to place on each node. Something like this:
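(The hostnames below are illustrative; substitute the names of your own nodes.)

```
node01:4
node02:4
node03:4
node04:4
```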
The syntax is <hostname>[:<number of CAF images>]. This example states to run 4 MPI/CAF processes on each node.
You may use any name for the file; by convention, "machinefile" or "hostfile" are probably easiest to remember and maintain. If you are running under a batch system where the hosts are assigned dynamically, see the Intel MPI Library Developer Guide for details on host selection.
2) Source the setup scripts
Source the Intel MPI and Intel Fortran compiler scripts to set up the paths to Intel MPI and Intel Fortran in your environment. These scripts must also be sourced by any child processes. It is recommended to run the following source commands and/or add them to the .bashrc or .cshrc file in your home directory:
source <path to Intel MPI installation>/[ia32 | intel64]/bin/mpivars.sh
source <path to Intel Fortran installation>/bin/compilervars.sh [ia32 | intel64]
where you choose between 32 and 64 bit environments with ia32 or intel64, respectively.
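For example, the lines added to ~/.bashrc might look like this (the installation paths below are illustrative; adjust them to match your own installation):

```
source /opt/intel/impi/latest/intel64/bin/mpivars.sh
source /opt/intel/compilers/latest/bin/compilervars.sh intel64
```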
3) Set up a Coarray Fortran (CAF) configuration file
When you run a distributed memory Coarray Fortran program, the application first invokes a job launcher. This launcher uses the Intel MPI mpiexec.hydra command to start the job on the hosts in the cluster, and it first reads the CAF configuration file to pick up arguments to pass to mpiexec.hydra. Thus, the CAF configuration file is nothing more than a list of arguments for the Intel MPI mpiexec.hydra command. An example CAF configuration file may contain:
-genvall -genv I_MPI_FABRICS=shm:tcp -machinefile ./hostsfile -n 8 ./my_caf_prog
In this example,
- -genvall tells the launcher that the child CAF images will inherit environment variable settings from the current shell
- -genv I_MPI_FABRICS=shm:tcp selects the fabric(s) to use in priority order: use shared memory within a node, TCP for remote. See the Intel MPI Library Developer Reference for setting fabrics. Available network fabrics can vary with the Intel MPI version.
- -machinefile ./hostsfile says to find the list of cluster nodes on which to run. For batch based systems see the Intel MPI Library Developer Guide.
- -n 8 launches 8 CAF images
- ./my_caf_prog is the name of the Coarray Fortran program
There are many commands and configurations possible in the configuration file. See the documentation on mpiexec.hydra for a complete list of possible control options. Some other useful options to consider:
- -rr round-robin distribution of images to nodes. DEFAULT. Round-robin is one way to avoid using hyperthreaded cores. With the -rr option to mpiexec.hydra, image 1 is assigned to host1 from your machine file or PBS_NODEFILE, image 2 to host2, and so on up to image N on hostN, at which point the allocation cycles back: image N+1 goes to host1, and so on.
- -perhost N distribute images to hosts in groups of N. OPTIONAL. This is another way to avoid hyperthreaded cores: set N to the number of real cores on each host. Images 1..N are allocated on host1, images N+1..2N on host2, etc.
Building the Application
You are now ready to compile your Coarray Fortran application. Create or use an existing Coarray Fortran application. A sample Coarray Fortran 'hello world' application is included in the <compiler install dir>/Samples/en_US/Fortran/coarray_samples/ directory. The essential compiler arguments to use for distributed memory coarray applications are:
ifort -coarray=distributed -coarray-config-file=<CAF config filename> ./my_caf_prog.f90
- -coarray=distributed is necessary to create a distributed memory CAF application. This option is only available on systems with a valid Intel Cluster Edition or Cluster Studio license. Without this license you cannot create distributed memory Coarray Fortran applications. You can, however, create and use shared memory CAF applications with any existing Intel Composer XE for Linux or Windows license.
- -coarray-config-file=<CAF configuration file> tells the CAF job launcher where to find the configuration file with runtime arguments for mpiexec.hydra. This file need not exist at the time of compilation; it is ONLY read at job launch. Thus, it can be changed between job runs to change the number of images, along with any other valid control option to mpiexec.hydra. This gives the programmer a way to change the number of images and other parameters without having to recompile the application. A reasonable name for the file may be ~/cafconfig.txt, but the name and location of the file are up to the user to decide.
- The executable name is hard-coded in the CAF config file, so be sure that the executable name in the config file matches the name you used with the 'ifort -o <name>' option. Also, be sure to use either the full pathname to the executable OR the current directory "dot" name, such as './a.out' or './mycafprogram.exe'.
- -coarray-num-images=N compiler option is ignored for -coarray=distributed. This option is only used by shared memory Coarray Fortran applications. The number of images for distributed memory CAF applications is ONLY controlled at job launch by the '-n N' option in the CAF config file.
Of course, you can include any other compiler options including all optimization options.
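For reference, a minimal coarray 'hello world' along the lines of the bundled sample might look like the following. This is only a sketch; the sample shipped with the compiler may differ. It uses the standard Fortran 2008 intrinsics this_image() and num_images():

```fortran
program caf_hello
  implicit none
  ! Each image runs this program independently and reports its own
  ! image index along with the total number of images launched.
  write(*,'(a,i0,a,i0)') 'Hello from image ', this_image(), &
                         ' of ', num_images()
end program caf_hello
```

Compiled with -coarray=distributed, every image started by mpiexec.hydra executes this program, so you should see one line of output per image.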
Running the Application
After compiling the program, simply execute it, for example ./my_caf_prog.
That's it! The CAF executable locates your CAF config file, which passes arguments to the mpiexec.hydra command to start your distributed CAF program. Host information is pulled from the machine file named in the config file.
Need to change the number of images launched or the arguments to mpiexec.hydra? Simply change the settings in the CAF config file. Remember, the -coarray-config-file= option used at compile time sets the name and location for this file. You should use a name and location you can remember, such as -coarray-config-file=~/cafconfig.txt
Then just add mpiexec.hydra options to ~/cafconfig.txt, for example,
-perhost 2 -envall -n 64 ./a.out
Note: The environment variable FORT_COARRAY_NUM_IMAGES has no effect on distributed memory CAF applications. This environment variable is only honored by a shared memory CAF image. Only the -n option in the CAF config file is used to control the number of CAF images for a distributed memory CAF application.
Again, read the mpiexec.hydra and I_MPI_FABRICS documentation in the Intel MPI Library Developer Reference.
Known Issues or Limitations
Many clusters have multiple MPI implementations installed alongside Intel MPI. The PATH and LD_LIBRARY_PATH environment variables must have Intel MPI paths BEFORE any other MPI installed on your system. Make sure to ONLY source the mpivars.sh script to set this correctly, OR ensure that the correct Intel MPI paths appear before other MPI paths.
Batch system notes: In the notes above, we added the option '-envall' to the CAF config file. This is an attempt to get your current working environment variables inherited by your spawned remote CAF processes. This was done to help ensure that your PATH and LD_LIBRARY_PATH contain the paths to Intel MPI and Intel Fortran AND that those paths appear before other MPI installations and compilers on your system. HOWEVER, some batch scheduling systems will not allow environment inheritance. In other words, they will discard your current environment variables and use defaults instead. That is why we suggested adding
source <path to intel MPI>/[ia32 | intel64]/bin/mpivars.sh
to your .bashrc, .cshrc, or .bash_profile. These dot files are invoked by each child process created and hence SHOULD set the PATH and LD_LIBRARY_PATH appropriately. When in doubt, execute 'which mpiexec.hydra' interactively, or put 'echo `which mpiexec.hydra`' in your batch script to ensure the Intel MPI mpiexec.hydra is being used. The 'mpiexec' commands from other MPI implementations cannot be used and will cause errors.
It is critical to ensure that you can execute an Intel MPI application BEFORE attempting to run an Intel CAF program.
READ the Intel MPI Release Notes and the Getting_Started.pdf document that come with Intel MPI in the <installdir>/doc/ directory.
Our User Forums are great places to see current issues and to post questions:
Intel Fortran User Forum
oneAPI HPC Toolkit Forum (MPI)
Product and Performance Information
Performance varies by use, configuration and other factors. Learn more at www.Intel.com/PerformanceIndex.