Running the Intel® Distribution for LINPACK* Benchmark
To run the Intel® Distribution for LINPACK Benchmark on multiple nodes, or on one node with multiple MPI processes, you need to use MPI and either modify HPL.dat or use the Ease-of-use Command-line Parameters. The following example describes how to run the dynamically linked prebuilt Intel® Distribution for LINPACK Benchmark binary using the provided script, runme_intel64_dynamic. To run other binaries, adjust the steps accordingly; specifically, change line 58 of runme_intel64_dynamic to point to the appropriate binary.
Load the necessary environment variables for the Intel MPI Library and Intel® compiler:
In HPL.dat, set the problem size N to 10000. Because this setting is for a test run, the problem size should be small.
For better performance, enable non-uniform memory access (NUMA) on your system and configure the run to use one MPI process per NUMA socket, as explained below.
Refer to your BIOS settings to enable NUMA on your system.
Set the following variables at the top of the runme_intel64_dynamic script according to your cluster configuration:
- The total number of MPI processes.
- The number of MPI processes per cluster node.
In the HPL.dat file, set the parameters Ps and Qs so that Ps * Qs equals the number of MPI processes. For example, for 2 processes, set Ps to 1 and Qs to 2. Alternatively, leave the HPL.dat file as is and launch with -p and -q command-line parameters.
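As a rough illustration of the Ps * Qs constraint, the Python sketch below derives a process count from a made-up cluster layout (one MPI process per NUMA socket) and picks a matching grid. The helper `choose_grid` and the `nodes` and `sockets_per_node` values are hypothetical and not part of the benchmark package. Preferring a near-square grid with Ps no larger than Qs is a common HPL rule of thumb and is an assumption here, not a requirement of this procedure.

```python
import math


def choose_grid(num_procs: int) -> tuple[int, int]:
    """Return (Ps, Qs) with Ps * Qs == num_procs and Ps <= Qs.

    Picks the factor pair closest to square (an assumed heuristic).
    """
    if num_procs < 1:
        raise ValueError("num_procs must be a positive integer")
    ps = math.isqrt(num_procs)
    # Walk down from the integer square root to the nearest divisor.
    while num_procs % ps != 0:
        ps -= 1
    return ps, num_procs // ps


# Hypothetical cluster: 2 nodes, each with 2 NUMA sockets.
nodes = 2
sockets_per_node = 2
procs_per_node = sockets_per_node      # one MPI process per NUMA socket
total_procs = nodes * procs_per_node   # total number of MPI processes

ps, qs = choose_grid(total_procs)
print(f"total={total_procs}, per node={procs_per_node}, Ps={ps}, Qs={qs}")
```

For 2 processes the helper returns Ps = 1 and Qs = 2, which matches the example above. The resulting values can go into HPL.dat or be passed with the -p and -q command-line parameters.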
Execute the runme_intel64_dynamic script:
Rerun the test, increasing the problem size until the matrix uses about 80% of the available memory. To do this, either modify Ns in line 6 of HPL.dat or use the -n command-line parameter. Approximate values by memory size:
For 16 GB: 40000 Ns
For 32 GB: 56000 Ns
For 64 GB: 83000 Ns
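For memory sizes not listed, the 80% guideline can be turned into a rough estimate. The sketch below assumes the N x N matrix is stored in double precision (8 bytes per element), treats 1 GB as 2^30 bytes, and rounds down to a multiple of 1000. The function `estimate_n` and its rounding choice are hypothetical and only illustrate the arithmetic. Its results land near, but not exactly on, the values listed above.

```python
import math


def estimate_n(mem_gib: float, fraction: float = 0.8, step: int = 1000) -> int:
    """Estimate a problem size N so that an N x N matrix of doubles
    occupies about `fraction` of `mem_gib` GiB of memory.

    Assumptions: 8 bytes per element, 1 GiB = 2**30 bytes,
    result rounded down to a multiple of `step`.
    """
    total_bytes = mem_gib * 1024**3
    max_elements = int(fraction * total_bytes / 8)
    n = math.isqrt(max_elements)
    return n // step * step


for mem in (16, 32, 64):
    print(f"{mem} GB -> N of about {estimate_n(mem)}")
```

Treat the estimate as a starting point and lower it if other processes or the operating system need a larger share of memory.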