Developer Guide for Intel® oneAPI Math Kernel Library Windows*

ID 766692
Date 3/22/2024

Running the Intel® Distribution for LINPACK* Benchmark and the Intel® Optimized HPL-AI* Benchmark

To run the Intel® Distribution for LINPACK Benchmark on multiple nodes, or on one node with multiple MPI processes, you need to use MPI and either modify HPL.dat or use the Ease-of-use Command-line Parameters. The following example describes how to run the dynamically linked prebuilt Intel® Distribution for LINPACK Benchmark binary using the script provided. To run other binaries, adjust the steps accordingly.

  1. Load the necessary environment variables by running the provided setup scripts.
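
    For example, with a default oneAPI installation on Windows, the environment is typically initialized with the setvars.bat script; the installation path shown below is the default and is an assumption that may differ on your system:

      "C:\Program Files (x86)\Intel\oneAPI\setvars.bat"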

  2. In HPL.dat, set the problem size N to 10000. Because this is only a test run, the problem size should be small.
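
    For reference, a minimal sketch of the corresponding lines in the standard HPL.dat layout (the exact contents of the file shipped with the benchmark may differ slightly):

      1            # of problems sizes (N)
      10000        Ns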

  3. For better performance, enable non-uniform memory access (NUMA) on your system and configure the benchmark to run one MPI process per NUMA socket, as explained below.

    • Refer to your BIOS settings to enable NUMA on your system.

    • Set the following variables at the top of the runme_intel64_dynamic.bat script according to your cluster configuration (see the example after this list):

      MPI_PROC_NUM: The total number of MPI processes.
      MPI_PER_NODE: The number of MPI processes per cluster node.
      NUMA_PER_MPI: The number of NUMA nodes per MPI process.
      USE_HPL_AI: Uncomment this variable to enable the Intel® Optimized HPL-AI* Benchmark.
      USE_HPL_GPU: Uncomment this variable to enable GPUs.
      HPL_NUMSTACK: The number of stacks on each GPU.
      HPL_NUMDEV: The number of GPUs.
    • In the HPL.dat file, set the parameters Ps and Qs so that Ps * Qs equals the number of MPI processes. For example, for two processes, set Ps to 1 and Qs to 2. Alternatively, leave the HPL.dat file as is and launch with the -p and -q command-line parameters.
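
      As a minimal sketch, the top of runme_intel64_dynamic.bat for a run with two nodes, two MPI processes per node, and one NUMA node per process might look as follows; the values are example assumptions, and the exact syntax (including how the optional variables are commented out) may differ in the script shipped with your installation:

        set MPI_PROC_NUM=4
        set MPI_PER_NODE=2
        set NUMA_PER_MPI=1
        rem set USE_HPL_AI=1
        rem set USE_HPL_GPU=1

      The matching process-grid lines in the standard HPL.dat layout for these four MPI processes (Ps * Qs = 4) would then be:

        1            # of process grids (P x Q)
        2            Ps
        2            Qs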

  4. Execute the runme_intel64_dynamic.bat script:

    runme_intel64_dynamic.bat
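
    Alternatively, if you use the ease-of-use command-line parameters instead of editing HPL.dat, a two-process test run might be launched as shown below; treat the exact set of parameters accepted by the script as an assumption and check the usage information of your installation:

      runme_intel64_dynamic.bat -n 10000 -p 1 -q 2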

  5. Rerun the test, increasing the size of the problem until the matrix size uses about 80% of the available memory. To do this, either modify Ns in line 6 of HPL.dat or use the -n command-line parameter (see the note after this list for how these sizes are derived):

    • For 16 GB: 40000 Ns

    • For 32 GB: 56000 Ns

    • For 64 GB: 83000 Ns
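
    These suggested values follow from the memory footprint of the benchmark matrix: HPL factors an N x N matrix of double-precision (8-byte) elements, so the matrix occupies roughly 8 * N^2 bytes and the problem size can be estimated as

      N ≈ sqrt( 0.8 * memory_in_bytes / 8 )

    For example, for 16 GB: sqrt( 0.8 * 16 * 10^9 / 8 ) ≈ 40000. The listed values are rounded, and the memory actually available depends on the operating system and other running processes.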

Product and Performance Information

Performance varies by use, configuration and other factors. Learn more at www.Intel.com/PerformanceIndex.

Notice revision #20201201