The most significant parameters in HPL.dat are P, Q, NB, and N. Specify them as follows:
P and Q - the number of rows and columns in the process grid, respectively.
P*Q must equal the number of MPI processes that HPL uses.
NB - the block size of the data distribution.
The table below shows recommended values of NB for different Intel® processors:
Intel® Xeon® Processor X56*/E56*/E7-*/E7*/X7* (codenamed Nehalem or Westmere)
Intel Xeon Processor E26*/E26* v2 (codenamed Sandy Bridge or Ivy Bridge)
Intel Xeon Processor E26* v3/E26* v4 (codenamed Haswell or Broadwell)
Intel® Core™ i3/i5/i7-6* Processor (codenamed Skylake Client)
Intel® Xeon Phi™ Processor 72* (codenamed Knights Landing)
Intel Xeon Processor supporting Intel® Advanced Vector Extensions 512 (Intel® AVX-512) instructions (codenamed Skylake Server)
N - the problem size:
For homogeneous runs, choose N divisible by NB*LCM(P,Q), where LCM is the least common multiple of the two numbers.
For heterogeneous runs, see Heterogeneous Support in the Intel® Distribution for LINPACK* Benchmark for how to choose N.
Increasing N usually increases performance, but N is bounded by available memory. The matrix itself (not counting internal buffers) occupies 8*N*N bytes in total, which is 8*N*N/(P*Q) bytes per MPI process, where N is the problem size and P and Q are the process grid dimensions in HPL.dat. A general rule of thumb is to choose a problem size whose matrix fills about 80% of memory.
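The sizing rules above for homogeneous runs can be combined into a short calculation. The following Python sketch is illustrative only and is not part of the Intel® Distribution for LINPACK* Benchmark: the function names, the fill_fraction parameter, and the example inputs (including NB = 192, chosen only to have a number to work with) are assumptions. It takes the aggregate memory of all nodes, applies the 80% rule of thumb to the 8*N*N bytes of matrix storage, and rounds N down to a multiple of NB*LCM(P,Q).

```python
import math


def suggest_problem_size(total_mem_bytes, p, q, nb, fill_fraction=0.80):
    """Return the largest N that is a multiple of NB*LCM(P,Q) and whose
    matrix (8*N*N bytes in total) fits in fill_fraction of total memory.

    total_mem_bytes is assumed to be the memory of all nodes combined.
    Internal HPL buffers are not counted, as in the formula above.
    """
    lcm_pq = p * q // math.gcd(p, q)       # least common multiple of P and Q
    step = nb * lcm_pq                     # N must be divisible by this
    budget = int(fill_fraction * total_mem_bytes)
    n_max = math.isqrt(budget // 8)        # largest N with 8*N*N <= budget
    return (n_max // step) * step          # round down to a multiple of step


def matrix_bytes_per_process(n, p, q):
    """Matrix storage per MPI process: 8*N*N/(P*Q) bytes."""
    return 8 * n * n / (p * q)


if __name__ == "__main__":
    # Hypothetical cluster: 4 nodes with 64 GiB each, a 4x4 process grid.
    total_mem = 4 * 64 * 2**30
    n = suggest_problem_size(total_mem, p=4, q=4, nb=192)
    print("N =", n)
    print("bytes per process =", matrix_bytes_per_process(n, 4, 4))
```

With these example inputs the function returns N = 165120. Treat the result as a starting point: if the run swaps or fails to allocate memory, lower fill_fraction to leave more room for internal buffers and the operating system.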