The Advanced SkipAhead Method for Parallel RNG

Published: 12/11/2019  

Last Updated: 12/11/2019

By Gennady Fedorov


This article discussed the advanced SkipAhead method - a new export functionality introduced in the Intel® Math Kernel Library (Intel® MKL), version 2020 [1] for the Intel® MKL Random Number Generators.

This functionality allows you to skip a given number of elements (greater than 2^63 which is the current limitation of the standard SkipAhead method) in random streams. This feature is particularly useful in distributing random numbers from the original random stream across different computational nodes.


One of the basic requirements for random number streams is their mutual independence and lack of intercorrelation. Independent streams can be obtained by various methods, one of which is the Block-splitting method, also known as SkipAhead. In this method, the original sequence is split into k non-overlapping blocks, where k is the number of independent streams. Each independent stream generates its own subsequence of size “nskip” (number of skipped elements in a stream).

But what if you do not know how many numbers each independent stream will generate? In this case, you can set a number of skipped elements “nskip” to ensure that the number of generated elements by each stream maintains the requirement for independence and lack of intercorrelation. Many RNG libraries provide a "substream based" type of parallel random number generator in which the initial stream is divided into several substreams that are separated from each other by a large number of random numbers( for example 2^127).

For the basic SkipAhead method, the number of skipped elements is limited to 2^63. In some cases, this insufficient. For example, an initial sequence is split into subsequences of 2^76 numbers based on spectral tests to achieve the best quality of generated subsequences [2]. The new advanced SkipAhead method extends the number of skipped elements using the following representation:

Number of skipped element=nskip0+nskip1*  2^64 + nskip2^128+…+nskipn-1 2^64*n-1;

This method also provides a significant performance gain in comparison with multiple calls of standard SkipAhead method [2].

Usage Model Example

The following example of the advanced SkipAhead method skips 2^128 elements. To skip  2^128 elements ‘nskip’ should be represented as follows:

Number of skipped elements= 0*2^+ 0*2^64+1*2^128;

C-code for the advance SkipAhead method with 2^128 skipped elements:

VSLStreamStatePtr stream;
MKL_UINT64 nskip[3];

vslNewStream( &stream, brng, seed); 
vslSkipAheadStreamEx( stream, 3, nskip); 

Performance Comparison with RngStreams

The performance comparison is based on the calculation of Pi using the Monte Carlo Method. Results are introduced for a different number of streams. Note: each stream generates random numbers on its own CPU core.

For Intel® MKL RNG to create independent streams, the advanced SkipAhead method is used with the number of skipped elements  = 2^127. In the RngStreams library, streams are separated from each other by 2^127 random numbers.

Performance results are presented below:


Used Hardware: Intel® Xeon Platinum 8180 2x28@ 2.5 GHz, RHEL 7.2.

Used Software: Intel® MKL 2020; RngStreams library - is an object-oriented random-number package with many long streams and substreams, based on the  MRG32k3a RNG. [3]. 

RNG assumptions: Number of generated random numbers 2*10^9​. Basic random number generator: MRG32k3a; Random number distribution: Uniform double precision.



[2] Pierre L'Ecuyer , Richard Simard , E. Jack Chen , W. David Kelton, An Object-Oriented Random-Number Package with Many Long Streams and Substreams, Operations Research, v.50 n.6, p.1073-1075, November 2002  [doi>10.1287/opre.50.6.1073.358]



Dyakov, Pavel

Fedorov, Gennady

Product and Performance Information


Performance varies by use, configuration and other factors. Learn more at