The R statistical computing package is used in many areas, like IT, finance, e-commerce, healthcare, manufacturing, and so on. This article will show how to boost the performance of R using the Intel® oneAPI Math Kernel Library (oneMKL), but first, let’s explore what R is all about.
R is an open-source programming environment for statistics and data analytics. It is also a domain-specific programming language for statistics (see www.r-project.org for more information). Many R users don’t realize that they can significantly boost the performance of their computations by linking to a high-performance math library like oneMKL.
oneMKL contains highly optimized, threaded, and vectorized functions for common mathematical operations that are used for scientific, engineering, and financial applications (Figure 1). It covers dense and sparse linear algebra (BLAS, LAPACK, PARDISO), fast Fourier transforms, vector math, summary statistics, splines, and much more. It runs optimized code for a given processor automatically without the need to branch code. It is also optimized for single-core vectorization and cache utilization. Finally, it automatically uses parallelism for multi-core CPUs and GPUs, and scales some computations from single systems to clusters.
Figure 1 Mathematical domains covered by oneMKL
oneMKL is part of the Intel® oneAPI Base Toolkit. The Intel® oneAPI HPC Toolkit is also required to link R to oneMKL. Both toolkits can be downloaded free of charge.
Linking R to oneMKL confers significant performance advantages, as we’ll see below, without requiring developers to change their R code. oneMKL is one layer underneath the R application. It works with the R engine to use appropriate oneMKL functions to improve performance. The oneMKL functions will automatically take advantage of hardware features in Intel® processors like Intel® Advanced Vector Extensions 512 (Intel® AVX-512), Intel Advanced Vector Extensions 2 (Intel AVX2), and Intel Advanced Vector Extensions (Intel AVX). oneMKL is designed to allow developers to focus on their applications without worrying about the underlying hardware. For example, a oneMKL application created on a system that supports Intel AVX will also take advantage of Intel AVX-512 if moved to a system that supports the later extensions.
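To see which of these instruction-set extensions a given machine advertises, one option is to inspect the CPU flags. The following is a minimal sketch, assuming a Linux system that exposes /proc/cpuinfo. The function name list_avx_flags is hypothetical, and the flag names (avx, avx2, avx512f) are those reported by the Linux kernel rather than anything defined by oneMKL.

```shell
# Hypothetical helper: print the AVX-family flags this CPU advertises.
# Assumes Linux; on other systems /proc/cpuinfo may not exist.
list_avx_flags() {
    if [ -r /proc/cpuinfo ]; then
        # -o prints only the matched words; -w avoids matching "avx" inside "avx2".
        # sort -u collapses the per-core repeats into one line per flag.
        grep -o -w -E 'avx|avx2|avx512f' /proc/cpuinfo | sort -u || true
    else
        echo "cannot read /proc/cpuinfo" >&2
    fi
    return 0
}

list_avx_flags
```

An empty result only means that none of the three flags were found; it says nothing about whether oneMKL is installed or in use.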
Linking R to oneMKL is straightforward (see Quick Linking Intel® MKL BLAS, LAPACK to R). (Note that the following instructions are for Linux.) Linking R to oneMKL will redirect appropriate R functions to optimized oneMKL functions; however, it is important to set certain oneMKL environment variables to work with R:
$ export MKL_INTERFACE_LAYER=GNU,LP64
$ export MKL_THREADING_LAYER=GNU
These environment variables select the GNU-compatible oneMKL interface layer with LP64 (32-bit integer) indexing and the GNU OpenMP threading layer. To verify that R is linked to oneMKL, run sessionInfo() from the R command prompt (Figures 2 and 3).
Figure 2 Output from sessionInfo() showing R without oneMKL
Figure 3 Output from sessionInfo() showing R using oneMKL
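The setup and verification steps can also be run together from the shell. This is a minimal sketch, assuming that Rscript is on the PATH and that the installed R version prints BLAS and LAPACK library paths in sessionInfo() (recent versions do). The grep pattern is an illustrative choice, not part of the original instructions.

```shell
# Select the GNU interface layer (LP64 integers) and GNU OpenMP threading layer.
export MKL_INTERFACE_LAYER=GNU,LP64
export MKL_THREADING_LAYER=GNU

# Print the BLAS/LAPACK lines from sessionInfo(), if R is installed.
if command -v Rscript >/dev/null 2>&1; then
    Rscript -e 'print(sessionInfo())' | grep -E 'BLAS|LAPACK' \
        || echo "sessionInfo() printed no BLAS/LAPACK lines"
else
    echo "Rscript not found; skipping the check"
fi
```

If R is linked to oneMKL, the printed library paths should point at an MKL library instead of the reference BLAS and LAPACK libraries that ship with R.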
In general, R users don’t have to do anything more than simply link to oneMKL; however, there are a few things they can do to help:
- Although R is single-threaded, oneMKL can run in either single- or multi-threaded mode. oneMKL uses multiple threads by default, and with the default setting MKL_DYNAMIC=TRUE it may use fewer threads than the maximum when it judges that to be faster. This is usually best, provided there’s enough work to justify thread creation. In other words, oneMKL works best for large datasets.
- It is important not to oversubscribe the system (i.e., using more threads than available processors). The common practice is to set the number of threads equal to the number of cores in the system.
- Matching the thread count to the core count doesn’t always give the best performance, however, if there isn’t enough work to saturate the available resources. In this case, manually setting the number of threads can give better performance (e.g., setting the environment variables MKL_DYNAMIC=FALSE and MKL_NUM_THREADS=4).
- Disable hyperthreading in the BIOS.
- Set the environment variable MKL_VERBOSE=1 to see which oneMKL functions are being called and how many threads are being used.
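The threading settings in the list above can be sketched as a short shell setup. This is an illustration under stated assumptions: the helper pick_mkl_threads is hypothetical, the request for 4 threads simply mirrors the example in the list, and nproc reports logical processors, which can exceed the physical core count when hyperthreading is enabled.

```shell
# Hypothetical helper: cap a requested thread count at the number of
# processors reported by nproc, to avoid oversubscribing the system.
pick_mkl_threads() {
    requested="$1"
    available="$(nproc)"
    if [ "$requested" -gt "$available" ]; then
        echo "$available"
    else
        echo "$requested"
    fi
}

# Fix the thread count instead of letting oneMKL adjust it dynamically.
export MKL_DYNAMIC=FALSE
export MKL_NUM_THREADS="$(pick_mkl_threads 4)"

# Log each oneMKL call, including the number of threads it used.
export MKL_VERBOSE=1
```

Any R session started from this shell inherits the settings. Unsetting MKL_VERBOSE afterward keeps routine runs from being flooded with log lines.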
The R Benchmarks (v2.5) are used to measure the performance improvement from linking R to oneMKL. The benchmark takes 27.9 seconds when R is not linked to oneMKL and 2.8 seconds when R is linked to oneMKL.
This is a 9.8x speedup just by linking R to oneMKL. No R code modifications were needed.
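For a quick sanity check on another machine, a single dense matrix product can be timed from the shell. This is only a rough sketch and not a substitute for the R Benchmarks: the helper name time_r_matmul and the default 1000 x 1000 matrix size are hypothetical choices for illustration, and it assumes Rscript is on the PATH.

```shell
# Hypothetical helper: time one dense matrix product in R and print the
# elapsed wall-clock seconds. The default size is an arbitrary choice.
time_r_matmul() {
    n="${1:-1000}"
    if command -v Rscript >/dev/null 2>&1; then
        Rscript -e "set.seed(1); a <- matrix(rnorm($n * $n), $n, $n); cat(system.time(a %*% a)[['elapsed']], '\n')"
    else
        echo "Rscript not found; skipping" >&2
    fi
}

time_r_matmul
```

Running the same command against an R installation without oneMKL and one linked to it gives a rough feel for the difference. Dense linear algebra is the kind of work that benefits most, so the ratio will not necessarily match the full benchmark result.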