Extending R with Intel MKL

Todd Rosenquist

Introduction

The Using Intel oneMKL with R article discusses building R with the Intel® oneAPI Math Kernel Library (oneMKL) BLAS and LAPACK to improve the performance of those parts of R that rely on matrix computations. Since the BLAS and LAPACK routines are a long-standing set of functions with a standard API, the R build scripts have a built-in facility to use these. Setting R to use oneMKL can provide significant performance improvement on parts of R, but there are other functions in oneMKL that can provide great performance as well. This article will give you a quick tutorial on how to extend R by writing a wrapper for a function in oneMKL and call it from your R script.

The wrapper

Let's begin by selecting a simple function in oneMKL that is not a part of the BLAS or LAPACK. For this exercise we'll choose the vdPow function which takes two arrays, x and y, and creates a new array z whose nth element is x[n] to the y[n] power.

#include "R.h" 
#include "Rdefines.h" 
#include "Rinternals.h" 
#include "mkl.h" 
SEXP mkl_vdpow(SEXP N, SEXP X, SEXP Y) {   
  SEXP Z;   
  int *n;   
  int nn;   
  double *x, *y, *z;     
/* Convert input arguments to C types */   
  PROTECT(N = AS_INTEGER(N));   
  PROTECT(X = AS_NUMERIC(X));   
  PROTECT(Y = AS_NUMERIC(Y));     
  n = INTEGER_POINTER(N);   
  x = NUMERIC_POINTER(X);   
  y = NUMERIC_POINTER(Y);     
/* Allocate memory to store results */   
  nn = *n;   
  PROTECT(Z = NEW_NUMERIC(nn));   
  z = NUMERIC_POINTER(Z);     
  vdPow(nn, x, y, z);     
  UNPROTECT(4);     
  return Z; 
}

Building the wrapper

On Windows this is easily done by obtaining the Windows package and Rtools from CRAN. I downloaded R 4.4.1 and Rtools version 4.4 and then created a file to set my environment as follows:

	SET "MINGW_ROOT=C:\rtools44\x86_64-w64-mingw32.static.posix"

	SET "PATH=%MINGW_ROOT%\bin;%MINGW_ROOT%\libexec\gcc\x86_64-w64-mingw32.static.posix\13.2.0;%PATH%"

    "C:\Program Files (x86)\Intel\oneAPI\compiler\2024.2\env\vars.bat"

    "C:\Program Files (x86)\Intel\oneAPI\mkl\2024.2\env\vars.bat"

In the above instructions, the compiler environment variables are sourced for the OpenMP* runtime.

I then build using the following commands:

    SET "R_HOME=C:\Program Files\R\R-4.4.1"

    SET "PATH=%R_HOME%\bin\x64;%PATH%"

	gcc -c -O2 -Wall -I"%R_HOME%\include" -I"%MKLROOT%\include" pow_wrp.c -o pow_wrp.o

	gcc -shared -Wl,-soname,pow_wrp.so -o pow_wrp.so pow_wrp.o "%R_HOME%\bin\x64\R.dll" "%MKLROOT%\lib\mkl_rt.lib"

Calling the function from R

Here then is a simple script that calls the function we've just built

dyn.load("c:/full_path_or_directory_on_path/pow_wrp.so")

mkl_pow <- function(n, x, y) .External("mkl_vdpow", n, x, y)

n <- 1000

x <- runif(n, min=2, max=10)

y <- runif(n, min=-2, max=-1)

z <- mkl_pow(n, x, y)

Performance

Let's see what sort of performance difference the use of Intel® oneMKL can make by comparing this new power function with the simple means available within R. To do the similar operation in R we could use a loop to perform each scalar operation. The following is the script running with Rscript on my laptop (Intel® oneMKL 2024.2.0, R-4.4.1, 4-core Intel® Core™ i7-1187G7 processor (3.00GHz) 16GB RAM, Windows* 11):

dyn.load("c:/test/pow_R/pow_wrp.so")

mkl_pow <- function(n, x, y) .Call("mkl_vdpow", n, x, y)

n <- 1000000
x <- runif(n, min=2, max=10)
y <- runif(n, min=-2, max=-1)

start <- proc.time()
z <- mkl_pow(n, x, y)
end1 <- proc.time() - start

end1

i <- n

start <- proc.time()
repeat{ 
    z[i] <- x[i]^y[i] 
    i <- i - 1 
    if (i==0) break() }
end2 <- proc.time() - start

end2

The output of the above script in the Command Prompt is as following,

C:\test\pow_R>Rscript test.R
   user  system elapsed
   0.00    0.00    0.01
   user  system elapsed
   0.42    0.00    0.22

For this very large problem size the oneMKL function makes a large difference. The loop approach takes 0.22 seconds while the oneMKL function takes 0.01 seconds. For smaller problems the performance will be less, but for systems with higher core-counts, compute intensive functions can show even greater speed-up.

Given the potential for large performance improvements, it may be worthwhile finding out the functions available in oneMKL.

Select Your Language

Using Intel.com Search

Quick Links

Recent Searches

Advanced Search

Only search in

Extending R* with Intel® oneAPI Math Kernel Library (oneMKL)

Product and Performance Information