Introduction
The Intel® oneAPI Math Kernel Library (oneMKL) is a set of optimized math functions that speed up compute-intensive applications in science, engineering, and finance. oneMKL can be used to optimize code for current and future generations of Intel® CPUs, GPUs, and other accelerators. This article shows how to offload computation to Intel GPUs using OpenMP* offload in C and Fortran.
Requirements
See the oneMKL system requirements page for the hardware and software needed to run oneMKL. To get oneMKL, download and install the Intel® oneAPI Base Toolkit. To use OpenMP* offload, the Intel® oneAPI HPC Toolkit is also needed.
Examples
This section shows how to use the oneMKL matrix multiply function with OpenMP* offload in C and Fortran.
C Implementation
Explanation:
These include files are needed for oneMKL routines and OpenMP* Offload.
This is the device ID for the GPU.
Create and initialize parameters for the oneMKL cblas_dgemm routine.
Allocate memory for matrices a, b and c.
Initialize those matrices.
Map data to the device indicated by device(dnum). Note that “to” means the data (matrices a and b) is copied to the device, which only reads it, while “tofrom” (matrix c) means the device can both read from and write to the data. The device writes the result to matrix c, which is then copied back to the host.
The pragma “omp target variant…” tells the compiler to run the code that follows it on the device. In this case, the device executes the oneMKL routine cblas_dgemm. If the device is not found, execution falls back to the host.
Release the memory used by the oneMKL routine.
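The steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the article's listing: a plain loop nest stands in for the oneMKL cblas_dgemm call so the sketch builds without oneMKL, malloc/free stand in for the library's allocation routines, and the names offload_gemm and run_example are hypothetical. It uses only standard OpenMP* directives and computes C = alpha*A*B + beta*C for row-major matrices, where A is m x k, B is k x n, and C is m x n.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the oneMKL GEMM call (assumption, not the
 * article's code): C = alpha*A*B + beta*C, row-major storage.
 * dnum is the OpenMP device number. */
void offload_gemm(int m, int n, int k, double alpha, const double *a,
                  const double *b, double beta, double *c, int dnum)
{
    /* "to": a and b are copied host -> device and only read there.
     * "tofrom": c is copied to the device and copied back to the host
     * when the data region ends. */
    #pragma omp target data map(to: a[0:m*k], b[0:k*n]) \
                            map(tofrom: c[0:m*n]) device(dnum)
    {
        /* Loop nest used in place of cblas_dgemm so that the sketch
         * has no oneMKL dependency. */
        #pragma omp target teams distribute parallel for collapse(2) device(dnum)
        for (int i = 0; i < m; i++) {
            for (int j = 0; j < n; j++) {
                double sum = 0.0;
                for (int p = 0; p < k; p++)
                    sum += a[i * k + p] * b[p * n + j];
                c[i * n + j] = alpha * sum + beta * c[i * n + j];
            }
        }
    }
}

/* Walks through the listed steps: allocate, initialize, offload, release.
 * Returns c[0] after the multiply, or -1.0 if allocation fails. */
double run_example(int dnum)
{
    const int m = 2, n = 2, k = 2;
    double *a = malloc(sizeof(double) * m * k);
    double *b = malloc(sizeof(double) * k * n);
    double *c = malloc(sizeof(double) * m * n);
    if (!a || !b || !c) {
        free(a); free(b); free(c);
        return -1.0;
    }

    /* a = [1 2; 3 4], b = [5 6; 7 8], c = all ones */
    for (int i = 0; i < m * k; i++) a[i] = (double)(i + 1);
    for (int i = 0; i < k * n; i++) b[i] = (double)(i + 5);
    for (int i = 0; i < m * n; i++) c[i] = 1.0;

    offload_gemm(m, n, k, 1.0, a, b, 1.0, c, dnum);

    double first = c[0];
    free(a); free(b); free(c);
    return first;
}
```

When compiled without OpenMP* support the pragmas are ignored and the loops run on the host. With OpenMP* enabled but no device available, the target region also runs on the host by default, which mirrors the fallback behavior described above.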
Fortran Implementation
Explanation:
The Fortran implementation is very similar to the C implementation.
This portion of the code maps the data to the device and executes the oneMKL routine on the device:
Building and Linking
For C
The Intel® oneAPI DPC++/C++ Compiler (icx) is needed for OpenMP* offload. The following shows how to build and dynamically link against oneMKL in sequential mode.
For Fortran
The Intel® Fortran Compiler (Beta) (ifx) is needed for OpenMP* offload. The following shows how to build and dynamically link against oneMKL in parallel (OpenMP*) mode.
You can also use the Link Line Advisor to determine the compiler and linker options for your code.
Conclusion
OpenMP* offload allows applications written in C and Fortran to run on accelerators, not only on Intel® CPUs. This matters because many legacy applications are implemented in C and Fortran, and rewriting a large application from scratch in a different language just to run on GPUs is usually impractical.
Notices & Disclaimers
Performance varies by use, configuration and other factors. Learn more at www.Intel.com/PerformanceIndex.
Performance results are based on testing as of dates shown in configurations and may not reflect all publicly available updates. See backup for configuration details. No product or component can be absolutely secure.
Your costs and results may vary.
Intel technologies may require enabled hardware, software or service activation.
© Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Other names and brands may be claimed as the property of others.