Intel® oneAPI Math Kernel Library
Highly optimized, fast, and complete library of math functions for Intel® CPUs and GPUs. Accelerate math processing routines, increase application performance, and reduce development time.
Intel®-Optimized Math Library for Numerical Computing on CPUs & GPUs
Optimized Library for Scientific Computing
- The fastest and most-used math library for Intel®-based systems†
- Enhanced math routines enable developers and data scientists to create performant science, engineering, or financial applications
- Core functions include BLAS, LAPACK, sparse solvers, fast Fourier transforms (FFT), random number generator functions (RNG), summary statistics, data fitting, and vector math
- Optimizes applications for current and future generations of Intel CPUs, GPUs, and other accelerators
- Is a seamless upgrade for previous users of the Intel® Math Kernel Library (Intel® MKL)
What's New
- The oneMKL SYCL library was partitioned to provide a smaller binary footprint for developers and users of oneMKL
- Increased CUDA* library function API compatibility coverage on Intel CPUs and GPUs
- Delivers High-Performance LINPACK (HPL) and HPL-AI benchmarks optimized for Intel® Xeon® CPU Max Series and Intel® Data Center GPU Max Series
- BLAS
  - Improved general performance of GEMV and several BLAS level-1 routines on Intel Data Center GPU Max Series
- DFT
  - Enabled FFTs larger than 4 GiB (up to 64 GiB of data) on Intel Data Center GPU Max Series
  - Improved FFT performance on Intel Data Center GPU Max Series
- LAPACK
  - Introduced SYCL APIs to compute nonpivoting LU factorization with C and Fortran OpenMP* offload support
  - Introduced SYCL APIs to compute the batched matrix inverse of a group of general matrices
- Vector Math
  - Integrated vector math optimizations into random number generators for high-performance computing
  - Added vector math support for the FP16 data type on Intel GPUs
  - Added OpenMP 5.1 for C offload support
Download as Part of the Toolkit
oneMKL is included in the Intel oneAPI Base Toolkit, which is a core set of tools and libraries for developing high-performance, data-centric applications across diverse architectures.
Download the Stand-Alone Version
A stand-alone download of oneMKL is available. You can download binaries from Intel or choose your preferred repository.
Develop in the Cloud
Build and optimize oneAPI multiarchitecture applications using the latest Intel-optimized oneAPI and AI tools, and test your workloads across Intel® CPUs and GPUs. No hardware installations, software downloads, or configuration necessary.
Help oneMKL Evolve
oneMKL is part of the oneAPI industry standards initiative. We welcome you to participate.
What You Need
- Get started by choosing the best interface for your application:
- oneMKL is available as part of the Intel® oneAPI Base Toolkit.
- Using oneMKL with Intel® MPI library or Intel® Fortran Compilers requires the Intel® HPC Toolkit.
Features
Linear Algebra
Speed up linear algebra computations with low-level routines that operate on vectors and matrices, and are compatible with these industry-standard BLAS and LAPACK operations:
- Level 1: Vector-vector operations
- Level 2: Matrix-vector operations
- Level 3: Matrix-matrix operations
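To make the three levels concrete, here is a minimal sketch in plain Python of what one routine per level computes. It does not call oneMKL. The function names borrow the standard BLAS routine names (AXPY, GEMV, GEMM), but the signatures and list-of-lists matrix layout are assumptions made for illustration only.

```python
# Plain-Python reference semantics for one routine per BLAS level.
# Illustration only: these helpers do not call oneMKL.

def axpy(alpha, x, y):
    """Level 1 (vector-vector): return alpha * x + y."""
    return [alpha * xi + yi for xi, yi in zip(x, y)]

def gemv(alpha, a, x, beta, y):
    """Level 2 (matrix-vector): return alpha * A @ x + beta * y."""
    return [alpha * sum(aij * xj for aij, xj in zip(row, x)) + beta * yi
            for row, yi in zip(a, y)]

def gemm(alpha, a, b, beta, c):
    """Level 3 (matrix-matrix): return alpha * A @ B + beta * C."""
    cols_b = list(zip(*b))  # columns of B, so each dot product is row-by-column
    return [[alpha * sum(aik * bkj for aik, bkj in zip(row, col)) + beta * cij
             for col, cij in zip(cols_b, crow)]
            for row, crow in zip(a, c)]

A = [[1.0, 2.0], [3.0, 4.0]]
B = [[5.0, 6.0], [7.0, 8.0]]
Z = [[0.0, 0.0], [0.0, 0.0]]

v1 = axpy(2.0, [1.0, 2.0], [10.0, 20.0])        # [12.0, 24.0]
v2 = gemv(1.0, A, [1.0, 1.0], 0.0, [0.0, 0.0])  # [3.0, 7.0]
m3 = gemm(1.0, A, B, 0.0, Z)                    # [[19.0, 22.0], [43.0, 50.0]]
```

The work per call grows from O(n) at level 1 to O(n^2) at level 2 and O(n^3) at level 3, which is why level-3 routines usually benefit most from an optimized library.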
Sparse Linear Algebra Functions
Perform various operations on sparse matrices with low-level and inspector-executor routines including the following:
- Multiply sparse matrix with dense vector
- Multiply sparse matrix with dense matrix
- Solve linear systems with triangular sparse matrices
- Solve linear systems with general sparse matrices
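As a rough illustration of two of the operations listed above, the following plain-Python sketch multiplies a sparse matrix by a dense vector and solves a lower-triangular sparse system. It does not use oneMKL. The compressed sparse row (CSR) layout and the helper names `csr_matvec` and `csr_lower_solve` are assumptions chosen for the example.

```python
# CSR storage: values and col_idx hold the nonzeros row by row;
# row_ptr[r]:row_ptr[r + 1] is the slice that belongs to row r.

def csr_matvec(values, col_idx, row_ptr, x):
    """Return y = A @ x for a CSR matrix A and a dense vector x."""
    y = []
    for r in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[r], row_ptr[r + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y

def csr_lower_solve(values, col_idx, row_ptr, b):
    """Solve L @ x = b by forward substitution.

    Assumes L is lower triangular with a nonzero diagonal entry in every row.
    """
    x = [0.0] * len(b)
    for r in range(len(b)):
        acc, diag = b[r], None
        for k in range(row_ptr[r], row_ptr[r + 1]):
            c = col_idx[k]
            if c == r:
                diag = values[k]
            else:
                acc -= values[k] * x[c]  # c < r, so x[c] is already known
        x[r] = acc / diag
    return x

# L = [[2, 0, 0],
#      [1, 3, 0],
#      [0, 4, 5]]
values = [2.0, 1.0, 3.0, 4.0, 5.0]
col_idx = [0, 0, 1, 1, 2]
row_ptr = [0, 1, 3, 5]

b = csr_matvec(values, col_idx, row_ptr, [1.0, 2.0, 3.0])  # [2.0, 7.0, 23.0]
x = csr_lower_solve(values, col_idx, row_ptr, b)           # [1.0, 2.0, 3.0]
```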
Fast Fourier Transforms (FFT)
Transform a signal from its original domain (typically time or space) into a representation in the frequency domain and back. Use FFT functions in one, two, or three dimensions with support for mixed radices. The supported functions include complex-to-complex and real-to-complex transforms of arbitrary length in single-precision and double-precision.
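The forward and backward transforms described here can be sketched with a direct discrete Fourier transform in plain Python. This is not oneMKL code and not a fast algorithm: it evaluates the definition in O(n^2) time, whereas an FFT computes the same result in O(n log n). The `dft` helper and its `inverse` flag are hypothetical names used only for this example.

```python
import cmath

def dft(signal, inverse=False):
    """Direct DFT of a sequence of any length (forward by default)."""
    n = len(signal)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t, value in enumerate(signal):
            acc += value * cmath.exp(sign * 2j * cmath.pi * k * t / n)
        # The backward transform carries the 1/n scaling in this convention.
        out.append(acc / n if inverse else acc)
    return out

signal = [1.0, 2.0, 0.0, -1.0, 3.0]      # length 5: not a power of two
spectrum = dft(signal)                   # time domain -> frequency domain
recovered = dft(spectrum, inverse=True)  # frequency domain -> time domain
```

Because the input is real, the spectrum is conjugate-symmetric (`spectrum[1]` is the complex conjugate of `spectrum[4]`), which is the redundancy that real-to-complex transforms exploit.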
Random Number Generator Functions (RNG)
Use common pseudorandom, quasi-random, and nondeterministic random number engines to generate samples from continuous and discrete distributions.
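The three kinds of engines can be illustrated with the Python standard library. This sketch does not use the oneMKL RNG interfaces. The seed value and the `van_der_corput` helper (a simple one-dimensional low-discrepancy sequence) are assumptions picked for the example.

```python
import random

def van_der_corput(index, base=2):
    """Quasi-random: the index-th point of the base-b van der Corput sequence."""
    point, denom = 0.0, 1.0
    while index > 0:
        index, digit = divmod(index, base)
        denom *= base
        point += digit / denom
    return point

pseudo = random.Random(2024)     # pseudorandom: reproducible for a fixed seed
entropy = random.SystemRandom()  # nondeterministic: drawn from the OS entropy source

gaussian = [pseudo.gauss(0.0, 1.0) for _ in range(5)]  # continuous distribution
die_rolls = [pseudo.randint(1, 6) for _ in range(5)]   # discrete distribution
quasi = [van_der_corput(i) for i in range(1, 5)]       # [0.5, 0.25, 0.75, 0.125]
os_sample = entropy.random()                           # uniform in [0, 1)
```

Quasi-random points fill the interval more evenly than pseudorandom draws, which can speed up the convergence of Monte Carlo integration.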
Data Fitting
Use spline-based interpolation to approximate functions, their derivatives, or their integrals, and to perform cell search operations.
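A minimal sketch of these ideas, assuming the simplest case of a piecewise-linear spline, is shown below in plain Python. It is not the oneMKL data fitting API. The helper names are hypothetical. Cell search here means locating the interval between breakpoints that contains a query point.

```python
from bisect import bisect_right

def cell_search(breakpoints, x):
    """Return i with breakpoints[i] <= x < breakpoints[i + 1], clamped to valid cells."""
    i = bisect_right(breakpoints, x) - 1
    return min(max(i, 0), len(breakpoints) - 2)

def linear_spline(breakpoints, values, x):
    """Return the interpolated value and first derivative at x."""
    i = cell_search(breakpoints, x)
    x0, x1 = breakpoints[i], breakpoints[i + 1]
    slope = (values[i + 1] - values[i]) / (x1 - x0)
    return values[i] + slope * (x - x0), slope

def spline_integral(breakpoints, values):
    """Integrate the piecewise-linear spline exactly over its whole domain."""
    return sum(0.5 * (values[i] + values[i + 1]) * (breakpoints[i + 1] - breakpoints[i])
               for i in range(len(breakpoints) - 1))

xs = [0.0, 1.0, 2.0, 4.0]
ys = [0.0, 1.0, 4.0, 16.0]  # samples of f(x) = x**2

cell = cell_search(xs, 3.0)                     # 2, the cell [2.0, 4.0)
value, derivative = linear_spline(xs, ys, 3.0)  # (10.0, 6.0)
area = spline_integral(xs, ys)                  # 23.0
```

Higher-order splines, such as cubic splines, follow the same search-then-evaluate pattern with a more accurate polynomial in each cell.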
Vector Math
Balance accuracy and performance with vector-based elementary functions. Manipulate values with traditional algebraic and trigonometric functions.
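The accuracy-versus-performance trade-off can be sketched with an elementwise sine built from a truncated Taylor series, where fewer terms mean less work and a larger error. This is a plain-Python illustration, not how oneMKL implements its vector math functions. The `vector_sin` helper and its `terms` parameter are hypothetical.

```python
import math

def vector_sin(xs, terms):
    """Elementwise sine from a Taylor series truncated to the given number of terms."""
    out = []
    for x in xs:
        term, total = x, x
        for n in range(1, terms):
            # Next odd-power term: multiply by -x^2 / ((2n) * (2n + 1)).
            term *= -x * x / ((2 * n) * (2 * n + 1))
            total += term
        out.append(total)
    return out

xs = [0.1 * i for i in range(16)]  # inputs in [0, 1.5]
exact = [math.sin(x) for x in xs]

fast = vector_sin(xs, terms=3)      # cheaper, lower accuracy
accurate = vector_sin(xs, terms=8)  # more work, higher accuracy

err_fast = max(abs(a - b) for a, b in zip(fast, exact))
err_accurate = max(abs(a - b) for a, b in zip(accurate, exact))
```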
Summary Statistics
Compute basic statistical estimates (such as raw or central sums and moments) for single- and double-precision multidimensional datasets.
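For reference, raw and central moments of a small multidimensional dataset can be computed as in the plain-Python sketch below. It does not use the oneMKL summary statistics interface. The helper names and the row-per-observation layout are assumptions for the example.

```python
def raw_moment(data, order):
    """Raw moment: the mean of x**order."""
    return sum(x ** order for x in data) / len(data)

def central_moment(data, order):
    """Central moment: the mean of (x - mean)**order."""
    mean = raw_moment(data, 1)
    return sum((x - mean) ** order for x in data) / len(data)

# Rows are observations, columns are variables.
dataset = [[2.0, 10.0], [4.0, 10.0], [4.0, 20.0], [4.0, 20.0],
           [5.0, 30.0], [5.0, 30.0], [7.0, 40.0], [9.0, 40.0]]

columns = [list(col) for col in zip(*dataset)]
means = [raw_moment(col, 1) for col in columns]          # [5.0, 25.0]
second_raw = [raw_moment(col, 2) for col in columns]     # [29.0, 750.0]
variances = [central_moment(col, 2) for col in columns]  # [4.0, 125.0]
```

The second central moment computed here is the population variance (divided by n, not n - 1).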
Benchmarks
These benchmarks are offered to help you make informed decisions about which routines to use in your applications, including performance for each major function domain in oneMKL by processor family. Some benchmark charts only include absolute performance measurements for specific problem sizes. Others compare previous versions, popular alternate open source libraries, and other functions for oneMKL.
To assess performance in high-performance computing environments, see the oneMKL Benchmarks Suite. The suite includes the Intel® Distribution for LINPACK* Benchmark, Intel® Distribution for MP LINPACK* Benchmark for clusters, and Intel® Optimized High Performance Conjugate Gradient Benchmark from the latest oneMKL release.
Threaded SGEMM and BF16 GEMM Performance
LAPACK Performance Scaling at High Thread Count
Performance Advantage of oneMKL FFT over the FFTW Library
Fast Fourier Transform Performance
Sparse Matrix-Vector Product Performance
Random Number Generator Performs Best with High Thread Count
Vector Math Function Performance
PARDISO Runs Faster than MUMPS Library
Documentation & Code Samples
Documentation
- Get Started Guide
- Release Notes
- System Requirements
- Developer References: C | Fortran | SYCL
- Developer Guides: Windows* | Linux*
View Current oneMKL Documentation
View Legacy Intel® Math Kernel Library Documentation
Library Linking Guidance
Intel® oneAPI Math Kernel Library Link Line Advisor
This web-based utility identifies which build options for compiler and linker to use with oneMKL, depending on the build environment you use and the feature set you want to enable.
Code Samples
Linear Algebra
- Matrix Multiplication with CPUs and GPUs
Use this sample to examine the oneMKL matrix multiplication functionality.
- Block Cholesky Decomposition
Learn how to use oneMKL routines for matrix multiplication, rank-k updates, triangular solves (BLAS), and Cholesky factorization (LAPACK).
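To show what a Cholesky factorization produces, here is a minimal unblocked version in plain Python. It is not the blocked algorithm from the sample and does not call oneMKL. The blocked variant applies the same arithmetic to submatrices using matrix multiplication, rank-k updates, and triangular solves. The `cholesky` helper is a hypothetical name.

```python
import math

def cholesky(a):
    """Return lower-triangular L with L @ L^T == A for a symmetric positive definite A."""
    n = len(a)
    lower = [[0.0] * n for _ in range(n)]
    for j in range(n):
        diag = a[j][j] - sum(lower[j][k] ** 2 for k in range(j))
        if diag <= 0.0:
            raise ValueError("matrix is not positive definite")
        lower[j][j] = math.sqrt(diag)
        for i in range(j + 1, n):
            off = a[i][j] - sum(lower[i][k] * lower[j][k] for k in range(j))
            lower[i][j] = off / lower[j][j]
    return lower

A = [[4.0, 12.0, -16.0],
     [12.0, 37.0, -43.0],
     [-16.0, -43.0, 98.0]]

L = cholesky(A)  # [[2.0, 0.0, 0.0], [6.0, 1.0, 0.0], [-8.0, 5.0, 3.0]]
```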
Migration to SYCL*
- cuBLAS Migration
Use this set of samples to see how cuBLAS routines are transformed to equivalent oneMKL routines after migrating CUDA-based code to SYCL.
- Matrix Multiplication cuBLAS Migrated
Learn how to migrate your code to SYCL and use it in a high-performance way, offloading computations to GPU or CPU. See how to optimize the migration steps and improve processing time.
- Fourier Correlation
Learn how to implement the Fourier correlation algorithm using SYCL, oneMKL, and Intel® oneAPI DPC++ Library (oneDPL) functions.
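The idea behind Fourier correlation can be sketched in plain Python: transform both signals, multiply one spectrum by the complex conjugate of the other, and transform back. The index of the largest value is the circular shift between the signals. This is an illustration under simplifying assumptions (a direct O(n^2) DFT and hypothetical helper names), not the SYCL sample itself.

```python
import cmath

def dft(signal, inverse=False):
    """Direct DFT; stands in for an FFT to keep the sketch self-contained."""
    n = len(signal)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(v * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                  for t, v in enumerate(signal))
        out.append(acc / n if inverse else acc)
    return out

def fourier_correlation(a, b):
    """Circular cross-correlation: corr[s] = sum over t of a[t] * b[(t + s) % n]."""
    fa, fb = dft(a), dft(b)
    product = [x.conjugate() * y for x, y in zip(fa, fb)]
    return [c.real for c in dft(product, inverse=True)]

reference = [0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0, 0.0]
shifted = [0.0, 0.0, 0.0, 0.0, 1.0, 2.0, 1.0, 0.0]  # reference delayed by 3 samples

corr = fourier_correlation(reference, shifted)
best_shift = max(range(len(corr)), key=corr.__getitem__)  # 3
```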
For Your Industry
- Finance: Monte Carlo European Options
See how to use the oneMKL random number generator (RNG) functionality to compute European option prices.
- Finance: Black-Scholes
Learn how to use vector math and the RNG available in oneMKL to calculate the prices of options using the Black-Scholes formula.
- Healthcare: Computed Tomography Reconstruction
Learn how to use discrete Fourier transform (DFT) routines to transform raw computed tomography (CT) data into a reconstructed image of the scanned object.
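The two finance samples address the same pricing problem from different directions. A plain-Python sketch, using only the standard library in place of oneMKL, prices a European call with the Black-Scholes closed form and checks it against a Monte Carlo estimate. The market parameters, path count, seed, and function names are assumptions chosen for illustration.

```python
import math
import random

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black_scholes_call(spot, strike, rate, vol, maturity):
    """Closed-form Black-Scholes price of a European call option."""
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * maturity) / (vol * math.sqrt(maturity))
    d2 = d1 - vol * math.sqrt(maturity)
    return spot * norm_cdf(d1) - strike * math.exp(-rate * maturity) * norm_cdf(d2)

def monte_carlo_call(spot, strike, rate, vol, maturity, paths, seed):
    """Monte Carlo estimate: average discounted payoff over simulated terminal prices."""
    rng = random.Random(seed)
    drift = (rate - 0.5 * vol ** 2) * maturity
    diffusion = vol * math.sqrt(maturity)
    payoff_sum = 0.0
    for _ in range(paths):
        terminal = spot * math.exp(drift + diffusion * rng.gauss(0.0, 1.0))
        payoff_sum += max(terminal - strike, 0.0)
    return math.exp(-rate * maturity) * payoff_sum / paths

closed_form = black_scholes_call(100.0, 100.0, 0.05, 0.2, 1.0)  # about 10.45
estimate = monte_carlo_call(100.0, 100.0, 0.05, 0.2, 1.0, paths=20000, seed=7)
```

The Monte Carlo estimate converges to the closed-form price at a rate proportional to one over the square root of the number of paths, so the random number generation and the elementwise `exp` calls dominate the run time.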
Training Resources
Understand oneMKL
NEW Intel® oneAPI Math Kernel Library Linking (command-line link tool and link line advisor)
- A Quick Overview of oneMKL [5:47]
- Simple Use of oneMKL for High Performance [9:48]
- A Vendor-Neutral Path to Math Acceleration
- Implement the Fourier Correlation Algorithm Using oneAPI
- Solve Linear Systems Using oneMKL and OpenMP Target Offloading
- oneMKL Verbose Mode: Quick and Easy GPU Library Execution Profiler
Take Advantage of SYCL
NEW How to Move from CUDA Math Library Calls to oneMKL
Specifications
Processors:
- Intel Atom® processors
- Intel® Core™ processors
- Intel® Xeon® Scalable processors
GPUs:
- Intel® UHD Graphics for 11th generation Intel processors or newer
- Intel® Iris® Xe graphics
- Intel® Arc™ graphics
- Intel® Data Center GPU Flex Series
- Intel® Data Center GPU Max Series
Languages:
- SYCL
- C and C++
- Fortran
For more information, see the system requirements.
Operating systems:
- Windows
- Linux
Compilers:
- Intel® oneAPI DPC++/C++ Compiler
- GNU Compiler Collection (GCC)*
- Intel Fortran Compiler
- Intel Fortran Compiler Classic
- Other compilers that follow the same standards
Development environments:
- Windows: Microsoft Visual Studio*
- Linux: Eclipse* and Eclipse CDT (C/C++ Development Tooling)*
Threading models:
- Intel® oneAPI Threading Building Blocks
- OpenMP
Get Help
Your success is our success. Access these support resources when you need assistance.