Intel® oneAPI Math Kernel Library (oneMKL) Release Notes

Where to Find the Release

Intel® oneAPI Math Kernel Library

2025.3.0

System Requirements Bug Fix Log

What's New in All Domains

oneMKL 2025.3 includes support for Xe3 integrated GPUs.
OpenMP* offload functionality was updated to conform to the OpenMP* 6.0 standard.

New Features and Optimizations

Sparse BLAS
- Features
  - New Inspector-Executor C/Fortran APIs for conversion between dense matrix representation and sparse matrix format
    - APIs and descriptions
      - mkl_sparse_?_convert_dense: conversion from any of the supported sparse matrix formats (CSR, CSC, COO and BSR) to dense representation
      - mkl_sparse_?_convert_dense2csr: conversion from dense representation to CSR sparse matrix format
      - mkl_sparse_?_convert_dense2csc: conversion from dense representation to CSC sparse matrix format
      - mkl_sparse_?_convert_dense2coo: conversion from dense representation to COO sparse matrix format
      - mkl_sparse_?_convert_dense2bsr: conversion from dense representation to BSR sparse matrix format
    - The declarations for those routines can be found in the include file mkl_spblas.h. The use of those routines is demonstrated in the following examples:
      - C:
        - examples/c/sparse_blas/source/sparse_convert_dense.c
        - examples/c/sparse_blas/source/sparse_dense2bsr.c
      - Fortran:
        - examples/f/sparse_blas/source/sparse_convert_dense.f90
        - examples/f/sparse_blas/source/sparse_dense2bsr.f90
  - New SYCL API to create a CSC matrix:
    - sparse::set_csc_data(), a routine that takes a sparse::matrix_handle_t for a sparse matrix of dimensions nrows-by-ncols represented in the CSC format and fills the internal state of the matrix handle with the user provided arrays in CSC format. The declaration for this routine can be found in the include file, oneapi/mkl/spblas/sparse_structures.hpp. The use of the new sparse::set_csc_data is demonstrated in the following examples:
      - examples/sycl/sparse_blas/source/csc_gemv.cpp
      - examples/sycl/sparse_blas/source/csc_gemv_usm.cpp
    - So far, only the sparse::gemv routine supports a CSC matrix; support may be extended to other execution routines in future releases. Execution APIs that do not yet support the CSC format will throw an unimplemented oneMKL exception.
  - New SYCL API to create a BSR sparse matrix:
    - sparse::set_bsr_data(), a routine that takes user provided sparse matrix data in the BSR matrix format with square or rectangular dense blocks and fills the internal state of the provided sparse::matrix_handle_t with that matrix data. The declaration for this routine can be found in the include file, oneapi/mkl/spblas/sparse_structures.hpp. The use of the new sparse::set_bsr_data API is demonstrated in the following example:
      - examples/sycl/sparse_blas/source/bsr_gemv_usm.cpp
    - So far, only the sparse::gemv routine with the non-transpose operation supports BSR matrix data; support may be extended to other execution routines in future releases. Execution APIs that do not yet support the BSR format will throw an unimplemented oneMKL exception.
  - The Sparse Triangular Solve (SpTRSV and SpTRSM) APIs in the deprecated NIST Sparse BLAS, the Inspector-Executor Sparse BLAS, and the SYCL Sparse BLAS no longer check for missing or zero diagonal values. They now align with the behavior of dense BLAS TRSV/TRSM, where NaN or Inf is generated from division by zero if a matrix diagonal value is missing (implicitly zero) or is explicitly zero. Previously, these routines checked the diagonal and provided a status or exception indicating an invalid value, but the overhead of such checks is high and, in our estimation, unnecessary.
  - Sparse BLAS APIs for putting format data into matrix handles have been updated for more strict input argument handling. In particular,
    - Inspector-Executor Sparse BLAS C/Fortran routines:
      - For nrows <= 0 and ncols <= 0, the Inspector-Executor routines (mkl_sparse_?_create_xyz) will now return SPARSE_STATUS_INVALID_VALUE.
    - Sparse BLAS SYCL routines:
      - For nrows <= 0 and ncols <= 0, the SYCL routines (sparse::set_xyz_data) will now throw oneapi::mkl::invalid_argument with a more complete error message.
      - Additional checking for various formats related to partial creation of matrix handles to allow cases that show up from sparse::matmat or sparse::omatadd, etc., where a sparse matrix is constructed in stages.
  - A new overload of sparse::set_csr_data has been added that lets users pass in the “nnz” value (referring to the number of stored elements) on host through the API. The newly introduced sparse::set_csc_data and sparse::set_bsr_data also have this along with sparse::set_coo_data where it was required before. The existing overload of sparse::set_csr_data without the nnz has been marked as deprecated in 2025.3 release and should be removed in the major release 2027.0.
    - Motivation for the change: compute environments are increasingly heterogeneous, and users almost always know the nnz value because they allocated the arrays themselves. Retrieving nnz from the input row pointer array, which may live on an accelerator, requires extra work. We therefore chose to make the “nnz” parameter part of the format definition in the SYCL APIs.
- Optimizations
  - Improved performance of mkl_sparse_?_trsv and mkl_sparse_?_symgs for single, complex single and complex double data types for Intel® CPUs with AVX2 or AVX512 ISA available.
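
The triangular solve change described above means that a missing or zero diagonal now shows up as Inf or NaN in the solution rather than as an error status. The following is a minimal, library-independent sketch in plain C. It does not call oneMKL, and the names csr_lower_solve, csr_has_nonzero_diagonal and all_finite are hypothetical. It shows a zero-based CSR forward substitution that divides by the diagonal unconditionally, plus two optional checks that callers who relied on the old status could run themselves. It assumes IEEE 754 floating-point arithmetic.

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical helper, not a oneMKL routine: forward substitution L*x = b for a
 * lower-triangular matrix in zero-based CSR. No diagonal check is performed, so
 * a missing (implicitly zero) or explicitly zero diagonal yields Inf or NaN. */
void csr_lower_solve(int n, const int *row_ptr, const int *col_ind,
                     const double *val, const double *b, double *x)
{
    for (int i = 0; i < n; ++i) {
        double sum = b[i];
        double diag = 0.0; /* stays zero if the diagonal entry is not stored */
        for (int k = row_ptr[i]; k < row_ptr[i + 1]; ++k) {
            int j = col_ind[k];
            if (j < i)
                sum -= val[k] * x[j];
            else if (j == i)
                diag = val[k];
        }
        x[i] = sum / diag; /* IEEE 754: nonzero/0 gives Inf, 0/0 gives NaN */
    }
}

/* Hypothetical pre-check: true only if every row stores a nonzero diagonal. */
bool csr_has_nonzero_diagonal(int n, const int *row_ptr,
                              const int *col_ind, const double *val)
{
    for (int i = 0; i < n; ++i) {
        bool found = false;
        for (int k = row_ptr[i]; k < row_ptr[i + 1]; ++k)
            if (col_ind[k] == i && val[k] != 0.0)
                found = true;
        if (!found)
            return false;
    }
    return true;
}

/* Hypothetical post-check: true if the solution contains no Inf or NaN. */
bool all_finite(int n, const double *x)
{
    for (int i = 0; i < n; ++i)
        if (!isfinite(x[i]))
            return false;
    return true;
}
```

The pre-check costs one extra pass over the matrix, which is the kind of overhead the library no longer pays on every call; the post-check only scans the solution vector.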

LAPACK
- Features
  - New Fortran OpenMP* offload interfaces for Intel® GPU were introduced for getri_oop_batch (group), getri_oop_batch_strided, getrfnp_batch (group).
- Optimizations
  - Improved performance of LAPACK generalized eigensolvers, {s,d}ggev, {s,d}ggev3, {s,d}gges3 for both sequential and OpenMP* threading.
  - Improved performance of geqrf for tall-skinny complex matrices on Intel® CPU.
  - Improved performance of batch Cholesky factorization (potrf_batch) for Intel® Arc™ A-series GPUs.
  - Improved performance of SVD and least squares solvers for complex precisions, and TRTRI for all precisions on Intel® CPUs.
DFT
- Features
  - Improved exception handling of the SYCL DFT APIs, especially for non-supported dimensions.
- Optimizations
  - Improved performance of 2/3D real and complex FFT of size between 2^11 and 2^21 on Intel® Data Center GPU Max Series.
Vector Statistics
- Optimizations
  - Improved performance of mrg32k3a basic random number generator on Intel® CPUs with AVX512 ISA available.
Sparse Solvers
- Optimizations
  - Improved load balancing when running on multiple threads in the factorization phase of oneMKL Pardiso.
Library Engineering
- Features
  - Frame pointers have been enabled for the host part of the SYCL-based library.

Known Issues and Limitations

oneMKL DFT SYCL APIs using SYCL buffer for data input do not support SYCL sub-buffer inputs for large power-of-two sizes [2²¹,2²⁶] 1D complex FFT.
oneMKL FFT with a large prime factor (greater than 1024) may fail on Intel® Data Center GPU Max Series.
oneMKL DFT SYCL APIs don’t support negative strides and distances.
oneMKL SYCL functionality requires OpenCL* runtime when using the Level Zero SYCL backend on Intel® GPUs.
oneMKL LAPACK direct call routines on Windows with TBB threading may crash. OpenMP threading is recommended.
oneMKL SYCL Sparse BLAS with recently added BSR sparse matrix format only supports use of USM matrix data for CPUs; sycl::buffers are not supported on SYCL CPU.
The Intel® optimized HPCG benchmark for CPU shipped with Intel® oneMKL 2025.3 release will exit early with a runtime error message in the logs of “Error in mkl_sparse_d_create_csr: failed to create sparse matrix B.” when run with 1 MPI process or without MPI. It will work as expected for more than 1 MPI process. If you observe this issue, try using more than 1 MPI process in your run. A fix will be provided in the next release to support all configurations.
In the 2025.2 and 2025.3 releases, the oneMKL Sparse BLAS APIs mkl_sparse_convert_csr, mkl_sparse_convert_csc and mkl_sparse_convert_bsr fail to reduce duplicate elements when converting from COO format. To convert from COO with duplicate elements, you may use the same APIs from release 2025.1 or earlier, or write your own summation routine to combine the duplicates before conversion. A fix will be added in the 2026.0 release so that conversion from COO format includes duplicate removal.
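
For the COO duplicate issue above, a user-written summation routine could look roughly like the following. This is a sketch only: the coo_entry type and the coo_sum_duplicates function are hypothetical names introduced here, the code does not call oneMKL, and it assumes the entries are held as an array of (row, column, value) triplets rather than as the three separate arrays that the Sparse BLAS creation routines take.

```c
#include <stdlib.h>

/* Hypothetical triplet type for this sketch. */
typedef struct {
    int row;
    int col;
    double val;
} coo_entry;

/* Orders entries by row first, then by column. */
int coo_entry_cmp(const void *a, const void *b)
{
    const coo_entry *x = (const coo_entry *)a;
    const coo_entry *y = (const coo_entry *)b;
    if (x->row != y->row)
        return (x->row > y->row) - (x->row < y->row);
    return (x->col > y->col) - (x->col < y->col);
}

/* Sorts entries by (row, col) and sums values that share the same coordinates.
 * Works in place and returns the number of unique entries that remain. */
int coo_sum_duplicates(coo_entry *e, int nnz)
{
    if (nnz <= 0)
        return 0;
    qsort(e, (size_t)nnz, sizeof *e, coo_entry_cmp);
    int out = 0; /* index of the last unique entry written so far */
    for (int k = 1; k < nnz; ++k) {
        if (e[k].row == e[out].row && e[k].col == e[out].col)
            e[out].val += e[k].val;
        else
            e[++out] = e[k];
    }
    return out + 1;
}
```

After the call, the first entries of the array (up to the returned count) hold the deduplicated matrix and can be copied into separate row, column and value arrays before the handle is created and converted.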

Deprecation

The oneMKL static SYCL library is deprecated from the 2025.2 release and will be removed in the 2026.0 release. Users are encouraged to use the oneMKL dynamic SYCL domain-specific libraries instead. For guidance on linking against oneMKL, please refer to the Link Line Advisor.
The NIST Sparse BLAS Level 2 and Level 3 APIs have been marked as deprecated since at least 2018. They will be officially removed from the oneMKL product in the 2026.0 release. New conversion APIs between sparse formats and between sparse and dense formats have been added in oneMKL 2025.3.
Support for the OpenCL* backend on Intel® GPU has been deprecated and will be removed in the oneMKL 2026.0 release.
oneMKL vars scripts will stop appending CPATH starting from the oneMKL 2026.0 release. IFORT users will need to append CPATH manually if required. Other users will not be affected.
The existing overload of sparse::set_csr_data without the nnz has been marked as deprecated in 2025.3 release and should be removed in the major release 2027.0.
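
As background for the set_csr_data deprecation above: when nnz is not passed explicitly, it has to be recovered from the row pointer array. The plain-C sketch below (hypothetical helper names, no oneMKL calls) builds zero-based CSR arrays from a row-major dense matrix, in the spirit of the dense-to-CSR conversion mentioned above, and shows both ways of obtaining nnz. On the host the second way is a cheap read; if the row pointer array lives in device memory, the same read implies a transfer.

```c
#include <stddef.h>

/* Hypothetical helper: converts a row-major dense nrows-by-ncols matrix to
 * zero-based CSR, skipping exact zeros. row_ptr must hold nrows + 1 entries;
 * col_ind and val must hold at least as many entries as there are nonzeros.
 * Returns nnz so the caller can carry it alongside the arrays. */
int dense_to_csr(int nrows, int ncols, const double *dense,
                 int *row_ptr, int *col_ind, double *val)
{
    int nnz = 0;
    for (int i = 0; i < nrows; ++i) {
        row_ptr[i] = nnz; /* row i starts at the current fill position */
        for (int j = 0; j < ncols; ++j) {
            double a = dense[(size_t)i * (size_t)ncols + (size_t)j];
            if (a != 0.0) {
                col_ind[nnz] = j;
                val[nnz] = a;
                ++nnz;
            }
        }
    }
    row_ptr[nrows] = nnz;
    return nnz;
}

/* What an interface without an nnz argument has to do instead: read both ends
 * of the row pointer array. The difference is the same for zero-based and
 * one-based indexing. */
int csr_nnz_from_row_ptr(int nrows, const int *row_ptr)
{
    return row_ptr[nrows] - row_ptr[0];
}
```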

2025.2.0

System Requirements Bug Fix Log

New Features and Optimizations

Sparse BLAS
- Features
  - Introduced new optimize_gemm routines for SYCL. These functions prepare sparse matrix data to enable more efficient execution of GEMM.
  - Introduced new routines mkl_sparse_convert_coo and mkl_sparse_convert_csc for C/Fortran. The functions convert internal matrix representation to COO format and CSC format respectively.
  - Introduced new routine mkl_sparse_?_export_coo for C/Fortran. The function exports COO matrix from internal representation.
- Optimizations
  - Improved performance for IE Sparse BLAS APIs for CSR SpMV (mkl_sparse_?_mv)/SpMM (mkl_sparse_?_mm) and COO SpMV (mkl_sparse_?_mv) on Intel CPUs.
  - Improved performance for SYCL Sparse BLAS APIs with COO SpMV/SpMM (sparse::gemv/sparse::gemm) on Intel® Iris® Xe MAX graphics, Intel® Arc™ GPU, Intel® Data Center GPU and other Intel GPUs.

LAPACK
- Features
  - Introduced exception handling in batch strided factorization routines: getrf_batch_strided, getrfnp_batch_strided, and potrf_batch_strided for singular matrices.
  - Introduced new routines and integrated bug fixes from Netlib LAPACK 3.12.1. oneMKL LAPACK functionality is now aligned with Netlib LAPACK 3.12.1.
  - Introduced a setting for the number of threads for LAPACK.
- Optimizations
  - Improved performance in dsyevd eigensolver.
  - Improved performance of geqrf on CPU architectures.
  - Improved performance for getrf_batch and potrf_batch_strided routines on Intel® Data Center GPU Max Series.
DFT 
- Features
  - Introduced distributed SYCL DFT API with Intel® Data Center GPU Max Series support for computing 2/3D non-batch FFT on multiple GPUs.
  - Enabled the move operator in the SYCL DFT descriptor class.
  - Improved error catching for the C/Fortran DFT OpenMP offload APIs.
- Optimizations
  - Improved performance of small (length in each dimension up to 64) 2D complex FFT on Intel® Data Center GPU Max Series, Intel® Arc™ B-series graphics, and GPUs on Intel® Core™ Ultra Processors (Series 2) (products formerly Lunar Lake).
  - Improved performance on Intel® Data Center GPU Max Series, Intel® Arc™ B-series graphics, and GPUs on Intel® Core™ Ultra Processors (Series 2) (products formerly Lunar Lake) of 1/2/3D complex and real FFTs for sizes not factorizable as two factors less than 64.
  - Improved performance of odd size real batched 1D FFTs on Intel® Iris® Xe MAX graphics, Intel® Arc™ GPU, Intel® Data Center GPU and other Intel GPUs.
Vector Statistics
- Features
  - Introduced counter_engine_adaptor and (u)int8/16 types support in SYCL RNG Device API.
  - Introduced support for SkipAhead parallelization method for MT2203 engine (C/Fortran/SYCL API).
  - Introduced new permuted congruential generator (PCG) Device API engine.
- Optimizations
  - Improved performance of generalized feedback shift register generator (R250).
Sparse Solvers
- Optimizations
  - Improved performance in the factorization phase of PARDISO.
Library Engineering
- Features
  - Completed Control Flow Guard support on Windows for oneMKL and Custom Dynamic-link Library builder.

Known Issues and Limitations

oneMKL SYCL DLL could leak memory after unloading on Windows*. The problem can be avoided by adding mkl_free_buffer before unloading the DLL.
oneMKL FFT may raise a segmentation fault for small real FFTs when used with 6 threads on CPU. As a workaround, use another number of threads.
oneMKL DFT SYCL* APIs using SYCL* buffer for data input do not support SYCL* sub-buffer inputs for a range of large power of two sizes [2²¹,2²⁶] 1D complex FFT.
oneMKL FFT with a large prime factor (larger than 1024) may fail on Intel® Data Center GPU Max Series.
oneMKL DFT SYCL* APIs don't support negative strides or distances.
oneMKL LAPACK cgesvd and zgesvd might not achieve best performance on CPUs supporting AVX512 instructions for matrices with a very small number of columns (N<150) and many more rows than columns.
oneMKL LAPACK cgels and zgels might not achieve best performance on CPUs supporting AVX512 instructions for matrices with many more rows than columns.
oneMKL BLAS gemm calls using BF16 or integer precision data might not achieve best performance on Intel® Arc™ B-Series Graphics discrete GPUs.
oneMKL BLAS gemm calls using OpenMP offload interface might not achieve best performance on Intel® Arc™ B-series graphics discrete GPUs. The SYCL interfaces are recommended for these cases.
oneMKL SYCL* functionality requires OpenCL* runtime in case of Level Zero SYCL* backend on Intel GPUs.
Some oneMKL functions may crash on Windows* with dynamic linking when Control Flow Guard (CFG), /guard:cf, is enabled at both compile time and link time. As a workaround, use static linking or disable CFG at link time.

Deprecation

The oneMKL static SYCL library is deprecated and will be removed in the oneMKL 2026.0 release. Users are encouraged to use the oneMKL dynamic SYCL domain-specific libraries instead. For guidance on linking against oneMKL, please refer to the Link Line Advisor.
Support for OpenCL* backend is deprecated and will be removed in future releases.

2025.1.0

System Requirements Bug Fix Log

New Features and Optimizations

BLAS
- Features
  - Introduction of GEMM APIs with support for 8-bit floating point numbers in both the e4m3 and e5m2 variants.
- Optimizations
  - Improved performance of single precision GEMM operations with small m and n, large k on the Graphics for Intel® Core™ Ultra Processors (Series 1).

Sparse BLAS
- Optimizations
  - Improved performance of mkl_sparse_?_mv API on Intel® Advanced Vector Extensions 2 (Intel® AVX2) architecture for the BSR and CSR matrix formats in many cases.
  - Improved performance of sparse::gemv and sparse::trsm SYCL APIs for some workloads on Intel GPUs with the CSR format.
LAPACK
- Features
  - Introduced new routines and integrated bug fixes from Netlib LAPACK 3.12.0. New functionality includes the Dynamic Mode Decomposition routines and truncated QR with column pivoting. oneMKL LAPACK functionality is now aligned with Netlib LAPACK 3.12.0.
- Optimizations
  - Improved the performance of double precision LU and QR factorizations on Intel® Xeon® processors with Intel® Advanced Vector Extensions 512 (Intel® AVX-512) architecture and high thread count.
  - Improved the performance of Batched group LU factorization on Intel GPUs for use cases that have a large number of groups.
DFT 
- Optimizations
  - Improved 1/2/3D real and complex FFT performance on Intel® Arc™ B-series Graphics, Graphics for Intel® Core™ Ultra Processors (Series 2) and Intel® Data Center GPU Max Series when the length in the last dimension can be factorized as 3 or 4 small (<64) factors and does not exceed 8192, 16384, 16384 and 32768 for double precision complex FFT, single precision complex FFT, double precision real FFT and single precision real FFT, respectively.
Vector Math
- Features
  - Added GPU kernel implementations for LA (low accuracy) and EP (enhanced performance accuracy) versions for several functions: single precision exp, exp2, exp10 and double precision log10.
- Optimizations
  - Improved performance of the CPU implementations of single and double precision real VM functions, ln, pow, sin, erf, tanh and atan2, for all accuracy versions (HA, LA, and EP).
  - Improved performance of the CPU implementations of single and double precision complex VM functions, sin, asin, asinh, cos, acos, acosh, tanh, and atanh, for all accuracy versions.
Vector Statistics
- Features
  - Introduced Geometric distribution support in RNG Device API.
Sparse Solvers
- Optimizations
  - The performance of the numerical factorization algorithm in the PARDISO solvers has been improved for a variety of workloads.
ScaLAPACK
- Features
  - Aligned oneMKL ScaLAPACK functionality and behavior with Netlib ScaLAPACK version 2.2.2.
Library Engineering
- Features
  - Multiple new macros and enums are introduced in the header files to query oneMKL’s compliance with the oneMATH specification. These macros are domain-specific. Refer to oneMKL Developer Guide for details and usage examples.

Known Issues and Limitations

oneMKL SYCL* functionality requires OpenCL* runtime in case of Level Zero SYCL* backend on Intel GPUs.
oneMKL DFT SYCL* APIs using SYCL* buffer for data input do not support SYCL* sub-buffer inputs for a range of large power of two sizes [2²¹,2²⁶] 1D complex FFT.
oneMKL FFT with a large prime factor (larger than 1024) may fail on Intel® Data Center GPU Max Series.
Negative strides and distances are not supported with the oneMKL DFT SYCL* APIs.
oneMKL FFT may crash on the Graphics for Intel® Core™ Ultra Processors Series 1 with Windows* when using the C or Fortran OpenMP* offload APIs with the OpenCL* runtime. Use the Level Zero runtime instead on these platforms.
There are sporadic segmentation faults for gemm and gemmt routines using double precision on Intel® Data Center GPU Max Series.
OpenMP offload of sormqr and cunmqr routines on Intel® Arc™ A-series GPUs with the OpenCL* backend may produce wrong results. Use the Level Zero backend instead on this platform.
ScaLAPACK symmetric or Hermitian eigenvalue solvers p{he|sy}evr may fail if the number of MPI ranks is larger than the number of computed eigenvalues.
Installing or updating the development packages with APT/YUM might produce a warning that libmkl_sycl.so is not an ELF file. This is expected behavior because, starting from 2024.0, the mkl_sycl library is split into domain-specific libraries, which is particularly helpful when redistributing oneMKL with only specific domain functionality. libmkl_sycl.so is not an ELF file but a linker script that links the application with all oneMKL supported domains using the INPUT command:
INPUT(-lmkl_sycl_blas -lmkl_sycl_lapack -lmkl_sycl_sparse -lmkl_sycl_dft -lmkl_sycl_vm -lmkl_sycl_rng -lmkl_sycl_stats -lmkl_sycl_data_fitting)
Hence, the ASCII file type is expected. Your installation will contain the files listed above within the INPUT command. More on linker scripts and commands: Using LD, the GNU linker - Command Language.

2025.0.1

System Requirements

This is a bugfix release.

Fixed Issues

Fixed issues in some BLAS functions on AMD hardware in Windows*.
Fixed accumulated execution time in subsequent OpenMP* offload calls.
The runme scripts for the Intel® Optimized LINPACK* Benchmark and Intel® Distribution for LINPACK* Benchmark have been replaced with documentation to resolve security issues.

Optimization

Improved OpenMP* offload performance in LAPACK.

2025.0.0

System Requirements Bug Fix Log

CET Support

CET support has been enabled for all oneMKL domains. For more information about CET, please refer to the following article, A Technical Look at Intel’s Control-flow Enforcement Technology.

New Features and Optimizations

BLAS
- Features
  - New out-of-place TRMM and TRSM variants are available for C and Fortran, including support for OpenMP* offload to Intel® GPUs.
- Optimizations
  - Improved cblas_gemm_s8u8s32 and cblas_gemm_bf16bf16f32 performance for large problem size on Intel® Advanced Matrix Extensions (Intel® AMX) architecture.

Sparse BLAS
- Features
  - New API for computing addition of two sparse matrices, oneapi::mkl::sparse::omatadd, is available for SYCL with support for CSR sparse matrix format.
  - New out-of-place sparse matrix conversion API, oneapi::mkl::sparse::omatconvert, is available for SYCL with support for conversion between the CSR and COO sparse matrix formats.
- Optimizations
  - Improved performance for BSR/CSR oneMKL Inspector-Executor Sparse BLAS C/Fortran APIs, mkl_sparse_?_mv and mkl_sparse_?_mm with AVX512 ISA.
LAPACK
- Features
  - Enabled C/Fortran OpenMP* offload support for least squares solver (?gels).
  - Enabled Fortran OpenMP* offload support for batched group LU factorization (?getrf_batch).
  - Added support for mkl_progress in Relatively Robust Representations eigensolver (?syevr).
  - Introduced support for least squares using QR/LQ with T matrix (?gelst) to LAPACK95 interfaces.
  - Updated LAPACK SYCL* USM APIs to be const correct for input arrays.
- Optimizations
  - Improved performance for double real/complex precision expert eigensolver (lapack::syevx, lapack::heevx) and generalized eigensolver (lapack::sygvx, lapack::hegvx) USM APIs on Intel® Data Center GPU Max Series as well as for C and Fortran OpenMP* offloading (dsyevx, zheevx, dsygvx, zhegvx).
  - Improved performance for batched group LU factorization (lapack::getrf_batch) on Intel® Data Center GPU Max Series, for multiple groups when all matrix sizes are <= 96.
  - Improved performance for batched group LU solve (lapack::getrs_batch) on Intel® GPUs, for multiple groups when all matrix sizes are <=32.
  - Improved performance of eigensolver (?syev, ?heev) on CPU for very large matrices (n>100K).
DFT 
- Features
  - Introduced new type-safe SYCL* DFT APIs.
Vector Math
- Features
  - Improved the oneMKL VM exception reporting mechanism, for certain functions which were not raising the overflow exceptions (e.g., vAdd, vSub, vMul).
- Optimizations
  - Improved performance for 6 functions on Intel® Xeon® 6 Processors with Efficient-Cores (vdAcosh_EP, vsExpm1_HA, vsHypot_LA, vdRound_LA, vdRound_EP, vdRound_HA) by adjusting the unrolling factors.

Vector Statistics
- Features
  - Introduced Beta and Gamma distributions support in RNG Device API.
  - Introduced uint64_t type support for uniform distribution in RNG Device API.
  - Introduced (u)int8/int16 types support for Bernoulli distribution in RNG Device API.
Sparse Solvers
- Features
  - Improved iterative refinement feature of oneMKL PARDISO and added optional printing of iterative refinement information.
- Optimizations
  - Improved performance of oneMKL PARDISO phase 1.

Library Engineering
- Features
  - A new macro __INTEL_MKL_PATCH__ and a new field PatchVersion in MKLVersion structure are introduced for oneMKL patch version. In addition, the existing macro INTEL_MKL_VERSION now follows a new format (__INTEL_MKL__ * 100 + __INTEL_MKL_UPDATE__) * 100 + __INTEL_MKL_PATCH__, which implies that INTEL_MKL_VERSION will be 20250100 in oneMKL 2025.1.
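
The version formula above can be checked with plain integer arithmetic. The sketch below does not include any oneMKL header; sketch_version_number and sketch_version_split are hypothetical functions that stand in for the macro expansion and its inverse.

```c
/* Hypothetical stand-in for the documented formula
 * (__INTEL_MKL__ * 100 + __INTEL_MKL_UPDATE__) * 100 + __INTEL_MKL_PATCH__. */
long sketch_version_number(long major, long update, long patch)
{
    return (major * 100 + update) * 100 + patch;
}

/* Splits a version number in the new format back into its three parts.
 * Assumes update and patch are each in the range 0..99. */
void sketch_version_split(long version, long *major, long *update, long *patch)
{
    *patch = version % 100;
    *update = (version / 100) % 100;
    *major = version / 10000;
}
```

With major 2025, update 1 and patch 0 the first function returns 20250100, matching the value stated above for oneMKL 2025.1. In application code the same comparison would normally be done in the preprocessor against INTEL_MKL_VERSION itself.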

Known Issues and Limitations

Some BLAS and LAPACK functions may encounter runtime errors on AMD hardware in Windows*. The fix is available starting with oneMKL version 2025.0.1.
oneMKL SYCL* functionality requires OpenCL* runtime in case of Level Zero SYCL* backend on Intel GPUs.
BLAS gemm_batch_span may fail with complex double precision data on the Graphics for Intel® Core™ Ultra 200S series processor.
BLAS gemm may produce wrong results for small matrices on the Graphics of Intel® Core™ Ultra Processors Series 2 if the beginning of the matrix data is not aligned to a 64-bit boundary.
OpenMP offload of some BLAS functions may hang or crash when using the OpenCL* backend on Intel® Arc™ A-Series Graphics. It is recommended to use the Level Zero backend in this case.
Performance regressions may be observed with this release compared to 2024.1 or older releases on AVX2 and older CPU architectures in C/Fortran Inspector-Executor Sparse BLAS oneMKL routines.
oneMKL DFT SYCL* APIs using SYCL* buffer for data input do not support SYCL* sub-buffer inputs for a range of large power of two sizes [2²¹,2²⁶] 1D complex FFT.
oneMKL FFT with a large prime factor (larger than 1024) may fail on Intel® Data Center GPU Max Series.
Negative strides and distances are not supported with the oneMKL DFT SYCL* APIs.
Some BLAS and FFT problems may crash on Intel® Arc™ B-Series Graphics with Linux when using the SYCL* buffer APIs. Use the SYCL* USM API instead on these platforms.
oneMKL FFT may crash on the Graphics for Intel® Core™ Ultra Processors Series 1 with Windows* when using the C or Fortran OpenMP* offload APIs with the OpenCL* runtime. Use the Level Zero runtime instead on these platforms.
Inspector-Executor Sparse BLAS API mkl_sparse_?_mm() may give incorrect results when used with BSR format and column major blocks with block_size >= 6.
Some C/Fortran OpenMP* offload examples are known to fail with oneMKL on Intel® Arc™ B-Series Graphics under Windows* when run in Debug mode due to a driver issue. Please use Release mode for this functionality on Intel® Arc™ B-Series Graphics under Windows*.
The forward/backward solve phase (phase 3) of PARDISO behaves incorrectly for some matrices. The workaround is to set iparm[7] to 2 in C or iparm(8) to 2 in Fortran.

Deprecation

oneMKL support for Cloudera Distribution Channel has been deprecated since 2025.0 and will be removed starting from 2026.0.
The INPUT_STRIDES and OUTPUT_STRIDES configuration parameters have been deprecated for the oneMKL SYCL* DFT APIs since the 2024.1 release, and will be removed in the oneMKL 2026.0 release. Please use the FWD_STRIDES and BWD_STRIDES configuration parameters instead.
The variadic set_value and get_value member functions of the oneapi::mkl::dft::descriptor class have been deprecated and will be removed in the oneMKL 2026.0 release. Use the new non-variadic functions instead.
The oneapi/mkl/dfti.hpp header file has been deprecated and will be removed in the oneMKL 2026.0 release. Use the newly introduced oneapi/mkl/dft.hpp header file instead.
oneapi::mkl::dft::config_param::VERSION has been deprecated and will be removed in the oneMKL 2026.0 release. Use MKL_Get_Version_String function instead.
The CONJUGATE_EVEN_STORAGE and PACKED_FORMAT values have been deprecated from the oneapi::mkl::dft::config_param enum class and the COMPLEX_REAL, CCE_FORMAT, PERM_FORMAT, PACK_FORMAT and CCS_FORMAT have been deprecated from the oneapi::mkl::dft::config_value enum class. They will be removed in the oneMKL 2026.0 release.

Removal

Support for the target variant dispatch construct in the Intel Extensions to OpenMP* has been removed. Users should use the dispatch construct from the OpenMP* specification instead.
The NUMBER_OF_USER_THREADS, TRANSPOSE, ORDERING and REAL_STORAGE values have been removed from the oneapi::mkl::dft::config_param enum class. The corresponding DFTI_ORDERED, DFTI_BACKWARD_SCRAMBLED, and DFTI_NONE have been removed from the set of possible configuration values for the SYCL* DFT APIs.
The variants of oneapi::mkl::sparse::set_csr_data() and oneapi::mkl::sparse::release_matrix_handle() without a sycl::queue as an argument which were deprecated in the 2023.0 release have been removed in the 2025.0 release from Sparse BLAS.
The undocumented LAPACK routines {S,D}COMBSSQ, which were removed from Netlib LAPACK 3.10.1 and deprecated in the oneMKL 2023.0 release, have been removed in the oneMKL 2025.0 release.
Previously deprecated std::vector based constructors have been removed for gaussian_mv, multinomial and poisson_v Host API random number distributions.

Previous oneAPI Releases

Notices and Disclaimers

Intel technologies may require enabled hardware, software or service activation.

No product or component can be absolutely secure.

Your costs and results may vary.

No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document.

The products described may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request.

Intel disclaims all express and implied warranties, including without limitation, the implied warranties of merchantability, fitness for a particular purpose, and non-infringement, as well as any warranty arising from course of performance, course of dealing, or usage in trade.
