Developer Guide
Developer Guide for Intel® oneAPI Math Kernel Library Linux*
ID
766690
Date
11/07/2023
Public
Example of Data Alignment
To get the best performance from Intel® oneAPI Math Kernel Library (oneMKL), or to get reproducible results from run to run of oneMKL functions, you need to align your data arrays. The following example shows how to align an array on 64-byte boundaries. To do this, use mkl_malloc() in place of system-provided memory allocators, as shown in the code example below.
Aligning Addresses on 64-byte Boundaries
// ******* C language *******
...
#include <stdlib.h>
#include <mkl.h>
...
void *darray;
int workspace;
// Set value of alignment
int alignment=64;
...
// Allocate aligned workspace
darray = mkl_malloc( sizeof(double)*workspace, alignment );
...
// call the program using oneMKL
mkl_app( darray );
...
// Free workspace
mkl_free( darray );
! ******* Fortran language *******
...
! Set value of alignment
integer alignment
parameter (alignment=64)
...
! Declare oneMKL routines
#ifdef _IA32
integer mkl_malloc
#else
integer*8 mkl_malloc
#endif
external mkl_malloc, mkl_free, mkl_app
...
double precision darray
pointer (p_wrk,darray(1))
integer workspace
...
! Allocate aligned workspace
p_wrk = mkl_malloc( %val(8*workspace), %val(alignment) )
...
! call the program using oneMKL
call mkl_app( darray )
...
! Free workspace
call mkl_free(p_wrk)
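To confirm that a buffer really starts on a 64-byte boundary, you can test the low bits of its address. The following is a minimal sketch, not part of oneMKL: is_aligned() and alloc_doubles_64() are hypothetical helper names, and the C11 aligned_alloc() function is used only as a stand-in allocator so that the sketch compiles without oneMKL installed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 if ptr lies on an `alignment`-byte boundary.
   Assumes alignment is a power of two (64 in this guide's example). */
int is_aligned(const void *ptr, size_t alignment)
{
    return ((uintptr_t)ptr & (uintptr_t)(alignment - 1)) == 0;
}

/* Hypothetical helper: allocates room for n doubles on a 64-byte boundary.
   Stand-in allocator: C11 aligned_alloc(), released with free().
   With oneMKL, the allocation would instead be
   mkl_malloc( sizeof(double)*n, 64 ), released with mkl_free(). */
double *alloc_doubles_64(size_t n)
{
    const size_t alignment = 64;
    size_t bytes = n * sizeof(double);

    /* C11 asks for a size that is a multiple of the alignment: round up. */
    size_t padded = (bytes + alignment - 1) / alignment * alignment;
    if (padded == 0) {
        padded = alignment;
    }

    double *p = aligned_alloc(alignment, padded);

    /* A successful allocation must start on the requested boundary. */
    assert(p == NULL || is_aligned(p, alignment));
    return p;
}
```

The same is_aligned() check can be applied to a pointer returned by mkl_malloc(); only the allocation and release calls differ.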