Using oneMKL Verbose Mode
When building applications that call Intel® oneAPI Math Kernel Library functions, it may be useful to determine:
- which computational functions are called
- what parameters are passed to them
- how much time is spent to execute the functions
- (for GPU applications) which GPU device the kernel is executed on
You can get an application to print this information to a standard output device by enabling Intel® oneAPI Math Kernel Library Verbose. Functions that can print this information are referred to as verbose-enabled functions.
When Verbose mode is active in an Intel® oneAPI Math Kernel Library domain, every call of a verbose-enabled function finishes by printing a human-readable line describing the call. However, if your application is terminated during the function call, no information for that function is printed. The first call to a verbose-enabled function also prints a version information line.
For GPU applications, the first call to a verbose-enabled function also prints one or more GPU information lines after the version information line, which is printed for the host CPU. If more than one GPU is detected, each GPU device is printed on a separate line.
Verbose is implemented differently for CPU applications and for GPU applications. The Intel® oneAPI Math Kernel Library Verbose mode has two modes when used with CPU applications: disabled (default) and enabled. It has three modes when used with GPU applications: disabled (default), enabled without timing, and enabled with synchronous timing.
To change the verbose mode, either set the environment variable MKL_VERBOSE:

| | CPU application | GPU application |
|---|---|---|
| Set MKL_VERBOSE to 0 | to disable Verbose | to disable Verbose |
| Set MKL_VERBOSE to 1 | to enable Verbose | to enable Verbose without timing |
| Set MKL_VERBOSE to 2 | to enable Verbose | to enable Verbose with synchronous timing |

or call the support function mkl_verbose(int mode):

| | CPU application | GPU application |
|---|---|---|
| Call mkl_verbose(0) | to disable Verbose | to disable Verbose |
| Call mkl_verbose(1) | to enable Verbose | to enable Verbose without timing |
| Call mkl_verbose(2) | to enable Verbose | to enable Verbose with synchronous timing |

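As a minimal sketch of the environment-variable route, the Python helper below launches a command with MKL_VERBOSE set in the child's environment and separates the Verbose lines from the rest of its standard output. The helper name `run_with_verbose` is hypothetical, and the filter assumes that Verbose lines begin with the `MKL_VERBOSE` prefix, which this section does not state; adjust it if your output differs.

```python
import os
import subprocess

VERBOSE_PREFIX = "MKL_VERBOSE"  # assumed line prefix; not stated in this section


def run_with_verbose(command, mode=1):
    """Run `command` with MKL_VERBOSE=mode and split its standard output.

    Returns (verbose_lines, other_lines). Modes follow the tables in this
    section: 0 disables Verbose, 1 enables it, and 2 enables it with
    synchronous timing for GPU applications (for CPU applications, 2 also
    simply enables Verbose).
    """
    if mode not in (0, 1, 2):
        raise ValueError("MKL_VERBOSE must be 0, 1, or 2")
    # Copy the current environment and override only MKL_VERBOSE.
    env = dict(os.environ, MKL_VERBOSE=str(mode))
    result = subprocess.run(
        command, env=env, stdout=subprocess.PIPE, text=True, check=True
    )
    verbose, other = [], []
    for line in result.stdout.splitlines():
        (verbose if line.startswith(VERBOSE_PREFIX) else other).append(line)
    return verbose, other
```

For example, `run_with_verbose(["./my_app"], mode=1)` would return the Verbose lines printed by a hypothetical `./my_app` binary linked against oneMKL. Setting the variable only in the child's environment leaves the calling process unchanged.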
Verbose with CPU Applications
For CPU applications, Verbose output consists of a version information line and call description lines.
For CPU applications, you can enable Intel® oneAPI Math Kernel Library Verbose mode in these domains:
- BLAS (and BLAS-like extensions)
- LAPACK
- ScaLAPACK (selected functionality)
- FFT
Verbose with GPU Applications
The verbose feature is enabled for GPU applications that use the DPC++ API or the C/Fortran API with OpenMP offload. For GPU applications, the verbose mode also controls whether execution time is measured. Timing is taken synchronously, so if verbose is enabled with timing, kernel executions become synchronous (an earlier kernel blocks later kernels).
For GPU applications, Verbose output consists of a version information line, GPU information lines, and call description lines.
Timing for GPU applications is reported for overall execution. For selected functionality, device execution time can also be reported if the input queue was created with profiling information enabled.
For GPU applications, you can enable Intel® oneAPI Math Kernel Library Verbose mode in these domains:
- BLAS (and BLAS-like extensions)
- LAPACK
- FFT
For Both CPU and GPU Verbose
Enabling or disabling the Verbose mode with the function call takes precedence over the environment setting. For a full description of the mkl_verbose function, see either the Intel® oneAPI Math Kernel Library Developer Reference for C or the Intel® oneAPI Math Kernel Library Developer Reference for Fortran. Both references are available in the Intel® Software Documentation Library.
The performance of an application may degrade with the Verbose mode enabled, especially when the number of calls to verbose-enabled functions is large, because every call to a verbose-enabled function requires an output operation.
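One way to gauge this overhead is to time the same run with Verbose disabled and enabled. The sketch below does that from Python by setting MKL_VERBOSE in the child's environment. The helper names `timed_run` and `verbose_overhead` are hypothetical, and a wall-clock comparison of this kind is only a rough indication, because it also includes process start-up and ordinary run-to-run variation.

```python
import os
import subprocess
import time


def timed_run(command, mode):
    """Return the wall-clock seconds taken by `command` with MKL_VERBOSE=mode."""
    env = dict(os.environ, MKL_VERBOSE=str(mode))
    start = time.perf_counter()
    # Discard the output: only the elapsed time matters here, although
    # the application still pays for producing each Verbose line.
    subprocess.run(command, env=env, stdout=subprocess.DEVNULL, check=True)
    return time.perf_counter() - start


def verbose_overhead(command, mode=1, repeats=3):
    """Compare the best-of-`repeats` run time with Verbose off and on.

    Returns (seconds_disabled, seconds_enabled).
    """
    disabled = min(timed_run(command, 0) for _ in range(repeats))
    enabled = min(timed_run(command, mode) for _ in range(repeats))
    return disabled, enabled
```

Calling `verbose_overhead(["./my_app"])` on a hypothetical `./my_app` binary returns the two timings. Because the output is sent to a null device, the result is likely to understate the cost of printing to a terminal or a file.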