OpenMP* Support Libraries
The Intel® oneAPI DPC++/C++ Compiler provides support libraries for OpenMP*. There are two kinds of libraries:
Performance: supports parallel OpenMP execution.
Stubs: supports serial execution of OpenMP applications.
Each kind of library is available for both dynamic and static linking on Linux* operating systems. Only dynamic linking is supported on Windows* operating systems.
Performance Libraries
To use these libraries, specify the /Qopenmp (Windows*) or -qopenmp (Linux*) option.
Options that use OpenMP are available for both Intel® and non-Intel microprocessors, but these options may perform more optimizations on Intel® microprocessors than on non-Intel microprocessors. The major user-visible OpenMP constructs and features that may perform differently on Intel® microprocessors than on non-Intel microprocessors include: locks (internal and user visible), the SINGLE construct, barriers (explicit and implicit), parallel loop scheduling, reductions, memory allocation, and thread affinity and binding.
Operating System | Dynamic Link | Static Link
---|---|---
Linux | libiomp5.so | libiomp5.a
Windows | libiomp5md.lib | None
Many routines in the OpenMP support libraries are more optimized for Intel® microprocessors than for non-Intel microprocessors.
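As a minimal sketch of code that would use the performance library, the function below sums an array with an OpenMP reduction. The function name `sum_array` and the file name `sum.c` are hypothetical and chosen only for illustration. Assuming the `icx` driver, the file would be built with `icx -qopenmp sum.c` on Linux or `icx /Qopenmp sum.c` on Windows.

```c
#include <stddef.h>

/* Hypothetical helper: sums an array with an OpenMP reduction.
   When OpenMP is enabled (-qopenmp or /Qopenmp), the loop iterations are
   divided among threads and each thread's partial sum is combined at the end.
   When OpenMP is not enabled, the pragma is ignored and the loop runs
   serially. */
double sum_array(const double *values, size_t count)
{
    double total = 0.0;
    long i; /* signed loop index, accepted by all OpenMP versions */

    #pragma omp parallel for reduction(+:total)
    for (i = 0; i < (long)count; ++i) {
        total += values[i];
    }
    return total;
}
```

Because a reduction may add the partial sums in a different order than a serial loop, floating-point results can differ in the last bits between parallel and serial runs.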
Stubs Libraries
To use these libraries, specify the /Qopenmp-stubs (Windows*) or -qopenmp-stubs (Linux*) option. These libraries allow you to compile OpenMP applications in serial mode and provide stubs for OpenMP routines and extended Intel-specific routines.
Operating System | Dynamic Link | Static Link
---|---|---
Linux | libiompstubs5.so | libiompstubs5.a
Windows | libiompstubs5md.lib | None
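
One way to observe serial execution is to count how many distinct threads run a loop. The sketch below is illustrative only: `count_worker_threads` and `MAX_TRACKED_THREADS` are hypothetical names, and the `#ifdef _OPENMP` fallback is an assumption added so that the code also builds with no OpenMP option at all. In a serial build, such as one made with the stubs option, the function would be expected to return 1 for any positive iteration count.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

#define MAX_TRACKED_THREADS 256 /* hypothetical upper bound for this sketch */

/* Hypothetical helper: returns how many distinct threads executed at least
   one iteration of the loop. Each thread writes only its own slot of seen[],
   so no two threads write the same element. */
int count_worker_threads(int iterations)
{
    int seen[MAX_TRACKED_THREADS] = {0};
    int distinct = 0;
    int i;

    #pragma omp parallel for
    for (i = 0; i < iterations; ++i) {
        int id = 0; /* serial fallback: everything runs on thread 0 */
#ifdef _OPENMP
        id = omp_get_thread_num();
#endif
        if (id < MAX_TRACKED_THREADS) {
            seen[id] = 1;
        }
    }

    for (i = 0; i < MAX_TRACKED_THREADS; ++i) {
        distinct += seen[i];
    }
    return distinct;
}
```

Built with the performance library instead, the same function may return a value greater than 1, depending on how many threads the runtime uses.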
Execution Modes
The compiler enables you to run an application under different execution modes specified at runtime; the libraries support the turnaround, throughput, and serial modes. Use the KMP_LIBRARY environment variable to select the mode at runtime.
Mode | Description
---|---
throughput (default) | The throughput mode allows the program to yield to other running programs and adjust resource usage to produce efficient execution in a dynamic environment. In a multi-user environment where the load on the parallel machine is not constant or where the job stream is not predictable, it may be better to design and tune for throughput. This minimizes the total time to run multiple jobs simultaneously. In this mode, the worker threads yield to other threads while waiting for more parallel work. After completing the execution of a parallel region, threads wait for new parallel work to become available. After a certain period of time has elapsed, they stop waiting and sleep. Until more parallel work becomes available, sleeping allows processors and other resources to be used for other work by non-OpenMP threaded code that may execute between parallel regions, or by other applications. The amount of time to wait before sleeping is set either by the KMP_BLOCKTIME environment variable or by the kmp_set_blocktime() function. A small blocktime value may offer better overall performance if your application contains non-OpenMP threaded code that executes between parallel regions. A larger blocktime value may be more appropriate if threads are to be reserved solely for OpenMP execution, but may penalize other concurrently running OpenMP or threaded applications.
turnaround | The turnaround mode is designed to keep active all processors involved in the parallel computation, which minimizes execution time of a single job. In this mode, the worker threads actively wait for more parallel work, without yielding to other threads (although they are still subject to KMP_BLOCKTIME control). In a dedicated (batch or single user) parallel environment where all processors are exclusively allocated to the program for its entire run, it is most important to effectively use all processors all of the time. NOTE: Avoid over-allocating system resources. Over-allocation can occur if too many threads have been specified or if too few processors are available at runtime. If system resources are over-allocated, this mode causes poor performance; use the throughput mode instead.
serial | The serial mode forces parallel applications to run as a single thread.
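
The sketch below shows one way an application might report which mode was requested and set the blocktime from code. All names (`exec_mode`, `parse_exec_mode`, `requested_exec_mode`, `apply_blocktime`) are hypothetical. The string matching is an assumption made for logging purposes and is not a description of how the runtime itself parses KMP_LIBRARY. The call to `kmp_set_blocktime()` is guarded on the assumption that this Intel-specific routine is declared in `omp.h` only when building with the Intel compiler and OpenMP enabled.

```c
#include <stdlib.h>
#include <string.h>

#if defined(__INTEL_LLVM_COMPILER) && defined(_OPENMP)
#include <omp.h> /* assumed to declare kmp_set_blocktime() in this build */
#endif

typedef enum {
    MODE_THROUGHPUT,
    MODE_TURNAROUND,
    MODE_SERIAL,
    MODE_UNKNOWN
} exec_mode;

/* Maps a KMP_LIBRARY value to a mode for reporting purposes.
   NULL (variable not set) maps to throughput, the documented default. */
exec_mode parse_exec_mode(const char *value)
{
    if (value == NULL || strcmp(value, "throughput") == 0) {
        return MODE_THROUGHPUT;
    }
    if (strcmp(value, "turnaround") == 0) {
        return MODE_TURNAROUND;
    }
    if (strcmp(value, "serial") == 0) {
        return MODE_SERIAL;
    }
    return MODE_UNKNOWN;
}

/* Reads KMP_LIBRARY from the environment of the current process. */
exec_mode requested_exec_mode(void)
{
    return parse_exec_mode(getenv("KMP_LIBRARY"));
}

/* Passes the blocktime to the runtime where the Intel extension is available.
   Returns 1 if the call was made, 0 if this build has no such routine. */
int apply_blocktime(int blocktime)
{
#if defined(__INTEL_LLVM_COMPILER) && defined(_OPENMP)
    kmp_set_blocktime(blocktime);
    return 1;
#else
    (void)blocktime; /* unused in builds without the extension */
    return 0;
#endif
}
```

For example, on Linux a program named `app` (a placeholder) could be started in turnaround mode with `KMP_LIBRARY=turnaround ./app`, and `requested_exec_mode()` would then return `MODE_TURNAROUND`.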