OpenMP Offload Best Practices
This chapter presents best practices for improving the performance of
applications that offload computation to the GPU. The best practices
are organized into categories, each of which is described in the
sections that follow.
Note: The following configuration was used when collecting OpenMP
performance numbers:
- Internal versions of the Intel compilers and GPU driver
- GPU: ATS-P B0, 2-Tile
- Level Zero (L0) plugin
- Introduced a dummy target construct at the beginning of the program, so that startup time is not measured.
- Used Just-In-Time (JIT) compilation mode.
- Used 1-Tile only (no implicit or explicit scaling).
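The dummy target construct mentioned in the configuration above can be sketched roughly as follows. This is a minimal illustration, not the harness used to collect the numbers: the function names warm_up_device and measure_kernel, the array size N, and the timed kernel itself are all hypothetical. The idea is only that an empty target region runs before the timer starts, so that one-time device initialization is likely to be excluded from the measurement.

```c
#include <stdio.h>
#include <time.h>

#define N 1024  /* hypothetical problem size, chosen for illustration */

/* Hypothetical helper: an empty target region whose only purpose is to
   trigger one-time device and runtime initialization before timing. */
static void warm_up_device(void) {
    #pragma omp target
    { }
}

/* Wall-clock time in seconds, using the C11 timespec_get function. */
static double wall_seconds(void) {
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Hypothetical timed kernel: sums 2.0 * x[i] over an array of ones.
   Stores the sum in *result and returns the elapsed seconds. */
static double measure_kernel(double *result) {
    static double x[N];
    for (int i = 0; i < N; i++) x[i] = 1.0;

    warm_up_device();              /* startup cost is paid here, untimed */

    double sum = 0.0;
    double start = wall_seconds(); /* timing begins after the warm-up */
    #pragma omp target teams distribute parallel for \
        reduction(+:sum) map(to: x[0:N]) map(tofrom: sum)
    for (int i = 0; i < N; i++) {
        sum += 2.0 * x[i];
    }
    double elapsed = wall_seconds() - start;

    *result = sum;
    return elapsed;
}
```

A caller would invoke measure_kernel from main and print the returned time. If the file is compiled without OpenMP offload support, the pragmas are ignored and the loop runs on the host, which is convenient for checking correctness but says nothing about GPU performance.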