OpenMP Execution Model
The OpenMP execution model has a single host device and can have multiple target devices. A device is a logical execution engine with its own local storage and data environment.
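As a minimal sketch of the host/target device distinction, the following C helper asks the OpenMP runtime how many target devices it can see. The function name `count_target_devices` is hypothetical and chosen only for illustration. `omp_get_num_devices()` is a standard OpenMP runtime routine. The `_OPENMP` guard is an assumption that the code may also be built without OpenMP support, in which case it reports zero devices.

```c
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: number of target (non-host) devices visible to the
 * OpenMP runtime. Returns 0 when the file is compiled without OpenMP. */
int count_target_devices(void) {
#ifdef _OPENMP
    return omp_get_num_devices();
#else
    return 0;
#endif
}

/* Hypothetical helper: print a one-line summary of the device count. */
void print_device_summary(void) {
    printf("target devices available: %d\n", count_target_devices());
}
```

A result of zero typically means that target regions fall back to executing on the host, unless the runtime is configured to treat that as an error.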
When executing on Arctic Sound or Ponte Vecchio, the entire GPU (which is composed of two tiles) can be considered as a device, or each tile can be considered as a device.
OpenMP starts executing on the host. When a host thread encounters a target construct, data is transferred from the host to the device (if specified by map clauses, for example), and code in the construct is offloaded onto the device. At the end of the target region, data is transferred from the device to the host (if so specified).
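The data movement described above can be sketched as follows. This is an illustrative example, not code from the guide: the function name `scale_on_device` and its parameters are hypothetical. The `map(tofrom: ...)` clause requests a copy of the array to the device on entry to the target region and a copy back to the host at its end.

```c
/* Hypothetical helper: multiply every element of a[0..n) by factor
 * inside a target region. */
void scale_on_device(float *a, int n, float factor) {
    /* map(tofrom: a[0:n]): copy a to the device at the start of the
     * region and back to the host at the end. The scalars n and factor
     * are firstprivate on the target construct by default. */
    #pragma omp target teams distribute parallel for map(tofrom: a[0:n])
    for (int i = 0; i < n; ++i) {
        a[i] *= factor;
    }
}
```

If no device is available, or the code is compiled without OpenMP offload support, the loop typically runs on the host and produces the same result.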
By default, the host thread that encounters the target construct waits for the target region to finish before proceeding. The nowait clause on a target construct specifies that the host thread does not need to wait for the target region to finish. In other words, the nowait clause allows the target region to execute asynchronously.
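A minimal sketch of asynchronous offload with nowait follows. The function name `offload_async` and the two arrays are hypothetical. Whether the host loop actually overlaps with the target region depends on the OpenMP runtime, so this shows the structure rather than a guaranteed speedup.

```c
/* Hypothetical helper: increment a[0..n) in a target region while the
 * host doubles b[0..n). */
void offload_async(float *a, float *b, int n) {
    /* nowait turns the target region into a deferred task, so the
     * encountering host thread may continue without waiting. */
    #pragma omp target teams distribute parallel for nowait map(tofrom: a[0:n])
    for (int i = 0; i < n; ++i) {
        a[i] += 1.0f;
    }

    /* Host work that does not touch a, so it can safely overlap. */
    for (int i = 0; i < n; ++i) {
        b[i] *= 2.0f;
    }

    /* Wait for the deferred target task before a is read again. */
    #pragma omp taskwait
}
```

The host loop deliberately works on a different array. Reading or writing `a` on the host before the `taskwait` would race with the target region.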
Synchronization between regions of code that execute asynchronously can be achieved with the taskwait directive, depend clauses, (implicit or explicit) barriers, or other synchronization mechanisms.
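As one possible illustration of these mechanisms, the sketch below orders two asynchronous target regions with depend clauses and then waits for both with taskwait. The function name `add_then_double` is hypothetical, and the choice of dependence types is an assumption made for this example.

```c
/* Hypothetical helper: compute a[i] = (a[i] + 1) * 2 using two
 * asynchronous target regions ordered by depend clauses. */
void add_then_double(float *a, int n) {
    /* First region: writes a, declared as an "out" dependence. */
    #pragma omp target teams distribute parallel for nowait \
        map(tofrom: a[0:n]) depend(out: a[0:n])
    for (int i = 0; i < n; ++i) {
        a[i] += 1.0f;
    }

    /* Second region: reads and writes a, so it uses "inout" and cannot
     * start until the first region has completed. */
    #pragma omp target teams distribute parallel for nowait \
        map(tofrom: a[0:n]) depend(inout: a[0:n])
    for (int i = 0; i < n; ++i) {
        a[i] *= 2.0f;
    }

    /* Host waits here for both deferred target tasks. */
    #pragma omp taskwait
}
```

The depend clauses order only the two target tasks relative to each other. The final `taskwait` is what makes the result safe to read on the host.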