Terminology
In this chapter, OpenMP and SYCL terminology is used interchangeably to describe the partitioning of iterations of an offloaded parallel loop.
As described in the “SYCL Thread Hierarchy and Mapping” chapter, the iterations of a parallel loop (execution range) offloaded onto the GPU are divided into work-groups, sub-groups, and work-items. The ND-range represents the total execution range, which is divided into work-groups of equal size. A work-group is a 1-, 2-, or 3-dimensional set of work-items. Each work-group can be divided into sub-groups. A sub-group represents a short range of consecutive work-items that are processed together as a SIMD vector.
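The relationship between these levels can be seen in a basic SYCL kernel launch. The following is a minimal sketch (the global range of 1024, the work-group size of 64, and the kernel body are illustrative values only) showing how a work-item queries its position in the ND-range, its work-group, and its sub-group:

```cpp
#include <sycl/sycl.hpp>

int main() {
  sycl::queue q;

  // ND-range: 1024 work-items in total, divided into work-groups of 64.
  sycl::nd_range<1> range{sycl::range<1>{1024}, sycl::range<1>{64}};

  q.parallel_for(range, [=](sycl::nd_item<1> item) {
     // Index within the full execution range (ND-range).
     size_t global_id = item.get_global_id(0);
     // Index of this work-item's work-group, and index within it.
     size_t group_id  = item.get_group(0);
     size_t local_id  = item.get_local_id(0);
     // Sub-group: consecutive work-items executed together as a SIMD vector.
     auto   sg        = item.get_sub_group();
     size_t lane      = sg.get_local_id()[0];
     (void)global_id; (void)group_id; (void)local_id; (void)lane;
   }).wait();
}
```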
The following table shows how SYCL concepts map to OpenMP and CUDA concepts.
| SYCL | OpenMP | CUDA |
|---|---|---|
| Work-item | OpenMP thread or SIMD lane | CUDA thread |
| Work-group | Team | Thread block |
| Work-group size | Team size | Thread block size |
| Number of work-groups | Number of teams | Number of thread blocks |
| Sub-group | SIMD chunk (simdlen = 8, 16, or 32) | Warp (size = 32) |
| Maximum number of work-items per work-group | Thread limit | Maximum number of CUDA threads per thread block |
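As a rough illustration of this mapping, an OpenMP offload construct expresses the same hierarchy with teams (work-groups), threads (work-items), and SIMD lanes (sub-groups). The sketch below is only an example; the num_teams, thread_limit, and simdlen values are arbitrary and not recommendations:

```cpp
#include <omp.h>

void scale(float *x, int n) {
  // Teams correspond to SYCL work-groups, OpenMP threads to work-items,
  // and the simd clause to sub-group (SIMD lane) execution.
  #pragma omp target teams distribute parallel for simd \
          num_teams(64) thread_limit(256) simdlen(16) map(tofrom: x[0:n])
  for (int i = 0; i < n; ++i)
    x[i] *= 2.0f;
}
```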