DPC++ Device Selection in the Host Code
Host code can explicitly select a device type. To select a device, create a queue and initialize its device with one of the following selectors:
If default_selector is used, the kernel runs based on a heuristic that chooses from available compute devices (all, or a subset based on the value of the SYCL_DEVICE_FILTER environment variable).
If a specific device type selector (such as cpu_selector or gpu_selector) is used, the specified device type must be available on the platform and must not be excluded by the filter set in SYCL_DEVICE_FILTER. If no such device is available, the runtime throws an exception indicating that the requested device is not available. This can happen, for example, when an ahead-of-time (AOT) compiled binary is run on a platform that does not contain the specified device type.
While DPC++ applications can run on any supported target hardware, tuning is required to get the best performance on a given target architecture. For example, code tuned for a CPU will likely not run as fast on a GPU accelerator without modification.
SYCL_DEVICE_FILTER is an environment variable that limits the runtimes, compute device types, and compute device IDs that the DPC++ runtime may use to a subset of all available combinations. The compute device IDs correspond to those returned by the SYCL API, clinfo, or sycl-ls (with the numbering starting at 0). An ID by itself says nothing about whether the device is of a certain type or supports a specific runtime. Using a device-type selector (such as gpu_selector) to request a filtered-out device causes an exception to be thrown. Refer to the environment variable description on GitHub for details on use and example values: https://github.com/intel/llvm/blob/sycl/sycl/doc/EnvironmentVariables.md.
The sycl-ls tool lists the devices available in the system. It is strongly recommended to run this tool before running any SYCL or DPC++ program to make sure the system is configured properly. As part of the enumeration, sycl-ls prints the SYCL_DEVICE_FILTER string as a prefix of each device listing. The format of the sycl-ls output is [SYCL_DEVICE_FILTER] Platform_name, Device_name, Device_version [driver_version]. In the following example, the string in square brackets ([ ]) at the beginning of each line is the SYCL_DEVICE_FILTER string that designates the specific device on which the program will run.
[opencl:acc:0] Intel® FPGA Emulation Platform for OpenCL™, Intel® FPGA Emulation Device 1.2 [2021.12.9.0.24_005321]
[opencl:gpu:1] Intel® OpenCL HD Graphics, Intel® UHD Graphics 630 [0x3e92] 3.0 [21.37.20939]
[opencl:cpu:2] Intel® OpenCL, Intel® Core™ i7-8700 CPU @ 3.20GHz 3.0 [2021.12.9.0.24_005321]
[level_zero:gpu:0] Intel® Level-Zero, Intel® UHD Graphics 630 [0x3e92] 1.1 [1.2.20939]
[host:host:0] SYCL host platform, SYCL host device 1.2 [1.2]
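To illustrate the prefix format, the following C++ sketch splits the bracketed string at the start of a sycl-ls line into its runtime (backend), device type, and device ID. The DeviceFilter struct and the parse_sycl_ls_prefix function are hypothetical names introduced for this example; they are not part of SYCL or sycl-ls. The sketch assumes the prefix always has exactly three colon-separated fields, as in the listing above.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper type: the three fields of a sycl-ls prefix.
struct DeviceFilter {
    std::string backend;  // e.g. "opencl", "level_zero", "host"
    std::string type;     // e.g. "cpu", "gpu", "acc", "host"
    int id = -1;          // device ID, numbering starts at 0
};

// Parses the leading "[backend:type:id]" of one sycl-ls output line.
// Returns false and leaves `out` untouched if the prefix is malformed.
bool parse_sycl_ls_prefix(const std::string& line, DeviceFilter& out) {
    if (line.empty() || line[0] != '[') return false;
    const std::size_t close = line.find(']');
    if (close == std::string::npos) return false;

    // Text between the brackets, e.g. "opencl:gpu:1".
    const std::string triple = line.substr(1, close - 1);
    const std::size_t first = triple.find(':');
    if (first == std::string::npos || first == 0) return false;
    const std::size_t second = triple.find(':', first + 1);
    if (second == std::string::npos || second == first + 1) return false;

    // The ID must be a short run of decimal digits.
    const std::string id_text = triple.substr(second + 1);
    if (id_text.empty() || id_text.size() > 6) return false;
    int id = 0;
    for (char c : id_text) {
        if (c < '0' || c > '9') return false;
        id = id * 10 + (c - '0');
    }

    out.backend = triple.substr(0, first);
    out.type = triple.substr(first + 1, second - first - 1);
    out.id = id;
    return true;
}
```

Calling parse_sycl_ls_prefix on the second line of the listing would fill the struct with the backend opencl, the type gpu, and the ID 1. The rest of the line (platform name, device name, versions) is ignored by this sketch.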
Additional information about device selection is available from the DPC++ Language Guide and API Reference.
OpenMP* Device Query and Selection in the Host Code
OpenMP provides a set of API routines for querying the available devices and setting the device on which code runs. Host code can explicitly select and set a device number. For each offloading region, a programmer can also use a device clause to specify the target device on which that region executes.
int omp_get_num_procs(void) routine returns the number of processors available to the device.
void omp_set_default_device(int device_num) routine controls the default target device.
int omp_get_default_device(void) routine returns the default target device.
int omp_get_num_devices(void) routine returns the number of non-host devices available for offloading code or data.
int omp_get_device_num(void) routine returns the device number of the device on which the calling thread is executing.
int omp_is_initial_device(void) routine returns true if the current task is executing on the host device; otherwise, it returns false.
int omp_get_initial_device(void) routine returns a device number that represents the host device.
A programmer can use the environment variable LIBOMPTARGET_DEVICETYPE = [ CPU | GPU ] to select a device type. If a specific device type such as CPU or GPU is specified, that device type is expected to be available on the platform. If no such device is available and the environment variable OMP_TARGET_OFFLOAD is set to mandatory, the runtime reports an error that the requested device type is not available; otherwise, execution falls back to the initial (host) device. Additional information about device selection is available from the OpenMP 5.1 specification. Details about environment variables are available from GitHub: https://github.com/intel/llvm/blob/sycl/sycl/doc/EnvironmentVariables.md.
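The query routines, the device clause, and the host fallback described above can be sketched in C++ as follows. This is a minimal illustration, not a reference implementation: OffloadReport and run_target_region are hypothetical names, and the stubs in the #else branch are stand-ins added only so that the sketch also builds with a compiler that has OpenMP disabled, in which case it models a host-only system. Real offloading assumes a compiler and runtime with OpenMP target support.

```cpp
#ifdef _OPENMP
#include <omp.h>
#else
// Assumption: stand-in stubs for builds without OpenMP support. They model
// a host-only system and are not part of any OpenMP runtime.
static int omp_get_num_devices() { return 0; }
static int omp_get_default_device() { return 0; }
static void omp_set_default_device(int) {}
static int omp_is_initial_device() { return 1; }
#endif

// Hypothetical result type for this sketch.
struct OffloadReport {
    int num_devices;     // non-host devices available for offloading
    int default_device;  // device number used when no device clause is given
    bool ran_on_host;    // true if the target region executed on the host
};

OffloadReport run_target_region() {
    OffloadReport report;
    report.num_devices = omp_get_num_devices();

    // Make the first non-host device the default, if there is one.
    if (report.num_devices > 0) omp_set_default_device(0);
    report.default_device = omp_get_default_device();

    // Scalars are firstprivate in a target region by default, so the flag
    // is mapped back to the host explicitly.
    int on_host = 1;
    #pragma omp target device(report.default_device) map(from: on_host)
    {
        // Nonzero only when this region is executing on the host device.
        on_host = omp_is_initial_device();
    }
    report.ran_on_host = (on_host != 0);
    return report;
}
```

When no offload device is available and OMP_TARGET_OFFLOAD is not set to mandatory, the region runs on the host and ran_on_host is true. Checking this flag is one simple way to notice a silent fallback during development.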