A customer application written in Intel® DPC++ is expected to run on a GPU device on the Windows* platform, but the log shows it running on the host because no GPU is detected.
The final binary does not contain the GPU kernel that must be offloaded to the GPU, because the linker used is the Windows* “xilink.” A DPC++ linker is needed to link the device kernel into the binary.
This application is built with CMake, using command-line options and the “NMake Makefiles” generator. Since the DPC++ linker (dpcpp-cl) is a driver, it cannot simply be dropped in as a replacement for the linker. Passing a linker option to CMake or editing CMakeCache.txt does not achieve the effect either.
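Before changing anything, it can help to confirm which tools CMake has actually selected. The fragment below is a small diagnostic sketch, assuming it is added to the project's CMakeLists.txt after the project() call; the variables are standard CMake variables, but where to place the fragment in this particular application is an assumption.

```cmake
# Diagnostic sketch (hypothetical placement: after project()).
# Prints the compiler, the linker, and the rule CMake uses to link executables.
message(STATUS "C++ compiler : ${CMAKE_CXX_COMPILER}")
message(STATUS "Linker       : ${CMAKE_LINKER}")
message(STATUS "Link rule    : ${CMAKE_CXX_LINK_EXECUTABLE}")
```

If the reported linker is xilink.exe or lld-link.exe rather than the DPC++ driver, the link step bypasses the driver even though the compile step does not.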
An experiment was done by manually linking some of the intermediate object files. CMake erases these files when it finishes building, so the experiment requires retaining them.
--debug-trycompile is the CMake option that preserves the temporary and intermediate files created by its try-compile checks.
cmake --debug-trycompile .. -G"NMake Makefiles" -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=dpcpp -DOpenMP_CXX_FLAGS=-openmp -DOpenMP_CXX_LIB_NAMES=libiomp5 -DOpenMP_libiomp5_LIBRARY="C:\Program Files (x86)\Intel\oneAPI\compiler\2021.3.0\windows\compiler\lib\intel64_win\libiomp5md.lib" -DGMX_GPU=SYCL -DGMX_BUILD_OWN_FFTW=OFF -DFFTWF_LIBRARY='../fftw-3.3.5-dll64/libfftw3f-3.lib' -DFFTWF_INCLUDE_DIR='../fftw-3.3.5-dll64'
Many intermediate files can now be found under CMakeFiles/CMakeTmp.
The goal is to link with dpcpp-cl. There are many intermediate library/exe/pdb/ilk files; cmTC_898c0 is picked for this experiment. These file names are randomly generated each time “cmake” is invoked.
"C:\Program Files (x86)\Intel\oneAPI\compiler\latest\windows\bin\dpcpp-cl.exe" /nologo -o C:\Users\ayu1\source\repos\gromacs-2021-sycl\build_sycl\CMakeFiles/CMakeTmp/cmTC_898c0.exe kernel32.lib user32.lib gdi32.lib winspool.lib shell32.lib ole32.lib oleaut32.lib uuid.lib comdlg32.lib advapi32.lib /link /pdb:C:\Users\ayu1\source\repos\gromacs-2021-sycl\build_sycl\CMakeFiles\CMakeTmp\cmTC_898c0.pdb /version:0.0 /machine:x64 /debug /INCREMENTAL /subsystem:console /MANIFEST /MANIFESTFILE:C:\Users\ayu1\source\repos\gromacs-2021-sycl\build_sycl\CMakeFiles\CMakeTmp\CMakeFiles\cmTC_898c0.dir/intermediate.manifest CMakeFiles\CMakeTmp\CMakeFiles\cmTC_898c0.dir/manifest.res @C:\Users\ayu1\source\repos\gromacs-2021-sycl\build_sycl\CMakeFiles\CMakeTmp\CMakeFiles\cmTC_898c0.dir\objects1.rsp
The paths must also be correct; for example, the path written in cmTC_898c0.dir\objects1.rsp was modified to reflect where the CheckIncludeFile.c.obj file actually is.
The link succeeds. The issue is therefore surely not a compiler or toolchain problem; otherwise the program would not have compiled and linked successfully.
One might wonder why the compiler is specified as "dpcpp" but the linker is xilink. By default CMake uses lld-link.exe as the linker, but for this particular application the build file explicitly sets the linker to xilink.exe. It is also worthwhile to understand a known CMake issue stated in this Developer Guide:
In the "Windows" section, it says:
Make/Ninja Generators: The default behavior with CMake, when you use Ninja or CMake generators, does not automatically link DPC++ applications. CMake attempts to link with lld-link.exe, which correctly links an application without errors (if you are explicitly linking to SYCL*), but the application does not run SYCL kernels correctly. To link and run without errors, you must use dpcpp.exe.
Thus, CMake effectively treats compilation and linking as two separate steps, each with its own tool; the CMake integration team should work to close this gap.
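One way a project might work around this from its own side is to override CMake's link rule so that the compiler driver performs the link. The fragment below is a minimal sketch, not a tested fix for this application. It assumes that CMAKE_CXX_COMPILER points at dpcpp-cl, that the fragment is placed after the project() call so the platform defaults do not overwrite it, and that linker-only flags can be forwarded after /link, loosely following the manual command above. CMAKE_CXX_LINK_EXECUTABLE and the angle-bracket placeholders are standard CMake rule variables; whether this exact rule suits a given project needs to be verified.

```cmake
# Sketch only: let the DPC++ driver perform the final link on Windows.
# Assumptions: CMAKE_CXX_COMPILER is dpcpp-cl, and this runs after project().
if(WIN32)
  # Object files and libraries go to the driver so it can link device code;
  # everything after /link is forwarded to the underlying linker.
  set(CMAKE_CXX_LINK_EXECUTABLE
      "<CMAKE_CXX_COMPILER> /nologo <OBJECTS> -o <TARGET> <LINK_LIBRARIES> /link <CMAKE_CXX_LINK_FLAGS> <LINK_FLAGS>")
endif()
```

The design choice here is to keep linker options behind /link, since the driver accepts only driver options directly.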
- Use “--debug-trycompile” as a CMake option to preserve temporary/intermediate files.
- Code compiled with the DPC++ compiler should be linked with the dpcpp linker, but CMake currently breaks this rule.
- dpcpp-cl is a driver and only takes driver options.
- Windows*’ “xilink” is an Intel® wrapper around the original Windows*’ “link” linker that enables interprocedural optimization (IPO), but it doesn’t link device kernels into the fat binary.
- Only the dpcpp linker can link device kernels.