Set Up the Intercept Layer for OpenCL* Applications
- Download Intercept Layer for OpenCL Applications version 2.2.1 or later from GitHub* at the following URL:
- Build the Intercept Layer according to the instructions provided in How to Build the Intercept Layer for OpenCL* Applications.
- Ensure that you have set `ENABLE_CLILOADER=1` when running the `cmake` command. For example, run `cmake -DENABLE_CLILOADER=1 ..`.
- Run the `make` command in the build directory. This step builds the `cliloader` loader utility. The `cliloader` executable should now exist in the `<path to opencl-intercept-layer-master download>/<build dir>/cliloader/` directory.
- Add this directory to your `PATH` environment variable if you want to run multiple designs using `cliloader`. You can now pass your executables to `cliloader` to run them with the intercept layer. For details about the `cliloader` loader utility, see cliloader: A Intercept Layer for OpenCL* Applications Loader.
- Set `cliloader` and other Intercept Layer options. If you run multiple designs with the same options, set up a `clintercept.conf` file in your home directory. You can also set the options as environment variables by prefixing the option name with `CLI_`. For example, the `DllName` option can be set through the `CLI_DllName` environment variable. For a list of options, see Controls in How to Use the Intercept Layer for OpenCL Applications.

| Option/Variable | Description |
| --- | --- |
| `DllName=$CMPLR_ROOT/linux/lib/libOpenCL.so` | The intercept layer must know where the `libOpenCL.so` file from the original oneAPI build is located. |
| `DevicePerformanceTiming=1` and `DevicePerformanceTimelineLogging=1` | These options print runtime timeline information in the output of the executable run. |
| `ChromePerformanceTiming=1`, `ChromeCallLogging=1`, `ChromePerformanceTimingInStages=1` | These variables set up the Chrome tracer output and ensure the output has Queued, Submitted, and Execution stages. |
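The `CLI_` prefix rule for environment variables can be sketched in Python as follows. This is a minimal illustration and not part of the Intercept Layer itself: the helper name `cli_env` is hypothetical, and the `DllName` path assumes that `CMPLR_ROOT` has been set by the oneAPI environment (the literal placeholder is kept if it has not).

```python
import os


def cli_env(options):
    """Map Intercept Layer option names to CLI_-prefixed environment variables."""
    return {f"CLI_{name}": str(value) for name, value in options.items()}


# Options taken from the list above. CMPLR_ROOT is assumed to come from the
# oneAPI environment; the unexpanded placeholder is used when it is missing.
options = {
    "DllName": os.path.join(
        os.environ.get("CMPLR_ROOT", "$CMPLR_ROOT"), "linux/lib/libOpenCL.so"
    ),
    "DevicePerformanceTiming": 1,
    "DevicePerformanceTimelineLogging": 1,
}

# Merge the prefixed options into a copy of the current environment.
env = {**os.environ, **cli_env(options)}
for name in sorted(cli_env(options)):
    print(f"{name}={env[name]}")
```

The resulting `env` mapping could then be handed to `subprocess.run(..., env=env)` when launching an executable through `cliloader`. The exact command line depends on your design and is not shown here.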
Device Timeline for clEnqueueWriteBuffer (enqueue 1) = 63267241140401 ns (queued), 63267241149579 ns (submit), 63267241194205 ns (start), 63267242905519 ns (end)
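As a rough sketch, the time spent in each stage can be derived from a timeline line like the one above by subtracting consecutive timestamps. The function name `stage_durations` and the regular expression are assumptions written against this single sample line. Treating the gaps between timestamps as the Queued, Submitted, and Execution stage durations is an interpretation, not something the log states directly.

```python
import re

# Pattern written against the sample line above; other log lines may differ.
TIMELINE_RE = re.compile(
    r"Device Timeline for (?P<call>\w+) \(enqueue (?P<enqueue>\d+)\) = "
    r"(?P<queued>\d+) ns \(queued\), (?P<submit>\d+) ns \(submit\), "
    r"(?P<start>\d+) ns \(start\), (?P<end>\d+) ns \(end\)"
)


def stage_durations(line):
    """Return per-stage durations in nanoseconds, or None if the line does not match."""
    m = TIMELINE_RE.search(line)
    if m is None:
        return None
    t = {k: int(m.group(k)) for k in ("queued", "submit", "start", "end")}
    return {
        "call": m.group("call"),
        "enqueue": int(m.group("enqueue")),
        "queued_ns": t["submit"] - t["queued"],    # waiting in the queue
        "submitted_ns": t["start"] - t["submit"],  # submitted, not yet started
        "execution_ns": t["end"] - t["start"],     # running on the device
    }


line = (
    "Device Timeline for clEnqueueWriteBuffer (enqueue 1) = "
    "63267241140401 ns (queued), 63267241149579 ns (submit), "
    "63267241194205 ns (start), 63267242905519 ns (end)"
)
print(stage_durations(line))
```

For the sample line, this gives 9178 ns queued, 44626 ns submitted, and 1711314 ns (about 1.7 ms) executing.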
- Numbers that contain a decimal point.
- The part of the number before the decimal point orders the calls approximately by start time.
- The part of the number after the decimal point represents the queue number the call was made in.
- Numbers that do not contain a decimal point. These numbers are the operating system thread IDs of the threads on which the calls run.
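The two number forms described above can be told apart with a small helper. This is a sketch under assumptions: the function name `parse_row_label` is hypothetical, the example values are made up, and the numbers are assumed to be available as plain strings.

```python
def parse_row_label(label):
    """Classify a number as a queue entry (has a decimal point) or a thread ID."""
    text = label.strip()
    if "." in text:
        # Split on the text rather than converting to float, so that a
        # queue number such as "10" is not confused with "1".
        order, queue = text.split(".", 1)
        return {"kind": "queue", "order": int(order), "queue": int(queue)}
    return {"kind": "thread", "thread_id": int(text)}


# Hypothetical example values, for illustration only.
print(parse_row_label("3.1"))
print(parse_row_label("140213"))
```

Keeping the value as a string until after the split avoids floating-point parsing, which would treat `3.1` and `3.10` as the same number even though they would refer to different queues.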
- Blue during the queued stage.
- Yellow during the submitted stage.
- Orange for the execution stage.