User Guide

  • 2021.4
  • 09/27/2021
  • Public Content

Analyze Offloaded Code

Use Intel® Inspector to analyze applications containing offloaded code and detect code issues before kernels start executing on the GPU.
Correctness analysis is more complicated when offloading Data Parallel C++ (DPC++), OpenMP* and Fortran code to an accelerator. Intel® Inspector introduces the early interception approach that enables you to intercept some problems before the kernel executes on a target device.

Prepare Your Application

Intel® Inspector performs dynamic analysis of generated or linked code and inspects third-party libraries, even when source code is not available. Before running the analysis, configure your application to run on the CPU. You can do that in the following ways:
  • Set Intel® Inspector to automatically configure your application for analysis:
    • In the graphical user interface (GUI), open the Project Properties dialog box and select the CPU option in the Offload Target drop-down menu.
    • In the command line interface (CLI), use the -knob offload-target=cpu option.
  • Configure your application to execute kernels on CPU manually:
    • For DPC++ applications (for details, see SYCL Environment Variables), run the following commands:
      export SYCL_DEVICE_FILTER="opencl:cpu"
      export SYCL_DEVICE_TYPE=CPU
    • For OpenMP kernels, run the following commands:
      export OMP_TARGET_OFFLOAD=MANDATORY
      export LIBOMPTARGET_DEVICETYPE=cpu
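The manual configuration above can be wrapped in a small shell helper so that the right variables are set per programming model. This is a minimal sketch: the function name configure_cpu_offload and its dpcpp/openmp arguments are hypothetical and not part of Intel® Inspector; only the exported variables and values come from the commands above.

```shell
# Hypothetical helper (not provided by Intel Inspector): exports the
# environment variables listed above for the chosen programming model.
configure_cpu_offload() {
    case "$1" in
        dpcpp)
            # DPC++ / SYCL kernels: run on the OpenCL CPU device
            export SYCL_DEVICE_FILTER="opencl:cpu"
            export SYCL_DEVICE_TYPE=CPU
            ;;
        openmp)
            # OpenMP offload kernels: run on the CPU device
            export OMP_TARGET_OFFLOAD=MANDATORY
            export LIBOMPTARGET_DEVICETYPE=cpu
            ;;
        *)
            echo "usage: configure_cpu_offload dpcpp|openmp" >&2
            return 1
            ;;
    esac
}

# Example: prepare the current shell for a DPC++ application.
configure_cpu_offload dpcpp
echo "SYCL_DEVICE_FILTER=$SYCL_DEVICE_FILTER"
```

Define the function in the same shell session that later launches the analysis, so that the exported variables are inherited by the application.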
Verify that your application executes the code you want to analyze without crashes or exceptions.
Make sure to use an automatic device selector (sycl::default_selector or sycl::cpu_selector). If you specify the device selector explicitly, your code executes on the selected device. For details about device selector usage, see oneAPI Programming Guide: Device Selection.

Run Correctness Analysis

Intel® Inspector enables you to run three predefined types of both Memory and Threading analyses. The higher the level you select, the more memory and execution overhead the analysis incurs.
Select the minimal working set for your application to keep execution time reasonable.
Run Analysis Using GUI
To run memory or threading analysis using the GUI:
  1. Set up the environment
  2. Launch Intel® Inspector using the inspxe-gui command
  3. Create a project and specify the target application
  4. In the Project Properties dialog box, select the CPU option in the Offload Target drop-down menu
  5. Choose Memory Error analysis or Threading Error analysis in the drop-down menu in the top left corner of the screen
  6. Select a predefined analysis type using the slider
  7. Click the Start button
Run Analysis Using CLI
To run memory or threading analysis for applications with offloaded code using CLI, run the following command:
inspxe-cl -collect <analysis_type> -knob offload-target=cpu -- <MyApp> [app_args]
where the <analysis_type> value specifies the predefined analysis type you want to execute. Available values:
  • mi1, mi2, and mi3 for Memory Error analysis
  • ti1, ti2, and ti3 for Threading Error analysis
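As an illustration of the command form above, the following sketch checks the analysis type against the listed values before launching the collection. The wrapper name run_offload_analysis and the ./my_app binary with its arguments are hypothetical; the inspxe-cl invocation itself follows the syntax shown above. The wrapper prints the command and runs it only when inspxe-cl is available on PATH.

```shell
# Hypothetical wrapper (not part of Intel Inspector): validates the analysis
# type, prints the resulting command, and runs it if inspxe-cl is installed.
run_offload_analysis() {
    analysis_type="$1"
    shift
    case "$analysis_type" in
        mi1|mi2|mi3|ti1|ti2|ti3) ;;   # values listed above
        *)
            echo "unknown analysis type: $analysis_type" >&2
            return 2
            ;;
    esac
    echo "inspxe-cl -collect $analysis_type -knob offload-target=cpu -- $*"
    if command -v inspxe-cl >/dev/null 2>&1; then
        inspxe-cl -collect "$analysis_type" -knob offload-target=cpu -- "$@"
    fi
}

# Example: lowest-level Memory Error analysis on a hypothetical application.
run_offload_analysis mi1 ./my_app --size 1024
```

Starting with mi1 or ti1 and moving to a higher level only when needed keeps the overhead low while you confirm that the application runs correctly under analysis.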
To export results into a my_problems.xml file in your current working directory, run the following command:
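A sketch of such an export is shown here. It assumes the -report, -format, and -report-output options of inspxe-cl and that the report is generated from the most recent result in the current directory; these details are assumptions, so confirm the exact syntax for your version with inspxe-cl -help.

```shell
# Assumed report options; verify with `inspxe-cl -help` before relying on them.
report_file="./my_problems.xml"
if command -v inspxe-cl >/dev/null 2>&1; then
    # Write the detected problems as XML to the current working directory
    inspxe-cl -report problems -format xml -report-output "$report_file"
else
    echo "inspxe-cl not found; set up the Intel Inspector environment first" >&2
fi
```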
For details about the inspxe-cl command line interface, see inspxe-cl Actions, Options and Arguments or run the inspxe-cl -help command.

Explore Results

Open the collected results in the GUI and analyze the detected errors. For offloaded code, the list of problems that Intel® Inspector can detect in addition to CPU problems is as follows:
Intel® Inspector enables you to open your source code and highlight problematic code lines.
View examples for a memory problem and a threading problem.
