Kernel Overview

Developer Guide for Intel® SDK for OpenCL™ Applications 2017

ID 773042

Date 10/22/2018

Version

Public

Kernel Overview

The Kernel Overview page provides data that can help you optimize your kernel code.

This section includes the API Calls report, which shows every OpenCL kernel that was launched during program execution.

Kernels that differ in name, global work size, or local work size are treated as different kernels and are presented in separate rows.
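
The row-grouping rule above can be illustrated with a minimal Python sketch. This is not the Code Analyzer's implementation: the `Launch` record, the `group_into_rows` helper, and the sample kernel names and timings are all hypothetical and exist only to show how launches with the same name but different work sizes end up in separate rows.

```python
from collections import defaultdict
from typing import NamedTuple


class Launch(NamedTuple):
    """One kernel launch as a profiler might record it (hypothetical structure)."""
    name: str
    global_size: tuple
    local_size: tuple
    duration_us: float


def group_into_rows(launches):
    """Give each distinct (name, global size, local size) combination its own row."""
    rows = defaultdict(list)
    for launch in launches:
        key = (launch.name, launch.global_size, launch.local_size)
        rows[key].append(launch)
    return dict(rows)


# Made-up sample data: three "blur" launches, one of them with a different local size.
launches = [
    Launch("blur", (1024, 1024), (16, 16), 410.0),
    Launch("blur", (1024, 1024), (16, 16), 395.0),
    Launch("blur", (1024, 1024), (8, 8), 520.0),
    Launch("sobel", (1024, 1024), (16, 16), 230.0),
]

rows = group_into_rows(launches)
for (name, gws, lws), items in rows.items():
    print(f"{name} global={gws} local={lws}: {len(items)} launch(es)")
```

In this sketch the two `blur` launches with a 16x16 local size share a row, while the 8x8 launch gets a row of its own even though the kernel name is the same.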

Each row shows:

The total, minimum, maximum, and average kernel execution time.
EU Active - The normalized sum of all cycles on all cores spent actively executing instructions.
EU Stalled - The normalized sum of all cycles on all cores spent stalled: at least one thread is loaded, but the core is stalled for some reason.
GPU Memory Reads/Writes - Reads and writes between the GPU and the chip uncore (LLC) or main memory. These are all memory accesses that miss in the internal GPU L3 cache and are serviced from either the uncore or main memory.
L3 Cache Misses - All read and write misses in the GPU L3 cache.
Untyped Memory Reads/Writes - Memory accesses to buffers created with clCreateBuffer.
Typed Memory Reads/Writes - Memory accesses to typed buffers, for example, writes to images created with clCreateImage. Reads from images, however, are counted under Sampler accesses and Texture Read.
SLM Reads/Writes - Memory accesses to Shared Local Memory.
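
As a rough illustration of how the timing columns and the EU columns in a row could be derived, the following Python sketch computes them from made-up numbers. The function names and input values are hypothetical. In particular, the normalization used for EU Active and EU Stalled (summed per-core cycles divided by the summed total cycles) is an assumption, since the guide only describes these values as a "normalized sum".

```python
def timing_summary(durations_us):
    """Return total, minimum, maximum and average of a row's execution times."""
    total = sum(durations_us)
    return {
        "total": total,
        "min": min(durations_us),
        "max": max(durations_us),
        "avg": total / len(durations_us),
    }


def eu_fractions(active_cycles, stalled_cycles, total_cycles):
    """Turn per-core cycle counts into fractions of all available cycles.

    ASSUMPTION: the summed per-core counts are divided by the summed total
    cycles. The actual normalization used by the tool is not documented here.
    """
    available = sum(total_cycles)
    return sum(active_cycles) / available, sum(stalled_cycles) / available


# Made-up inputs: three executions of one kernel, counters from two cores.
summary = timing_summary([410.0, 395.0, 520.0])
active, stalled = eu_fractions(
    active_cycles=[600, 500],
    stalled_cycles=[300, 400],
    total_cycles=[1000, 1000],
)
print(summary)
print(f"EU Active {active:.0%}, EU Stalled {stalled:.0%}")
```

Under this assumed normalization, the active and stalled fractions need not add up to 100%; the remainder would correspond to cycles in which a core had no thread loaded.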

Click the + button to the left of any kernel name to expand its row. The expanded area presents additional information for each execution of the kernel during the program run, including the latency, return value, command queue, context, and timing data.
