Profile Heterogeneous Computing Performance with Intel® VTune™ Profiler

Overview

Programming heterogeneous platforms requires a deep understanding of the system architecture at every level, so that an application's design can take advantage of the best decomposition of data and work between CPUs and accelerators such as GPUs. In many cases, however, applications are ported from a conventional CPU programming language (such as C++) or from an accelerator-friendly but still low-level language (such as OpenCL™ code). The first problem is to determine which parts of the application benefit from being offloaded to a GPU. The second is to estimate how much performance can be gained from acceleration on a particular GPU device. Each platform has its own limitations that affect the performance of offloaded computing tasks, for example: data transfer cost, task initialization overhead, memory latency, and bandwidth limitations. To account for these constraints, software developers need tools that collect the right information and produce recommendations for making the best design and optimization decisions.
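To make the trade-off concrete, the constraints listed above can be combined into a rough back-of-the-envelope estimate. The Python sketch below is only an illustration under simplifying assumptions: it is not how Intel VTune Profiler computes anything, every name in it (`OffloadEstimate`, `worth_offloading`, the field names) is hypothetical, and it assumes that initialization, data transfer, and kernel execution happen one after another with no overlap.

```python
from dataclasses import dataclass


@dataclass
class OffloadEstimate:
    """Hypothetical inputs for one candidate code region (all assumed values)."""
    cpu_time_s: float          # time the region takes on the CPU
    gpu_kernel_time_s: float   # projected kernel execution time on the GPU
    bytes_transferred: float   # total host<->device traffic for the region
    link_bandwidth_bps: float  # effective transfer bandwidth in bytes/second
    launch_overhead_s: float   # task initialization cost per offload

    def transfer_time_s(self) -> float:
        # Data transfer cost: bytes moved divided by effective bandwidth.
        return self.bytes_transferred / self.link_bandwidth_bps

    def offload_time_s(self) -> float:
        # Simplifying assumption: the three phases run back to back.
        return (self.launch_overhead_s
                + self.transfer_time_s()
                + self.gpu_kernel_time_s)

    def speedup(self) -> float:
        # Values above 1.0 mean the offloaded version is projected to be faster.
        return self.cpu_time_s / self.offload_time_s()


def worth_offloading(estimate: OffloadEstimate, min_speedup: float = 1.0) -> bool:
    """Return True when the projected speedup exceeds the chosen threshold."""
    return estimate.speedup() > min_speedup


# A large region: the kernel speedup outweighs transfer and launch costs.
large = OffloadEstimate(cpu_time_s=0.5, gpu_kernel_time_s=0.05,
                        bytes_transferred=2e9, link_bandwidth_bps=1e10,
                        launch_overhead_s=0.01)

# A tiny region: the fixed initialization overhead dominates.
small = OffloadEstimate(cpu_time_s=0.001, gpu_kernel_time_s=0.0001,
                        bytes_transferred=1e6, link_bandwidth_bps=1e10,
                        launch_overhead_s=0.01)

if __name__ == "__main__":
    for name, region in (("large", large), ("small", small)):
        print(f"{name}: speedup {region.speedup():.2f}, "
              f"offload: {worth_offloading(region)}")
```

With these made-up numbers, the large region is projected to run faster on the GPU even after paying for the transfer, while the tiny region loses to the fixed initialization overhead. Real measurements rarely fit such a simple model, which is why a profiler that collects the actual transfer, launch, and execution times is needed.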

This presentation introduces two new GPU performance analysis types in Intel® VTune™ Profiler and a methodology, supported by these analyses, for profiling the performance of heterogeneous applications. Intel VTune Profiler is an established tool for performance characterization on CPUs. It includes GPU offload analysis and GPU hot spot analysis for applications written in most offload programming models, including OpenCL code, SYCL* (Data Parallel C++), and OpenMP* Offload.

Vladimir Tsymbal

Senior Technical Consulting Engineer, Intel Corporation

Vladimir specializes in teaching customers how to use various Intel® Software Development Tools to develop, tune, and optimize their parallel applications on Intel architecture. In particular, his focus is on the Intel® Parallel Studio XE product suite and the analysis tools it contains, including Intel VTune Profiler (which he helped develop), Intel® Advisor, and Intel® Inspector.

Prior to joining Intel in 2005, Vladimir worked as a research assistant and developed hardware graphics accelerators as well as software and hardware systems for medical diagnostics. He holds a PhD in mathematics and computer science from Taganrog State University of Radio Engineering, Russia.
