High-Performance GPU Acceleration—Part 2: Offload Performance

Overview

Developers who deploy applications across CPUs and GPUs are often challenged to find the best methods for analyzing and optimizing offload performance.

In Part 2 of this webinar series, technical consulting engineer Kevin O’Leary focuses on tuning software for optimal performance once hardware is available. He uses Intel® VTune™ Profiler, a performance analyzer that takes the guesswork out of cross-architecture improvements. (Part 1 of this series focuses on designing software for efficient offload even before hardware is available.)

Using a sample application written in Data Parallel C++ (DPC++), Kevin demonstrates how Intel VTune Profiler can:

Profile DPC++ and OpenMP* offload code running on host and GPU processors
Collect the right data and turn it into rich, interpretable analysis
Identify the hot spots in your compute kernels and show which of them are the key areas for optimization
Show how the GPU resources are being used and locate hardware bottlenecks

Featured Software

Get Intel VTune Profiler as part of the Intel® oneAPI Base Toolkit—a foundational set of tools and libraries for developing high-performance, data-centric applications across diverse architectures.
Get the stand-alone version of Intel VTune Profiler.

Resources

Sign up for an Intel® Developer Cloud account—a free development sandbox with access to the latest Intel hardware and oneAPI software.
Explore oneAPI, including developer opportunities and benefits.
Subscribe to Code Together—an interview series that explores the challenges at the forefront of cross-architecture development. Each biweekly episode features industry VIPs who are blazing new trails through today's data-centric world. Available wherever you get your podcasts.

Intel® oneAPI Base Toolkit

Develop high-performance, data-centric applications for CPUs, GPUs, and FPGAs with this core set of tools, libraries, and frameworks including LLVM*-based compilers.
