High-Performance GPU Acceleration—Part 2: Offload Performance
Overview
Developers who deploy applications across CPUs and GPUs often struggle to find the best methods for analyzing and optimizing offload performance.
In Part 2 of this webinar series, technical consulting engineer Kevin O’Leary focuses on tuning software for optimal performance once hardware is available. He uses Intel® VTune™ Profiler, a performance analyzer that takes the guesswork out of cross-architecture improvements. (Part 1 of this series focuses on designing software for efficient offload even before hardware is available.)
Using a sample application written in Data Parallel C++ (DPC++), Kevin demonstrates how Intel VTune Profiler can:
- Profile DPC++ and OpenMP* offload code running on both host and GPU processors
- Collect the right data and turn it into rich, interpretable analysis
- Identify the hot spots in your compute kernels, including which are key areas for optimization
- Show how the GPU resources are being used and locate hardware bottlenecks
Featured Software
- Get Intel VTune Profiler as part of the Intel® oneAPI Base Toolkit—a foundational set of tools and libraries for developing high-performance, data-centric applications across diverse architectures.
- Get the stand-alone version of Intel VTune Profiler.
Resources
- Sign up for an Intel® Developer Cloud account—a free development sandbox with access to the latest Intel hardware and oneAPI software.
- Explore oneAPI, including developer opportunities and benefits.
- Subscribe to Code Together—an interview series that explores the challenges at the forefront of cross-architecture development. Each biweekly episode features industry VIPs who are blazing new trails through today's data-centric world. Available wherever you get your podcasts.