Description
This white paper is a guide to using the Intel® VTune™ and Intel® Advisor tools to diagnose and improve the performance of DPC++ based GPU kernel implementations. Using the CopyPipeline image processing algorithm, which involves multiple kernel invocations, as a case study, the paper walks through setting up these tools to gather performance data. The collected data is used to identify performance bottlenecks and to derive actionable recommendations. Although CopyPipeline is the primary case study, the methodology and recommendations are generic and apply to a wide range of GPU-based programs. The paper also presents best practices for making data movement and compute operations in GPU code efficient.