Optimize Applications with Cache-Aware Roofline Model: Part 2


Part two of this tutorial continues the review of the Cache-aware Roofline Model (CARM) and its basic principles for modeling the performance upper bounds of Intel CPU and GPU devices. It also covers the CARM implementation in Intel® Advisor and demonstrates how you can use it to drive application optimization.
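As a rough illustration of the performance upper bounds mentioned above, the sketch below applies the basic roofline relation: attainable performance is the minimum of the peak compute throughput and the product of arithmetic intensity and memory bandwidth. In CARM, each memory level gets its own roof, with bandwidth measured as seen from the core. The function names and all device numbers are hypothetical placeholders, not measurements of any Intel device or output from Intel® Advisor.

```python
# Minimal sketch of the roofline bound, assuming made-up device numbers.

def carm_bound(ai, peak_gflops, bandwidth_gbs):
    """Attainable GFLOP/s at arithmetic intensity `ai` (FLOP/byte)."""
    return min(peak_gflops, ai * bandwidth_gbs)


def ridge_point(peak_gflops, bandwidth_gbs):
    """Arithmetic intensity at which a kernel stops being memory-bound."""
    return peak_gflops / bandwidth_gbs


# Hypothetical peak compute (GFLOP/s) and per-level bandwidths (GB/s).
PEAK_GFLOPS = 1000.0
ROOFS_GBS = {"L1": 4000.0, "L2": 2000.0, "L3": 800.0, "DRAM": 100.0}


def bounds_per_level(ai):
    """Upper bound for each memory level at the given arithmetic intensity."""
    return {level: carm_bound(ai, PEAK_GFLOPS, bw) for level, bw in ROOFS_GBS.items()}


if __name__ == "__main__":
    for level, bound in bounds_per_level(0.5).items():
        print(f"{level}: {bound:.1f} GFLOP/s")
```

A kernel whose measured performance sits well below the roof of the memory level it mostly uses is a candidate for optimization; one that sits on a bandwidth roof can only improve by raising its arithmetic intensity or by being served from a faster memory level.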

This demonstration uses epistasis detection, an important application in bioinformatics, as a case study. For both Intel CPUs and GPUs, see how CARM can be used to detect execution bottlenecks and indicate which types of optimization to apply in order to fully exploit device capabilities. The guidelines provided by CARM were fundamental to achieving speedups of more than 20x over the baseline code.

Learning objectives:

  • Gain an in-depth understanding of the Cache-aware Roofline Model and its construction and interpretation methodology.
  • Conduct Roofline analysis of CPU and GPU applications using Intel® oneAPI tools.
  • Demonstrate how to use the Roofline model to guide and evaluate application optimization efforts.
  • Showcase the successful use of Roofline automation when optimizing a real-world bioinformatics application on both CPU and GPU-accelerated systems.



Aleksandar Ilic is an assistant professor at the Instituto Superior Técnico (IST), Universidade de Lisboa, and a senior researcher at INESC-ID, Lisbon, Portugal. He has contributed to more than 50 international journal and conference publications, and has received several Excellence in Teaching awards. Alongside his teaching experience, he has organized and participated in more than 20 roofline-related tutorials, invited talks, and seminars held at different scientific events, such as SC, ISC-HPC, and PACT. The integration of his scientific contribution (the Cache-aware Roofline Model) into industry software tools (Intel Advisor) received the HiPEAC Tech Transfer award for 2017. His research interests include high-performance and energy-efficient computing and modeling of parallel heterogeneous systems.

Diogo Marques is currently pursuing his PhD in Electrical and Computer Engineering at the Instituto Superior Técnico (IST), Universidade de Lisboa, Lisbon, Portugal. He is also a member of the HPCAS research group at Instituto de Engenharia de Sistemas e Computadores R&D (INESC-ID). His current research interests include insightful modeling of multi-core processors and heterogeneous systems. His work has contributed to improving the accuracy and insightfulness of roofline modeling based on the Cache-aware Roofline Model by proposing the memory impact metrics and roof scaling methodology presented in the Intel® Advisor framework.

Rafael Campos obtained his MS degree in Electrical and Computer Engineering from Instituto Superior Técnico (IST), Universidade de Lisboa in 2019. He is currently a young researcher at Instituto de Engenharia de Sistemas e Computadores R&D (INESC-ID), as part of the HPCAS group. His main interests are performance modeling of heterogeneous systems and GPUs. His work includes performance optimization of bioinformatics applications and roofline modeling of high-performance heterogeneous CPU and GPU systems.

Product and Performance Information


Performance varies by use, configuration and other factors. Learn more at www.Intel.com/PerformanceIndex.