This course uses oneAPI and Data Parallel C++ (DPC++) to demonstrate a method for achieving performant, portable code across several of the platforms available on Intel® Developer Cloud.
Developers of high-performance computing applications face an increasingly diverse range of computing platforms, spanning multiple generations of CPUs, GPUs, FPGAs, and other accelerators. Writing code that is both performant and portable across such a diverse set of platforms can be expensive and time-consuming.
Who is this for?
This course is designed for developers who are familiar with SYCL* and who develop code that is expected to perform well in a heterogeneous environment. For a primer on SYCL, take the Essentials of SYCL course.
What will I be able to do?
You can apply the following examples and techniques to your own algorithms:
Explore general matrix multiply (GEMM) algorithm examples using DPC++.
Use several techniques to measure the effectiveness of applications across platforms.
Use timer functions inside applications to measure kernel and compute times.
Use kernel and compute time measurements to calculate relative efficiency and identify the best implementation.
Use the Roofline analysis and Intel® VTune™ Profiler to measure performance across platforms.
Get hands-on practice with code samples in Jupyter* Notebooks running live on Intel Developer Cloud.
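As a rough illustration of the timing and relative-efficiency items above, here is a minimal sketch in plain C++ rather than DPC++, so that it compiles without a SYCL toolchain. The names `gemm_naive`, `time_seconds`, and `relative_efficiency` are hypothetical and are not taken from the course samples, and the definition of relative efficiency used here (best measured time divided by each implementation's time) is an assumption that the course may define differently.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a GEMM kernel: C = A * B for row-major n x n matrices.
void gemm_naive(const std::vector<float>& a, const std::vector<float>& b,
                std::vector<float>& c, std::size_t n) {
    assert(a.size() == n * n && b.size() == n * n && c.size() == n * n);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            float sum = 0.0f;
            for (std::size_t k = 0; k < n; ++k) {
                sum += a[i * n + k] * b[k * n + j];
            }
            c[i * n + j] = sum;
        }
    }
}

// Returns the wall-clock time, in seconds, taken by one call to `work`.
template <typename F>
double time_seconds(F&& work) {
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}

// Assumed definition: efficiency[i] = best (smallest) time / times[i],
// so the fastest implementation scores 1.0. Expects strictly positive times.
std::vector<double> relative_efficiency(const std::vector<double>& times) {
    std::vector<double> efficiency;
    if (times.empty()) {
        return efficiency;
    }
    const double best = *std::min_element(times.begin(), times.end());
    efficiency.reserve(times.size());
    for (double t : times) {
        efficiency.push_back(best / t);
    }
    return efficiency;
}
```

To use it, wrap each implementation in a callable, for example `time_seconds([&] { gemm_naive(a, b, c, n); })`, collect one time per implementation or platform, and pass the collected times to `relative_efficiency`. In a SYCL application the timed callable would typically submit the kernel and wait for it to complete, so that the measurement covers the kernel's execution rather than only its submission.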