Parallelization
Parallelism is essential to the effective use of accelerators because they
contain many independent processing elements that can execute code in parallel.
There are three ways to develop parallel code.
Use a Parallel Programming Language or API
There are many parallel programming languages and APIs that can be used to
express parallelism. oneAPI supports parallel program development through the
Data Parallel C++ (DPC++) language. oneAPI also has a number of code generation
tools to convert these programs into binaries that can be executed on different
accelerators. In the usual workflow, a user starts with a serial program,
identifies the parts of the code that take a long time to execute (referred to
as hotspots), and converts them into parallel kernels that can be offloaded to
an accelerator for execution.
Parallelizing Compilers
Directive-based approaches like OpenMP* are another way to develop parallel
programs. In a directive-based approach, the programmer annotates existing code
with directives that tell the compiler which regions can run in parallel,
without restructuring the code itself. This approach is usually easier than
writing a parallel program from first principles.
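As a minimal sketch of the directive-based approach, the function below is an
ordinary serial loop with one OpenMP directive added. The function name saxpy
and its signature are assumptions chosen for illustration and do not come from
any oneAPI library. The directive shown targets host threads only. Offloading
to an accelerator would use OpenMP target directives, which are not shown here.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration: computes y[i] = a * x[i] + y[i].
// If the vectors differ in length, only the common prefix is updated.
void saxpy(float a, const std::vector<float>& x, std::vector<float>& y) {
  const std::ptrdiff_t n =
      static_cast<std::ptrdiff_t>(std::min(x.size(), y.size()));

  // The directive tells the compiler that the loop iterations are independent
  // and may be divided among threads. The loop body is unchanged from the
  // serial version. A compiler without OpenMP enabled ignores the pragma and
  // runs the loop serially with the same result.
  #pragma omp parallel for
  for (std::ptrdiff_t i = 0; i < n; ++i) {
    y[i] = a * x[i] + y[i];
  }
}
```

Calling saxpy(2.0f, x, y) updates y in place. Whether the loop actually runs in
parallel depends on building with OpenMP support enabled in the compiler.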
Parallel Libraries
oneAPI includes a number of libraries, such as oneTBB, oneMKL, oneDNN, and
oneVPL, that provide highly optimized versions of common computational
operations and run across a variety of accelerator architectures. Depending on
the needs of the application, a user can call functions from these libraries
directly and get implementations that are efficient on the underlying
architecture. This is the easiest approach to developing parallel programs,
provided the library contains the required functions. For example, machine
learning applications can take advantage of the optimized primitives in oneDNN.
These libraries have been thoroughly tested for both correctness and
performance, which makes programs that use them more reliable.