Migrating the Jacobi Iterative Method from CUDA to SYCL

Overview

The Jacobi CUDA Graphs sample determines the number of iterations needed to solve a system of linear equations with the Jacobi iterative method. The sample also shows how explicit CUDA* Graph API calls, such as cudaGraphCreate(), cudaGraphAddMemcpyNode(), and cudaGraphInstantiate(), are migrated to SYCL*.

To migrate these graph calls, the sample uses the Taskflow parallel programming model, which manages a task dependency graph. The SYCL implementation is produced by migrating the original CUDA source code, and it can offload computations to a CPU, GPU, or accelerator.
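For readers who want a reminder of what the Jacobi iterative method computes, the following is a minimal host-only C++ sketch. It is not taken from the sample: the function name jacobi_solve, the dense row-major matrix layout, and the stopping rule (summed absolute update below a tolerance) are assumptions made for illustration.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of the sample): runs Jacobi sweeps on A x = b
// and returns the number of sweeps performed before the summed absolute
// update fell below `tol`, or -1 if `max_iters` sweeps were not enough.
// A is a dense row-major n x n matrix with a nonzero diagonal; x holds the
// initial guess on entry and the approximate solution on return.
int jacobi_solve(const std::vector<double>& A, const std::vector<double>& b,
                 std::vector<double>& x, double tol, int max_iters) {
    const std::size_t n = b.size();
    std::vector<double> x_new(n, 0.0);
    for (int iter = 1; iter <= max_iters; ++iter) {
        double error = 0.0;
        for (std::size_t i = 0; i < n; ++i) {
            // Sum the off-diagonal terms of row i using the previous iterate.
            double sigma = 0.0;
            for (std::size_t j = 0; j < n; ++j) {
                if (j != i) sigma += A[i * n + j] * x[j];
            }
            x_new[i] = (b[i] - sigma) / A[i * n + i];
            error += std::fabs(x_new[i] - x[i]);
        }
        x.swap(x_new);  // the new iterate becomes the input of the next sweep
        if (error < tol) return iter;
    }
    return -1;  // did not converge within the iteration budget
}
```

Every component of the new iterate depends only on the previous iterate, so the rows can be updated independently. This is what makes the method a natural fit for offloading to a GPU.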

Area	Description
What you will learn	How to migrate and optimize the Jacobi CUDA Graphs sample from CUDA to SYCL.
Time to complete	15 minutes
Category	Concepts and Functionality

Key Implementation Details

The Jacobi CUDA Graphs computations happen inside two kernels: the Jacobi Method kernel and the Final Error kernel. A reduction over the elements produces the final error or sum value.

In this sample, the vectors are loaded into shared memory (called local memory in SYCL) for faster access, and thread blocks are partitioned into tiles. Each tile then reduces its share of the input data using sub-group primitives, and the intermediate results are added to a final sum variable with an atomic add operation.
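The reduction pattern described above can be sketched on the host in plain C++. This is an illustrative stand-in, not the sample's kernel code: the names atomic_add and tiled_reduce are hypothetical, a sequential loop over each tile plays the role of the sub-group reduction, and a std::atomic<double> plays the role of the final sum variable.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <vector>

// Adds `value` to an atomic double with a compare-exchange loop, which also
// works on C++ standards that predate std::atomic<double>::fetch_add (C++20).
void atomic_add(std::atomic<double>& target, double value) {
    double expected = target.load();
    while (!target.compare_exchange_weak(expected, expected + value)) {
        // On failure `expected` is refreshed with the current value; retry.
    }
}

// Hypothetical host-side stand-in for the tiled reduction: each tile of
// `tile_size` elements is reduced to a partial sum, which is then folded
// into the shared total with an atomic add.
void tiled_reduce(const std::vector<double>& data, std::size_t tile_size,
                  std::atomic<double>& total) {
    if (tile_size == 0) return;  // guard against a non-advancing loop
    for (std::size_t start = 0; start < data.size(); start += tile_size) {
        const std::size_t end = std::min(start + tile_size, data.size());
        double partial = 0.0;  // stands in for the sub-group reduction result
        for (std::size_t i = start; i < end; ++i) partial += data[i];
        atomic_add(total, partial);
    }
}
```

Reducing within a tile first means only one atomic operation is issued per tile rather than one per element, which keeps contention on the final sum variable low.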

The computation kernels are scheduled by one of two alternative host functions:

JacobiMethodGpuCudaGraphExecKernelSetParams(), which uses explicit CUDA Graph APIs.
JacobiMethodGpu(), which uses regular CUDA APIs to launch kernels.
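The difference between the two scheduling styles can be illustrated with a plain C++ analogy. It uses no CUDA Graph, SYCL, or Taskflow calls, and the names step_kernel, run_direct, and run_graph are hypothetical. The assumption, suggested only by the name of the graph-based host function, is that the graph is recorded once and relaunched each iteration with updated kernel parameters.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for a kernel: out[i] = 0.5 * in[i] + 1.
void step_kernel(const std::vector<double>& in, std::vector<double>& out) {
    for (std::size_t i = 0; i < in.size(); ++i) out[i] = 0.5 * in[i] + 1.0;
}

// Direct style: issue the kernel call explicitly on every iteration.
std::vector<double> run_direct(std::vector<double> x, int iters) {
    std::vector<double> scratch(x.size());
    for (int k = 0; k < iters; ++k) {
        step_kernel(x, scratch);
        x.swap(scratch);
    }
    return x;
}

// Graph style: record the work once as a node, then relaunch the recorded
// graph each iteration, changing only the node's parameters (the buffers).
std::vector<double> run_graph(std::vector<double> x, int iters) {
    std::vector<double> scratch(x.size());
    std::vector<double>* in = &x;         // node parameter: input buffer
    std::vector<double>* out = &scratch;  // node parameter: output buffer
    std::vector<std::function<void()>> graph;
    graph.push_back([&in, &out] { step_kernel(*in, *out); });
    for (int k = 0; k < iters; ++k) {
        for (auto& node : graph) node();  // "launch" the recorded graph
        std::swap(in, out);               // update parameters for next launch
    }
    return *in;
}
```

Both functions produce the same result. The graph style moves the cost of describing the work out of the loop, which is the usual motivation for graph-based launch APIs.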

Original CUDA source files: JacobiCudaGraphs.

Migrated SYCL source files, including step-by-step instructions: guided_JacobiCudaGraphs_SYCLmigration.
