Migrating and Optimizing the Separable Convolution Sample from CUDA* to SYCL*

ID 779520
Updated 12/6/2024
Version Latest
Public

Overview

Separable convolution is a technique in which a single convolution is split into two or more simpler convolutions that together produce the same output. The original CUDA* source code is migrated to SYCL* so that it is portable across GPUs from multiple vendors.
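As a short worked derivation of why the split gives the same output, suppose a 2D kernel k of radius r factors into a column vector a and a row vector b, so that k(i, j) = a_i b_j. The symbols I (image), k, a, b, and r are notation introduced here for illustration and are not taken from the sample sources.

\[
(I * k)(x, y)
  = \sum_{i=-r}^{r} \sum_{j=-r}^{r} a_i \, b_j \, I(x - j,\; y - i)
  = \sum_{i=-r}^{r} a_i \left( \sum_{j=-r}^{r} b_j \, I(x - j,\; y - i) \right)
\]

The inner sum is a 1D convolution along each row, and the outer sum is a 1D convolution down each column of that intermediate result. Per output pixel, the two passes need 2(2r + 1) multiply-adds instead of the (2r + 1)^2 required by the direct 2D form.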

 

Area: Description

What you will learn: How to migrate the convolutionSeparable sample from CUDA to SYCL

Time to complete: 15 minutes

Category: Code Optimization

 

Key Implementation Details

This sample implements a separable convolution filter for a 2D image with an arbitrary kernel. Two functions in the code, convolutionRowsGPU and convolutionColumnsGPU, launch the kernel functions convolutionRowsKernel and convolutionColumnsKernel, which load the input data and perform the computation. The results are validated against a reference CPU separable convolution implementation by calculating the relative L2 norm.
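The row pass, column pass, and relative L2 check described above can be sketched on the CPU in plain C++. This is a minimal illustration, not the sample's code: the function names convolveRows, convolveColumns, and relativeL2 are hypothetical, the image is assumed to be stored row-major, and pixels outside the image are assumed to contribute zero.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helpers for illustration; names and signatures are not
// taken from the sample. Images are row-major, width * height floats.
// Assumption: pixels outside the image are treated as zero.

// 1D convolution along each row (kernel has odd length 2 * radius + 1).
std::vector<float> convolveRows(const std::vector<float>& src,
                                const std::vector<float>& kernel,
                                int width, int height) {
    const int radius = static_cast<int>(kernel.size()) / 2;
    std::vector<float> dst(src.size(), 0.0f);
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            float sum = 0.0f;
            for (int k = -radius; k <= radius; ++k) {
                const int sx = x + k;
                if (sx >= 0 && sx < width) {
                    sum += src[y * width + sx] * kernel[radius - k];
                }
            }
            dst[y * width + x] = sum;
        }
    }
    return dst;
}

// 1D convolution down each column, applied to the row-pass result.
std::vector<float> convolveColumns(const std::vector<float>& src,
                                   const std::vector<float>& kernel,
                                   int width, int height) {
    const int radius = static_cast<int>(kernel.size()) / 2;
    std::vector<float> dst(src.size(), 0.0f);
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            float sum = 0.0f;
            for (int k = -radius; k <= radius; ++k) {
                const int sy = y + k;
                if (sy >= 0 && sy < height) {
                    sum += src[sy * width + x] * kernel[radius - k];
                }
            }
            dst[y * width + x] = sum;
        }
    }
    return dst;
}

// Relative L2 norm of the difference: ||ref - test|| / ||ref||.
// Assumes ref is not all zeros.
double relativeL2(const std::vector<float>& ref,
                  const std::vector<float>& test) {
    double diff = 0.0;
    double norm = 0.0;
    for (std::size_t i = 0; i < ref.size(); ++i) {
        const double d = static_cast<double>(ref[i]) - test[i];
        diff += d * d;
        norm += static_cast<double>(ref[i]) * ref[i];
    }
    return std::sqrt(diff / norm);
}
```

A typical use would be to run convolveRows followed by convolveColumns to produce a CPU reference, then pass that reference and the device output to relativeL2 and compare the result against a small tolerance chosen for the data type.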

For more information on the convolutionSeparable SYCL migrated sample and build details on CPU and GPU, refer here

Original CUDA source files: convolutionSeparable.

Migrated SYCL source files, including step-by-step instructions: guided_convolutionSeparable_SYCLmigration.

 

References