Intel® DPC++ Compatibility Tool

Migrate Your CUDA* Code to Portable, Multiarchitecture C++ with SYCL*

The Intel® DPC++ Compatibility Tool was deprecated after the 2025.3 release. To report a problem or make a request, submit an issue to the SYCLomatic open-source repository. We will assess the impact and provide support. We also welcome your contributions to the SYCLomatic open-source community.

Efficient Code Migration

The Intel® DPC++ Compatibility Tool assists in migrating your existing CUDA* code to SYCL* code.
DPC++ is based on ISO C++ and incorporates standard SYCL and community extensions to simplify data parallel programming.

Migrate from CUDA* to C++ with SYCL*

CUDA* to SYCL* Application Catalog

How It Works

The tool ports both CUDA* language kernels and library API calls.
Typically, 90%-95% of CUDA* code automatically migrates to SYCL*¹.
The tool inserts inline comments to help you finish writing and tuning your code.

Intel® DPC++ Compatibility Tool Guide

Intel® oneAPI DPC++/C++ Compiler

¹ An Intel® estimate as of March 2023, which is based on measurements from a set of 85 HPC benchmarks and samples, with examples like Rodinia, SHOC, and Pennant. Results may vary.

Download the Stand-Alone Version

A stand-alone download of the Intel® DPC++ Compatibility Tool is available. You can download binaries from Intel or choose your preferred repository.

Download

Help the Intel® DPC++ Compatibility Tool Evolve

This tool supports the oneAPI industry standards initiative. You are welcome to participate.

oneAPI Specification

Open Source SYCLomatic (GitHub*)

Code Migration: Before & After

Source CUDA Code

The Intel DPC++ Compatibility Tool migrates software programs implemented with current and previous versions of CUDA. For details, see the release notes.

#include <cuda.h>
#include <stdio.h>

const int vector_size = 256;

__global__ void SimpleAddKernel(float *A, int offset) 
{
  A[threadIdx.x] = threadIdx.x + offset;
}

int main()
{
  float *d_A;
  int offset = 10000;

  cudaMalloc( &d_A, vector_size * sizeof( float ) );
  SimpleAddKernel<<<1, vector_size>>>(d_A, offset);

  float result[vector_size] = { };
  cudaMemcpy(result, d_A, vector_size*sizeof(float), cudaMemcpyDeviceToHost);

  cudaFree( d_A );
   
  for (int i = 0; i < vector_size; ++i) {
    if (i % 8 == 0) printf( "\n" );
    printf( "%.1f ", result[i] );
  }

  return 0;
}

Migrated Code

The resulting code is typical of what you can expect to see after migration. In most cases, further edits and optimizations are required to complete the migration.

#include <sycl/sycl.hpp>
#include <dpct/dpct.hpp>
#include <stdio.h>

const int vector_size = 256;

void SimpleAddKernel(float *A, int offset, sycl::nd_item<3> item_ct1)
{ 
  A[item_ct1.get_local_id(2)] = item_ct1.get_local_id(2) + offset;
}

int main()
{
  dpct::device_ext &dev_ct1 = dpct::get_current_device();
  sycl::queue &q_ct1 = dev_ct1.default_queue();
  float *d_A;
  int offset = 10000;

  d_A = sycl::malloc_device<float>(vector_size, q_ct1);
  q_ct1.submit([&](sycl::handler &cgh) {
    cgh.parallel_for(sycl::nd_range(sycl::range(1, 1, vector_size),
                                    sycl::range(1, 1, vector_size)),
                     [=](sycl::nd_item<3> item_ct1) {
                       SimpleAddKernel(d_A, offset, item_ct1);
                     });
  });

  float result[vector_size] = { };
  q_ct1.memcpy(result, d_A, vector_size * sizeof(float)).wait();

  sycl::free(d_A, q_ct1);

  for (int i = 0; i < vector_size; ++i) {
    if (i % 8 == 0) printf( "\n" );
    printf( "%.1f ", result[i] );
  }

  return 0;
}

Get Started

Download

Install and configure the Intel® DPC++ Compatibility Tool, which is part of the Intel® oneAPI Toolkit.

Get the Intel® oneAPI Toolkit

System Requirements

Try It Out

See how the migration process works using an introductory code sample.

Get Started Guide

Learn More

Access additional samples, tutorials, and training resources.

DPC++

Intel® oneAPI DPC++/C++ Compiler

¹ An Intel® estimate as of September 2024, which is based on measurements from a set of 100 HPC benchmarks, AI applications, and samples, with examples like GROMACS, llama.cpp, and SqueezeLLM. Results may vary.

Documentation & Code Samples

Documentation

View All Documentation

Success Stories

Code Samples

Get Started

Vector Add
This Hello World sample demonstrates how to migrate a simple program from CUDA* to code that is compliant with SYCL*. Use it to verify that your development environment is set up correctly for the migration.

Needleman Wunsch
This sample represents a typical example of migrating a working Make and CMake* project from CUDA* to SYCL*. The code implements the Needleman-Wunsch algorithm and is based on Rodinia, a set of benchmarks for heterogeneous computing.

Code Optimization

Concurrent Kernels
Implement this guided sample by migrating the original CUDA*-based code to SYCL* for offloading computations to a GPU or CPU. Learn how to reduce processing time by using SYCL* queues to run several kernels concurrently on a GPU.

HSOptical Flow
This sample implements the Horn-Schunck method for estimating optical flow. Learn how a partial differential equation (PDE) solver can be accelerated through GPU offload.

Quasi-random Generator
Implement this guided sample by migrating the original CUDA*-based code to SYCL for offloading computations to a GPU or CPU. The sample demonstrates migrating the constant memory feature in CUDA*.

View oneAPI Samples Catalog

How to work with code samples:

Use a command-line interface: Windows* | Linux*
Use an IDE: Windows | Linux

Training

How to Migrate CUDA* Code to C++ with SYCL

CUDA* to SYCL Automatic Migration Tool [5:55]
A Detailed Migration Flow
Tips and Tricks for Migrating CUDA* to SYCL* [59:43]

Hands-on Learning

Self-Guided CUDA* to SYCL* Migration Tutorial

Future-Proof Code on Modern Accelerator Processors

Free Your Software from Vendor Lock-in Using SYCL* and oneAPI

View All Resources

Training & Events Calendar

Specifications

Operating system for development:

Linux*
Windows*

Software tool requirements:

CUDA* header files
Eclipse* (optional)
Visual Studio* (optional)

For details, see the system requirements.
