Intel® oneAPI Deep Neural Network Developer Guide and Reference
OpenCL interoperability API
Overview
API extensions to interact with the underlying OpenCL run-time.
// namespaces
namespace dnnl::graph::ocl_interop;
// typedefs
typedef void* (*dnnl_graph_ocl_allocate_f)(
    size_t size,
    size_t alignment,
    cl_device_id device,
    cl_context context
    );
typedef void (*dnnl_graph_ocl_deallocate_f)(
    void *buf,
    cl_device_id device,
    cl_context context,
    cl_event event
    );
// global functions
dnnl_status_t DNNL_API dnnl_graph_ocl_interop_allocator_create(
    dnnl_graph_allocator_t* allocator,
    dnnl_graph_ocl_allocate_f ocl_malloc,
    dnnl_graph_ocl_deallocate_f ocl_free
    );
dnnl_status_t DNNL_API dnnl_graph_ocl_interop_make_engine_with_allocator(
    dnnl_engine_t* engine,
    cl_device_id device,
    cl_context context,
    const_dnnl_graph_allocator_t alloc
    );
dnnl_status_t DNNL_API dnnl_graph_ocl_interop_make_engine_from_cache_blob_with_allocator(
    dnnl_engine_t* engine,
    cl_device_id device,
    cl_context context,
    const_dnnl_graph_allocator_t alloc,
    size_t size,
    const uint8_t* cache_blob
    );
dnnl_status_t DNNL_API dnnl_graph_ocl_interop_compiled_partition_execute(
    const_dnnl_graph_compiled_partition_t compiled_partition,
    dnnl_stream_t stream,
    size_t num_inputs,
    const_dnnl_graph_tensor_t* inputs,
    size_t num_outputs,
    const_dnnl_graph_tensor_t* outputs,
    const cl_event* deps,
    int ndeps,
    cl_event* return_event
    ); 
  Detailed Documentation
API extensions to interact with the underlying OpenCL run-time.
Typedefs
typedef void* (*dnnl_graph_ocl_allocate_f)(
    size_t size,
    size_t alignment,
    cl_device_id device,
    cl_context context
    ) 
Allocation call-back function interface for OpenCL.
An OpenCL allocator should be used with the OpenCL GPU runtime. The call-back is expected to return a USM device memory pointer.
Parameters:
size – Memory size in bytes for the requested allocation.
alignment – The minimum alignment in bytes for the requested allocation.
device – A valid OpenCL device used to allocate.
context – A valid OpenCL context used to allocate.
Returns:
The memory address of the requested USM allocation.
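As an illustration, a minimal allocation call-back could be built on the cl_intel_unified_shared_memory extension. This is only a sketch: the function-pointer typedef clDeviceMemAllocINTEL_fn comes from CL/cl_ext.h, and the global g_device_mem_alloc used here is assumed to have been resolved by the application (typically via clGetExtensionFunctionAddressForPlatform); neither is part of the oneDNN API.
#include <CL/cl.h>
#include <CL/cl_ext.h>

/* Resolved elsewhere via clGetExtensionFunctionAddressForPlatform (assumption). */
extern clDeviceMemAllocINTEL_fn g_device_mem_alloc;

static void *my_ocl_malloc(
    size_t size,
    size_t alignment,
    cl_device_id device,
    cl_context context
    )
{
    cl_int err = CL_SUCCESS;
    /* Request a USM device allocation with the requested size and alignment. */
    void *ptr = g_device_mem_alloc(
        context, device, /*properties*/ NULL, size, (cl_uint)alignment, &err);
    return err == CL_SUCCESS ? ptr : NULL;
}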
typedef void (*dnnl_graph_ocl_deallocate_f)(
    void *buf,
    cl_device_id device,
    cl_context context,
    cl_event event
    ) 
Deallocation call-back function interface for OpenCL.
An OpenCL allocator should be used with the OpenCL runtime. The call-back should deallocate the USM device memory returned by dnnl_graph_ocl_allocate_f. The event must be completed before the USM memory is deallocated.
Parameters:
buf – The USM allocation to be released.
device – A valid OpenCL device the USM memory is associated with.
context – A valid OpenCL context used to free the USM allocation.
event – An event which the USM deallocation depends on.
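A matching deallocation call-back could look as follows, under the same assumptions as the allocation sketch above. Waiting on the event before freeing is one simple way to meet the completion requirement; clMemBlockingFreeINTEL belongs to the same Intel USM extension and g_mem_blocking_free is again assumed to be resolved elsewhere.
/* Resolved elsewhere via clGetExtensionFunctionAddressForPlatform (assumption). */
extern clMemBlockingFreeINTEL_fn g_mem_blocking_free;

static void my_ocl_free(
    void *buf,
    cl_device_id device,
    cl_context context,
    cl_event event
    )
{
    (void)device;
    /* Honor the dependency passed by the library before releasing the memory. */
    if (event) clWaitForEvents(1, &event);
    g_mem_blocking_free(context, buf);
}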
Global Functions
dnnl_status_t DNNL_API dnnl_graph_ocl_interop_allocator_create(
    dnnl_graph_allocator_t* allocator,
    dnnl_graph_ocl_allocate_f ocl_malloc,
    dnnl_graph_ocl_deallocate_f ocl_free
    ) 
   Creates an allocator with the given allocation and deallocation call-back function pointers.
Parameters:
allocator – Output allocator.
ocl_malloc – A pointer to an OpenCL malloc function.
ocl_free – A pointer to an OpenCL free function.
Returns:
dnnl_success on success and a status describing the error otherwise.
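For example, the call-backs sketched above could be wrapped into an allocator handle as follows. This is a minimal sketch; the helper name create_ocl_allocator is made up for illustration, and the C interop header path is assumed to be oneapi/dnnl/dnnl_graph_ocl.h.
#include "oneapi/dnnl/dnnl_graph_ocl.h"

static dnnl_graph_allocator_t create_ocl_allocator(void)
{
    dnnl_graph_allocator_t allocator = NULL;
    /* Register the USM allocation/deallocation call-backs sketched above. */
    dnnl_status_t st = dnnl_graph_ocl_interop_allocator_create(
        &allocator, my_ocl_malloc, my_ocl_free);
    return st == dnnl_success ? allocator : NULL;
}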
dnnl_status_t DNNL_API dnnl_graph_ocl_interop_make_engine_with_allocator(
    dnnl_engine_t* engine,
    cl_device_id device,
    cl_context context,
    const_dnnl_graph_allocator_t alloc
    ) 
Creates an engine associated with an OpenCL device, an OpenCL context, and an allocator. This API is a supplement to the existing oneDNN engine API dnnl_ocl_interop_engine_create(dnnl_engine_t *engine, cl_device_id device, cl_context context).
Parameters:
engine – Output engine.
device – Underlying OpenCL device to use for the engine.
context – Underlying OpenCL context to use for the engine.
alloc – Underlying allocator to use for the engine.
Returns:
dnnl_success on success and a status describing the error otherwise.
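A minimal usage sketch, assuming device and context come from the application's existing OpenCL setup and the allocator was created as shown earlier:
static dnnl_engine_t make_gpu_engine(
    cl_device_id device,
    cl_context context,
    const_dnnl_graph_allocator_t allocator)
{
    dnnl_engine_t engine = NULL;
    /* Bind the engine to the application's OpenCL device/context and allocator. */
    dnnl_status_t st = dnnl_graph_ocl_interop_make_engine_with_allocator(
        &engine, device, context, allocator);
    return st == dnnl_success ? engine : NULL;
}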
dnnl_status_t DNNL_API dnnl_graph_ocl_interop_make_engine_from_cache_blob_with_allocator(
    dnnl_engine_t* engine,
    cl_device_id device,
    cl_context context,
    const_dnnl_graph_allocator_t alloc,
    size_t size,
    const uint8_t* cache_blob
    ) 
Creates an engine from a cache blob, associated with an OpenCL device, an OpenCL context, and an allocator. This API is a supplement to the existing oneDNN engine API dnnl_ocl_interop_engine_create_from_cache_blob(dnnl_engine_t *engine, cl_device_id device, cl_context context, size_t size, const uint8_t *cache_blob).
Parameters:
engine – Output engine.
device – The OpenCL device that this engine will encapsulate.
context – The OpenCL context (containing the device) that this engine will use for all operations.
alloc – Underlying allocator to use for the engine.
size – Size of the cache blob in bytes.
cache_blob – Pointer to the cache blob of size bytes.
Returns:
dnnl_success on success and a status describing the error otherwise.
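A minimal sketch, assuming the blob bytes (the blob and blob_size parameters below are placeholders) were produced and persisted by the application beforehand:
static dnnl_engine_t make_gpu_engine_from_blob(
    cl_device_id device,
    cl_context context,
    const_dnnl_graph_allocator_t allocator,
    size_t blob_size,
    const uint8_t *blob)
{
    dnnl_engine_t engine = NULL;
    /* Recreate the engine from the previously saved cache blob. */
    dnnl_status_t st = dnnl_graph_ocl_interop_make_engine_from_cache_blob_with_allocator(
        &engine, device, context, allocator, blob_size, blob);
    return st == dnnl_success ? engine : NULL;
}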
dnnl_status_t DNNL_API dnnl_graph_ocl_interop_compiled_partition_execute(
    const_dnnl_graph_compiled_partition_t compiled_partition,
    dnnl_stream_t stream,
    size_t num_inputs,
    const_dnnl_graph_tensor_t* inputs,
    size_t num_outputs,
    const_dnnl_graph_tensor_t* outputs,
    const cl_event* deps,
    int ndeps,
    cl_event* return_event
    ) 
Execute a compiled partition with the OpenCL runtime.
Parameters:
compiled_partition – The handle of the target compiled partition.
stream – The stream used for execution.
num_inputs – The number of input tensors.
inputs – A list of input tensors.
num_outputs – The number of output tensors.
outputs – A non-empty list of output tensors.
deps – An optional list of cl_event dependencies.
ndeps – The number of dependencies.
return_event – Output cl_event handle associated with the execution.
Returns:
dnnl_success on success and a status describing the error otherwise.
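A minimal execution sketch, assuming the compiled partition, stream, and tensor arrays were prepared earlier. The returned event can be waited on directly or passed as a dependency to later OpenCL work:
static dnnl_status_t run_partition(
    const_dnnl_graph_compiled_partition_t cp,
    dnnl_stream_t stream,
    size_t num_inputs, const_dnnl_graph_tensor_t *inputs,
    size_t num_outputs, const_dnnl_graph_tensor_t *outputs)
{
    cl_event done = NULL;
    /* No extra OpenCL dependencies in this sketch: deps = NULL, ndeps = 0. */
    dnnl_status_t st = dnnl_graph_ocl_interop_compiled_partition_execute(
        cp, stream, num_inputs, inputs, num_outputs, outputs,
        /*deps*/ NULL, /*ndeps*/ 0, &done);
    if (st == dnnl_success && done) {
        clWaitForEvents(1, &done); /* or chain `done` into later OpenCL work */
        clReleaseEvent(done);
    }
    return st;
}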