Intel® oneAPI Deep Neural Network Developer Guide and Reference
OpenCL interoperability API
Overview
API extensions to interact with the underlying OpenCL run-time.
// namespaces
namespace dnnl::graph::ocl_interop;
// typedefs
typedef void* (*dnnl_graph_ocl_allocate_f)(
    size_t size,
    size_t alignment,
    cl_device_id device,
    cl_context context
    );
typedef void (*dnnl_graph_ocl_deallocate_f)(
    void *buf,
    cl_device_id device,
    cl_context context,
    cl_event event
    );
// global functions
dnnl_status_t DNNL_API dnnl_graph_ocl_interop_allocator_create(
    dnnl_graph_allocator_t* allocator,
    dnnl_graph_ocl_allocate_f ocl_malloc,
    dnnl_graph_ocl_deallocate_f ocl_free
    );
dnnl_status_t DNNL_API dnnl_graph_ocl_interop_make_engine_with_allocator(
    dnnl_engine_t* engine,
    cl_device_id device,
    cl_context context,
    const_dnnl_graph_allocator_t alloc
    );
dnnl_status_t DNNL_API dnnl_graph_ocl_interop_make_engine_from_cache_blob_with_allocator(
    dnnl_engine_t* engine,
    cl_device_id device,
    cl_context context,
    const_dnnl_graph_allocator_t alloc,
    size_t size,
    const uint8_t* cache_blob
    );
dnnl_status_t DNNL_API dnnl_graph_ocl_interop_compiled_partition_execute(
    const_dnnl_graph_compiled_partition_t compiled_partition,
    dnnl_stream_t stream,
    size_t num_inputs,
    const_dnnl_graph_tensor_t* inputs,
    size_t num_outputs,
    const_dnnl_graph_tensor_t* outputs,
    const cl_event* deps,
    int ndeps,
    cl_event* return_event
    ); 
  Detailed Documentation
API extensions to interact with the underlying OpenCL run-time.
Typedefs
typedef void* (*dnnl_graph_ocl_allocate_f)(
    size_t size,
    size_t alignment,
    cl_device_id device,
    cl_context context
    ) 
Allocation call-back function interface for OpenCL.
An OpenCL allocator should be used with the OpenCL GPU runtime. The call-back is expected to return a USM device memory pointer.
Parameters:
size – Memory size in bytes for the requested allocation.
alignment – The minimum alignment in bytes for the requested allocation.
device – A valid OpenCL device used to allocate.
context – A valid OpenCL context used to allocate.
Returns:
The memory address of the requested USM allocation.
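As an illustration, a minimal allocation call-back could be built on the cl_intel_unified_shared_memory extension. This is only a sketch: the function-pointer typedef clDeviceMemAllocINTEL_fn comes from CL/cl_ext.h, and the global g_device_mem_alloc used here is assumed to have been resolved by the application (typically via clGetExtensionFunctionAddressForPlatform); neither is part of the oneDNN API.
#include <CL/cl.h>
#include <CL/cl_ext.h>

/* Resolved elsewhere via clGetExtensionFunctionAddressForPlatform (assumption). */
extern clDeviceMemAllocINTEL_fn g_device_mem_alloc;

static void *my_ocl_malloc(
    size_t size,
    size_t alignment,
    cl_device_id device,
    cl_context context
    )
{
    cl_int err = CL_SUCCESS;
    /* Request a USM device allocation with the requested size and alignment. */
    void *ptr = g_device_mem_alloc(
        context, device, /*properties*/ NULL, size, (cl_uint)alignment, &err);
    return err == CL_SUCCESS ? ptr : NULL;
}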
typedef void (*dnnl_graph_ocl_deallocate_f)(
    void *buf,
    cl_device_id device,
    cl_context context,
    cl_event event
    ) 
Deallocation call-back function interface for OpenCL.
An OpenCL allocator should be used with the OpenCL runtime. The call-back should deallocate the USM device memory returned by dnnl_graph_ocl_allocate_f. The event must be completed before the USM memory is deallocated.
Parameters:
buf – The USM allocation to be released.
device – A valid OpenCL device the USM memory is associated with.
context – A valid OpenCL context used to free the USM allocation.
event – An event which the USM deallocation depends on.
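A matching deallocation call-back could look as follows, under the same assumptions as the allocation sketch above. Waiting on the event before freeing is one simple way to meet the completion requirement; clMemBlockingFreeINTEL belongs to the same Intel USM extension and g_mem_blocking_free is again assumed to be resolved elsewhere.
/* Resolved elsewhere via clGetExtensionFunctionAddressForPlatform (assumption). */
extern clMemBlockingFreeINTEL_fn g_mem_blocking_free;

static void my_ocl_free(
    void *buf,
    cl_device_id device,
    cl_context context,
    cl_event event
    )
{
    (void)device;
    /* Honor the dependency passed by the library before releasing the memory. */
    if (event) clWaitForEvents(1, &event);
    g_mem_blocking_free(context, buf);
}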
Global Functions
dnnl_status_t DNNL_API dnnl_graph_ocl_interop_allocator_create(
    dnnl_graph_allocator_t* allocator,
    dnnl_graph_ocl_allocate_f ocl_malloc,
    dnnl_graph_ocl_deallocate_f ocl_free
    ) 
   Creates an allocator with the given allocation and deallocation call-back function pointers.
Parameters:
allocator – Output allocator.
ocl_malloc – A pointer to an OpenCL malloc function.
ocl_free – A pointer to an OpenCL free function.
Returns:
dnnl_success on success and a status describing the error otherwise.
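For example, the call-backs sketched above could be wrapped into an allocator handle as follows. This is a minimal sketch; the helper name create_ocl_allocator is made up for illustration, and the C interop header path is assumed to be oneapi/dnnl/dnnl_graph_ocl.h.
#include "oneapi/dnnl/dnnl_graph_ocl.h"

static dnnl_graph_allocator_t create_ocl_allocator(void)
{
    dnnl_graph_allocator_t allocator = NULL;
    /* Register the USM allocation/deallocation call-backs sketched above. */
    dnnl_status_t st = dnnl_graph_ocl_interop_allocator_create(
        &allocator, my_ocl_malloc, my_ocl_free);
    return st == dnnl_success ? allocator : NULL;
}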
dnnl_status_t DNNL_API dnnl_graph_ocl_interop_make_engine_with_allocator(
    dnnl_engine_t* engine,
    cl_device_id device,
    cl_context context,
    const_dnnl_graph_allocator_t alloc
    ) 
Creates an engine associated with an OpenCL device, an OpenCL context, and an allocator. This API is a supplement to the existing oneDNN engine API dnnl_ocl_interop_engine_create(dnnl_engine_t *engine, cl_device_id device, cl_context context).
Parameters:
engine – Output engine.
device – Underlying OpenCL device to use for the engine.
context – Underlying OpenCL context to use for the engine.
alloc – Underlying allocator to use for the engine.
Returns:
dnnl_success on success and a status describing the error otherwise.
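A minimal usage sketch, assuming device and context come from the application's existing OpenCL setup and the allocator was created as shown earlier:
static dnnl_engine_t make_gpu_engine(
    cl_device_id device,
    cl_context context,
    const_dnnl_graph_allocator_t allocator)
{
    dnnl_engine_t engine = NULL;
    /* Bind the engine to the application's OpenCL device/context and allocator. */
    dnnl_status_t st = dnnl_graph_ocl_interop_make_engine_with_allocator(
        &engine, device, context, allocator);
    return st == dnnl_success ? engine : NULL;
}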
dnnl_status_t DNNL_API dnnl_graph_ocl_interop_make_engine_from_cache_blob_with_allocator(
    dnnl_engine_t* engine,
    cl_device_id device,
    cl_context context,
    const_dnnl_graph_allocator_t alloc,
    size_t size,
    const uint8_t* cache_blob
    ) 
Creates an engine from a cache blob, associated with an OpenCL device, an OpenCL context, and an allocator. This API is a supplement to the existing oneDNN engine API dnnl_ocl_interop_engine_create_from_cache_blob(dnnl_engine_t *engine, cl_device_id device, cl_context context, size_t size, const uint8_t *cache_blob).
Parameters:
engine – Output engine.
device – The OpenCL device that this engine will encapsulate.
context – The OpenCL context (containing the device) that this engine will use for all operations.
alloc – Underlying allocator to use for the engine.
size – Size of the cache blob in bytes.
cache_blob – Pointer to the cache blob of size bytes.
Returns:
dnnl_success on success and a status describing the error otherwise.
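A minimal sketch, assuming the blob bytes (the blob and blob_size parameters below are placeholders) were produced and persisted by the application beforehand:
static dnnl_engine_t make_gpu_engine_from_blob(
    cl_device_id device,
    cl_context context,
    const_dnnl_graph_allocator_t allocator,
    size_t blob_size,
    const uint8_t *blob)
{
    dnnl_engine_t engine = NULL;
    /* Recreate the engine from the previously saved cache blob. */
    dnnl_status_t st = dnnl_graph_ocl_interop_make_engine_from_cache_blob_with_allocator(
        &engine, device, context, allocator, blob_size, blob);
    return st == dnnl_success ? engine : NULL;
}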
dnnl_status_t DNNL_API dnnl_graph_ocl_interop_compiled_partition_execute(
    const_dnnl_graph_compiled_partition_t compiled_partition,
    dnnl_stream_t stream,
    size_t num_inputs,
    const_dnnl_graph_tensor_t* inputs,
    size_t num_outputs,
    const_dnnl_graph_tensor_t* outputs,
    const cl_event* deps,
    int ndeps,
    cl_event* return_event
    ) 
Execute a compiled partition with the OpenCL runtime.
Parameters:
compiled_partition – The handle of the target compiled partition.
stream – The stream used for execution.
num_inputs – The number of input tensors.
inputs – A list of input tensors.
num_outputs – The number of output tensors.
outputs – A non-empty list of output tensors.
deps – An optional list of cl_event dependencies.
ndeps – The number of dependencies.
return_event – Output cl_event handle associated with the execution.
Returns:
dnnl_success on success and a status describing the error otherwise.
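A minimal execution sketch, assuming the compiled partition, stream, and tensor arrays were prepared earlier. The returned event can be waited on directly or passed as a dependency to later OpenCL work:
static dnnl_status_t run_partition(
    const_dnnl_graph_compiled_partition_t cp,
    dnnl_stream_t stream,
    size_t num_inputs, const_dnnl_graph_tensor_t *inputs,
    size_t num_outputs, const_dnnl_graph_tensor_t *outputs)
{
    cl_event done = NULL;
    /* No extra OpenCL dependencies in this sketch: deps = NULL, ndeps = 0. */
    dnnl_status_t st = dnnl_graph_ocl_interop_compiled_partition_execute(
        cp, stream, num_inputs, inputs, num_outputs, outputs,
        /*deps*/ NULL, /*ndeps*/ 0, &done);
    if (st == dnnl_success && done) {
        clWaitForEvents(1, &done); /* or chain `done` into later OpenCL work */
        clReleaseEvent(done);
    }
    return st;
}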