Intel® oneAPI Deep Neural Network Developer Guide and Reference
Single op partition on CPU
This example demonstrates how to build a graph with a single op and run it on CPU.
Example code: cpu_single_op_partition.cpp
Key take-aways from this example:
how to build a single-op partition quickly
how to create an engine, allocator and stream
how to compile a partition
how to execute a compiled partition
Some assumptions in this example:
The workflow is demonstrated without verifying the correctness of the results
Unsupported partitions must be handled by the user (see the sketch below)
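For example, a user can query whether the library supports a partition before compiling it. The following is a minimal sketch, not part of the original example; it assumes a partition object part (created as shown later in this example) and a hypothetical user-side fallback helper:
// Sketch only: check partition support before compilation.
if (!part.is_supported()) {
    // Fall back to the user's own kernels for the ops in this partition.
    // run_with_user_kernels() is a hypothetical helper, not a library API.
    run_with_user_kernels(part);
}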
Public headers
To start using oneDNN Graph, we must include the dnnl_graph.hpp header file in the application. All the C++ APIs reside in namespace dnnl::graph.
#include <iostream>
#include <memory>
#include <vector>
#include <unordered_map>
#include <unordered_set>
#include <assert.h>
#include "oneapi/dnnl/dnnl_graph.hpp"
#include "example_utils.hpp"
#include "graph_example_utils.hpp"
using namespace dnnl::graph;
using data_type = logical_tensor::data_type;
using layout_type = logical_tensor::layout_type;
using dim = logical_tensor::dim;
using dims = logical_tensor::dims;
cpu_single_op_partition_tutorial() function
Build Graph and Get Partitions
In this section, we create a partition that contains a single MatMul op directly, skipping the steps of building a graph and getting partitions from it.
Create the MatMul op (dnnl::graph::op) and attach attributes to it, including transpose_a and transpose_b.
logical_tensor matmul_src0_desc {0, data_type::f32};
logical_tensor matmul_src1_desc {1, data_type::f32};
logical_tensor matmul_dst_desc {2, data_type::f32};
op matmul(0, op::kind::MatMul, {matmul_src0_desc, matmul_src1_desc},
{matmul_dst_desc}, "matmul");
matmul.set_attr<bool>(op::attr::transpose_a, false);
matmul.set_attr<bool>(op::attr::transpose_b, false);
Compile and Execute Partition
In a real use case, users such as frameworks should provide device information at this stage. In this example, we simply use a self-defined device to simulate that behavior.
Create a dnnl::engine and set a user-defined dnnl::graph::allocator on it.
allocator alloc {};
dnnl::engine eng
= make_engine_with_allocator(dnnl::engine::kind::cpu, 0, alloc);
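Here a default-constructed allocator is used. If an application needs to route the library's allocations through its own memory manager, the allocator can instead be built from host allocate/deallocate callbacks. The sketch below is illustrative only; my_allocate and my_deallocate are hypothetical names, and it additionally requires <cstdlib>:
// Sketch only: an allocator built from user-provided host callbacks.
void *my_allocate(size_t size, size_t alignment) {
    (void)alignment; // a production allocator should honor the alignment
    return std::malloc(size);
}
void my_deallocate(void *buf) {
    std::free(buf);
}
allocator user_alloc {my_allocate, my_deallocate};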
Create a dnnl::stream on the engine.
dnnl::stream strm {eng};
Skip building a graph and getting partitions from it; instead, directly create a single-op partition from the MatMul op.
partition part(matmul, dnnl::engine::kind::cpu);
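The inputs and outputs passed to compilation below are logical tensors that carry concrete shapes and layouts, reusing the IDs (0, 1, 2) given when the op was created. A minimal sketch follows; the shapes are illustrative and not part of the original snippet:
// Sketch only: input/output logical tensors with concrete shapes for
// compilation. The IDs must match the ones used at op creation time.
dims src0_dims {128, 64};
dims src1_dims {64, 32};
dims dst_dims {128, 32};
std::vector<logical_tensor> inputs {
        logical_tensor(0, data_type::f32, src0_dims, layout_type::strided),
        logical_tensor(1, data_type::f32, src1_dims, layout_type::strided)};
std::vector<logical_tensor> outputs {
        logical_tensor(2, data_type::f32, dst_dims, layout_type::strided)};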
Compile the partition into a compiled partition with the input and output logical tensors.
compiled_partition cp = part.compile(inputs, outputs, eng);
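The inputs_ts and outputs_ts arguments used below are dnnl::graph::tensor objects that bind the logical tensors to real memory buffers. A minimal sketch, assuming the inputs and outputs vectors from the previous step and leaving buffer initialization aside:
// Sketch only: bind data buffers to the logical tensors. get_mem_size()
// reports the number of bytes each tensor requires. If an output used
// layout_type::any, query cp.query_logical_tensor(2) for the actual
// layout first.
std::vector<float> src0_data(inputs[0].get_mem_size() / sizeof(float));
std::vector<float> src1_data(inputs[1].get_mem_size() / sizeof(float));
std::vector<float> dst_data(outputs[0].get_mem_size() / sizeof(float));
std::vector<tensor> inputs_ts {tensor(inputs[0], eng, src0_data.data()),
        tensor(inputs[1], eng, src1_data.data())};
std::vector<tensor> outputs_ts {tensor(outputs[0], eng, dst_data.data())};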
Execute the compiled partition on the specified stream.
cp.execute(strm, inputs_ts, outputs_ts);
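Execution is submitted to the stream; before reading the results back, wait for the stream to finish:
strm.wait();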