struct dnnl::primitive

Intel® oneAPI Deep Neural Network Developer Guide and Reference

ID 768875

Date 6/30/2025

Version

Public

struct dnnl::primitive_attr

Overview

Primitive attributes.

#include <dnnl.hpp>

struct primitive_attr: public dnnl::handle
{
    // construction

    primitive_attr();
    primitive_attr(dnnl_primitive_attr_t attr);

    // methods

    void get_dropout(memory::desc& mask_desc) const;
    void set_dropout(const memory::desc& mask_desc);
    fpmath_mode get_fpmath_mode() const;
    void get_fpmath_mode(fpmath_mode& mode, bool& apply_to_int) const;
    void set_fpmath_mode(fpmath_mode mode, bool apply_to_int = false);
    accumulation_mode get_accumulation_mode() const;
    void set_accumulation_mode(accumulation_mode mode);
    bool get_deterministic() const;
    void set_deterministic(bool value);
    rounding_mode get_rounding_mode(int arg) const;
    void set_rounding_mode(int arg, rounding_mode mode);
    scratchpad_mode get_scratchpad_mode() const;
    void set_scratchpad_mode(scratchpad_mode mode);
    void set_scales_mask(int arg, int mask);

    void set_scales(
        int arg,
        int mask,
        const memory::dims& groups,
        memory::data_type data_type = memory::data_type::f32
        );

    void set_zero_points_mask(int arg, int mask);

    void set_zero_points(
        int arg,
        int mask,
        const memory::dims& groups,
        memory::data_type data_type = memory::data_type::s32
        );

    post_ops get_post_ops() const;
    void set_post_ops(const post_ops& ops);
    void set_rnn_data_qparams(float scale, float shift);
    void get_rnn_data_qparams(float& scale, float& shift);
    void set_rnn_weights_qparams(int mask, const std::vector<float>& scales);
    void get_rnn_weights_qparams(int& mask, std::vector<float>& scales);

    void set_rnn_weights_projection_qparams(
        int mask,
        const std::vector<float>& scales
        );

    void get_rnn_weights_projection_qparams(int& mask, std::vector<float>& scales);
};

Inherited Members

public:
    // methods

    handle<T, traits>& operator = (const handle<T, traits>&);
    handle<T, traits>& operator = (handle<T, traits>&&);
    void reset(T t, bool weak = false);
    T get(bool allow_empty = false) const;
    operator T () const;
    operator bool () const;
    bool operator == (const handle<T, traits>& other) const;
    bool operator != (const handle& other) const;

Detailed Documentation

Primitive attributes.

See also:

Primitive Attributes

Construction

primitive_attr()

Constructs default (empty) primitive attributes.

primitive_attr(dnnl_primitive_attr_t attr)

Creates primitive attributes from a C API dnnl_primitive_attr_t handle.

The resulting handle is not weak and the C handle will be destroyed during the destruction of the C++ object.

Parameters:

attr	The C API primitive attributes.

Methods

void get_dropout(memory::desc& mask_desc) const

Returns the parameters of a dropout attribute.

Parameters:

mask_desc	Output memory descriptor of a dropout mask.

void set_dropout(const memory::desc& mask_desc)

Sets the dropout attribute by specifying the memory descriptor of the dropout mask.

Parameters:

mask_desc	Output memory descriptor of a dropout mask.

fpmath_mode get_fpmath_mode() const

Returns the fpmath mode.

void get_fpmath_mode(fpmath_mode& mode, bool& apply_to_int) const

Returns the fpmath mode.

Parameters:

mode	Specified fpmath mode.
apply_to_int	Use floating-point arithmetic for integer primitives.

void set_fpmath_mode(fpmath_mode mode, bool apply_to_int = false)

Sets fpmath mode.

Parameters:

mode	Specified fpmath mode.
apply_to_int	If true, floating-point arithmetic is used for integer primitives.

accumulation_mode get_accumulation_mode() const

Returns the accumulation mode.

void set_accumulation_mode(accumulation_mode mode)

Sets accumulation mode.

Parameters:

mode	Specified accumulation mode.

bool get_deterministic() const

Returns the deterministic attribute value.

void set_deterministic(bool value)

Sets deterministic attribute value.

Parameters:

value	Specified deterministic mode.

rounding_mode get_rounding_mode(int arg) const

Returns the rounding mode attribute value.

Parameters:

arg	Argument for which rounding mode query applies.

Returns:

The rounding mode applied to the specified argument.

void set_rounding_mode(int arg, rounding_mode mode)

Sets the rounding mode attribute value for a given argument.

Parameters:

arg	Argument for which to set rounding mode.
mode	Rounding mode to apply.

scratchpad_mode get_scratchpad_mode() const

Returns the scratchpad mode.

void set_scratchpad_mode(scratchpad_mode mode)

Sets scratchpad mode.

Parameters:

mode	Specified scratchpad mode.

void set_scales_mask(int arg, int mask)

Sets scaling factors for primitive operations for a given memory argument.

The scaling factors must be passed at execution time as an argument with index DNNL_ARG_ATTR_SCALES | arg.

Parameters:

arg	Parameter argument index as passed to the primitive::execute() call.
mask	Scaling factors correspondence mask that defines the correspondence between the tensor dimensions and the `scales` vector. The set i-th bit indicates that a dedicated scaling factor is used for each index along that dimension. Set the mask to 0 to use a common scaling factor for the whole output tensor.

See also:

dnnl_primitive_attr_set_scales_mask
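
The mask semantics described above can be illustrated without the library. The following is a minimal sketch, assuming only what the parameter description states: bit i of the mask selects tensor dimension i. The helper name masked_dims is hypothetical and is not part of oneDNN.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not a oneDNN function): decodes a correspondence
// mask into the list of tensor dimensions that get a dedicated scaling
// factor per index. Bit i set in `mask` selects dimension i.
inline std::vector<int> masked_dims(int mask, int ndims) {
    // Keep the shift below well defined for a 32-bit int.
    assert(ndims >= 0 && ndims < 31);
    std::vector<int> dims;
    for (int d = 0; d < ndims; ++d) {
        if (mask & (1 << d)) dims.push_back(d);
    }
    return dims;
}
```

For a 4-dimensional tensor, masked_dims(0, 4) returns an empty list (one common scaling factor), while masked_dims(1 << 1, 4) returns only dimension 1, meaning one scaling factor per index along that dimension.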

void set_scales(
    int arg,
    int mask,
    const memory::dims& groups,
    memory::data_type data_type = memory::data_type::f32
    )

Sets scaling factors for primitive operations for a given memory argument.

The scaling factors must be passed at execution time as an argument with index DNNL_ARG_ATTR_SCALES | arg.

Parameters:

arg	Parameter argument index as passed to the primitive::execute() call.
mask	Scales correspondence mask that defines the correspondence between the tensor dimensions and the `scales` vector. The set i-th bit indicates that a dedicated scale is used for each index along that dimension. Set the mask to 0 to use a common scale for the whole output tensor.
groups	Scaling factors correspondence groups that define the correspondence between the tensor dimensions and the scales array. The i-th entry indicates the number of groups of scaling factors used for the i-th logical dimension of the memory indicated by `arg`.
data_type	Scaling factors data_type.

See also:

dnnl_primitive_attr_set_scales

void set_zero_points_mask(int arg, int mask)

Sets zero points for primitive operations for a given memory argument.

The zero points must be passed at execution time as an argument with index DNNL_ARG_ATTR_ZERO_POINTS | arg.

Parameters:

arg	Parameter argument index as passed to the primitive::execute() call.
mask	Zero point correspondence mask that defines the correspondence between the tensor dimensions and the `zero_points` vector. The set i-th bit indicates that a dedicated zero point is used for each index along that dimension. Set the mask to 0 to use a common zero point for the whole output tensor.
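
The execution-time index DNNL_ARG_ATTR_ZERO_POINTS | arg is a bitwise OR of a flag and the plain argument index. The sketch below shows that composition with stand-in constants. The names kArgSrc, kArgWeights, and kArgAttrZeroPoints are hypothetical, and their numeric values are assumptions made for illustration; real code should use the macros from the oneDNN headers.

```cpp
#include <cassert>

// Stand-in constants for this sketch only. The numeric values are
// assumptions; use the oneDNN macros in real code.
constexpr int kArgSrc = 1;                // stands in for DNNL_ARG_SRC
constexpr int kArgWeights = 33;           // stands in for DNNL_ARG_WEIGHTS
constexpr int kArgAttrZeroPoints = 8192;  // stands in for DNNL_ARG_ATTR_ZERO_POINTS

// Hypothetical helper: builds the execution-time argument index under
// which the zero points for `arg` would be passed.
inline int zero_points_arg(int arg) {
    // The flag bit must not already be part of the plain argument index.
    assert((arg & kArgAttrZeroPoints) == 0);
    return kArgAttrZeroPoints | arg;
}
```

Because the flag occupies a bit that the plain argument indices do not use, the zero points of different arguments map to distinct keys in the execution argument map.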

void set_zero_points(
    int arg,
    int mask,
    const memory::dims& groups,
    memory::data_type data_type = memory::data_type::s32
    )

Sets zero points for primitive operations for a given memory argument.

The zero points must be passed at execution time as an argument with index DNNL_ARG_ATTR_ZERO_POINTS | arg.

Parameters:

arg	Parameter argument index as passed to the primitive::execute() call.
mask	Zero point correspondence mask that defines the correspondence between the tensor dimensions and the `zero_points` vector. The set i-th bit indicates that a dedicated zero point is used for each index along that dimension. Set the mask to 0 to use a common zero point for the whole output tensor.
groups	Zero point correspondence groups that define the correspondence between the tensor dimensions and the zero_points array. The i-th entry indicates the number of groups of zero points used for the i-th logical dimension of the memory indicated by `arg`.
data_type	Zero point factors data_type.

See also:

dnnl_primitive_attr_set_zero_points

post_ops get_post_ops() const

Returns post-ops previously set via set_post_ops().

Returns:

Post-ops.

void set_post_ops(const post_ops& ops)

Sets post-ops.

NOTE:

There is no way to check whether the post-ops would be supported by the target primitive. Any error will be reported by the respective primitive descriptor constructor.

Parameters:

ops	Post-ops object to copy post-ops from.

void set_rnn_data_qparams(float scale, float shift)

Sets quantization scale and shift parameters for RNN data tensors.

For performance reasons, the low-precision configuration of the RNN primitives expects input activations to have the unsigned 8-bit integer data type. The scale and shift parameters are used to quantize floating-point data to unsigned integer and must be passed to the RNN primitive using attributes.

The quantization formula is scale * data + shift.

Example usage:

// RNN parameters
int l = 2, t = 2, mb = 32, sic = 32, slc = 32, dic = 32, dlc = 32;
// Activations quantization parameters
float scale = 63.f, shift = 64.f;

primitive_attr attr;

// Set scale and shift for int8 quantization of activation
attr.set_rnn_data_qparams(scale, shift);

// Create an RNN primitive descriptor.
vanilla_rnn_forward::primitive_desc rnn_d(
        engine, /* arguments */, attr);

NOTE:

Quantization scale and shift are common for src_layer, src_iter, dst_iter, and dst_layer.

Parameters:

scale	The value to scale the data by.
shift	The value to shift the data by.
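
The quantization formula scale * data + shift can be tried out on single values. This is a minimal sketch, not the library's implementation: the helper names quantize_u8 and dequantize_u8 are hypothetical, and rounding to nearest with saturation to [0, 255] is an assumption of the sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper: quantizes one activation with
// q = scale * data + shift. Rounding to nearest and saturating to the
// unsigned 8-bit range are assumptions of this sketch.
inline std::uint8_t quantize_u8(float data, float scale, float shift) {
    assert(scale != 0.f);
    float q = std::round(scale * data + shift);
    q = std::min(255.f, std::max(0.f, q));
    return static_cast<std::uint8_t>(q);
}

// Hypothetical helper: inverse mapping, data = (q - shift) / scale.
inline float dequantize_u8(std::uint8_t q, float scale, float shift) {
    assert(scale != 0.f);
    return (static_cast<float>(q) - shift) / scale;
}
```

With scale = 63 and shift = 64, as in the example usage, activations in [-1, 1] map to the range [1, 127].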

void get_rnn_data_qparams(float& scale, float& shift)

Returns the quantization scale and shift parameters for RNN data tensors.

NOTE:

Quantization scale and shift are common for src_layer, src_iter, dst_iter, and dst_layer.

Parameters:

scale	The value to scale the data by.
shift	The value to shift the data by.

void set_rnn_weights_qparams(int mask, const std::vector<float>& scales)

Sets quantization scaling factors for RNN weights tensors.

The low-precision configuration of the RNN primitives expects input weights to use the signed 8-bit integer data type. The scaling factors are used to quantize floating-point data to signed integer and must be passed to RNN primitives using attributes.

NOTE:

The dimension order is always native and does not depend on the actual layout used. For example, five-dimensional weights always have (l, d, i, g, o) logical dimension ordering.

NOTE:

Quantization scales are common for weights_layer and weights_iteration.

Parameters:

mask	Scaling factors correspondence mask that defines the correspondence between the output tensor dimensions and the `scales` vector. The set i-th bit indicates that a dedicated scaling factor should be used for each index along that dimension. Set the mask to 0 to use a common scaling factor for the whole output tensor.
scales	Constant vector of output scaling factors. The following equality must hold: $scales.size() = \prod_{d \in mask} weights.dims[d]$. Violations can only be detected when the attributes are used to create a primitive descriptor.
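
Because a mismatch between mask and scales is only reported when the primitive descriptor is created, it can help to check the size beforehand. The sketch below assumes that the required number of scales is the product of the weights dimensions selected by the mask. The helpers expected_scales and scales_match_mask are hypothetical and are not part of oneDNN.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: number of scaling factors implied by `mask`,
// taken as the product of the weights dimensions whose bit is set.
inline std::int64_t expected_scales(
        const std::vector<std::int64_t>& weights_dims, int mask) {
    // Keep the shift below well defined for a 32-bit int.
    assert(weights_dims.size() < 31);
    std::int64_t n = 1;
    for (std::size_t d = 0; d < weights_dims.size(); ++d) {
        if (mask & (1 << d)) n *= weights_dims[d];
    }
    return n;
}

// Hypothetical helper: true if `scales` has the size implied by `mask`.
inline bool scales_match_mask(const std::vector<std::int64_t>& weights_dims,
        int mask, const std::vector<float>& scales) {
    return static_cast<std::int64_t>(scales.size())
            == expected_scales(weights_dims, mask);
}
```

For weights with logical dimensions (l, d, i, g, o) = (1, 1, 32, 4, 32), a mask of (1 << 3) | (1 << 4) selects g and o and implies 4 * 32 = 128 scaling factors, while a mask of 0 implies a single one.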

void get_rnn_weights_qparams(int& mask, std::vector<float>& scales)

Returns the quantization scaling factors for RNN weights tensors.

NOTE:

The dimension order is always native and does not depend on the actual layout used. For example, five-dimensional weights always have (l, d, i, g, o) logical dimension ordering.

Parameters:

mask	Scaling factors correspondence mask that defines the correspondence between the output tensor dimensions and the `scales` vector. The set i-th bit indicates that a dedicated scaling factor is used for each index along that dimension. A mask of 0 indicates a common scaling factor for the whole output tensor.
scales	Vector of scaling factors returned by the call.

void set_rnn_weights_projection_qparams(
    int mask,
    const std::vector<float>& scales
    )

Sets quantization scaling factors for RNN projection weights tensors.

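The low-precision configuration of the RNN primitives expects input weights to use the signed 8-bit integer data type. The scaling factors are used to quantize floating-point data to signed integer and must be passed to RNN primitives using attributes.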

NOTE:

The dimension order is always native and does not depend on the actual layout used. For example, five-dimensional weights always have (l, d, i, g, o) logical dimension ordering.

NOTE:

Quantization scales are common for weights_layer and weights_iteration.

Parameters:

mask	Scaling factors correspondence mask that defines the correspondence between the output tensor dimensions and the `scales` vector. The set i-th bit indicates that a dedicated scaling factor should be used for each index along that dimension. Set the mask to 0 to use a common scaling factor for the whole output tensor.
scales	Constant vector of output scaling factors. The following equality must hold: $scales.size() = \prod_{d \in mask} weights\_projection.dims[d]$. Violations can only be detected when the attributes are used to create a primitive descriptor.

void get_rnn_weights_projection_qparams(int& mask, std::vector<float>& scales)

Returns the quantization scaling factors for RNN projection weights tensors.

NOTE:

The dimension order is always native and does not depend on the actual layout used. For example, five-dimensional weights always have (l, d, i, g, o) logical dimension ordering.

Parameters:

mask	Scaling factors correspondence mask that defines the correspondence between the output tensor dimensions and the `scales` vector. The set i-th bit indicates that a dedicated scaling factor is used for each index along that dimension. A mask of 0 indicates a common scaling factor for the whole output tensor.
scales	Vector of scaling factors returned by the call.
