enum dnnl_normalization_flags_t
Overview
Flags for normalization primitives.
#include <dnnl_types.h>

enum dnnl_normalization_flags_t {
    dnnl_normalization_flags_none = 0x0U,
    dnnl_use_global_stats = 0x1U,
    dnnl_use_scaleshift = 0x2U,
    dnnl_fuse_norm_relu = 0x4U,
    dnnl_use_scale = 0x8U,
    dnnl_use_shift = 0x10U,
};
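Each non-zero enumerator above occupies a distinct bit, which suggests that flags are meant to be combined with bitwise OR and tested with bitwise AND. The following is a minimal sketch of that pattern. The `example_` names are hypothetical local stand-ins that mirror the values listed above so the snippet compiles without oneDNN headers; real code would include `dnnl_types.h` and use the `dnnl_*` names directly.

```c
#include <assert.h>

/* Local stand-ins mirroring the enumerator values listed above.
 * Assumption: these are illustrative copies, not the oneDNN definitions. */
enum example_normalization_flags {
    example_flags_none = 0x0U,
    example_use_global_stats = 0x1U,
    example_use_scaleshift = 0x2U,
    example_fuse_norm_relu = 0x4U,
    example_use_scale = 0x8U,
    example_use_shift = 0x10U
};

/* Returns 1 if every bit of `flag` is set in `flags`, 0 otherwise. */
int example_has_flag(unsigned flags, unsigned flag) {
    assert(flag != 0U); /* a zero flag has no bits to test */
    return (flags & flag) == flag;
}

/* Builds a mask that requests scale, shift, and fused ReLU together. */
unsigned example_scale_shift_relu(void) {
    return example_use_scale | example_use_shift | example_fuse_norm_relu;
}
```

Because the "none" value is zero, it cannot be detected with a bitwise AND; compare the whole mask against zero instead.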
Detailed Documentation
Flags for normalization primitives.
Enum Values
dnnl_normalization_flags_none
Use no normalization flags.
If specified:
- on forward training propagation, mean and variance are computed and
stored as output
- on backward propagation, the full derivative with respect to data is
computed
- on backward propagation, prop_kind == #dnnl_backward_data has the same
behavior as prop_kind == #dnnl_backward
dnnl_use_global_stats
Use global statistics.
If specified:
- on forward propagation, use the mean and variance provided by the user
(input)
- on backward propagation, reduce the amount of computation, since
mean and variance are treated as constants
If not specified:
- on forward propagation, mean and variance are computed and stored as
output
- on backward propagation, the full derivative with respect to data is
computed
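The input-versus-output role of mean and variance described above can be illustrated with a small sketch. `example_resolve_stats` is a hypothetical helper, not a oneDNN function: it handles a single channel of `n` values and uses the biased (population) variance, which is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of oneDNN.
 * If use_global_stats is nonzero, *mean and *variance are inputs and are
 * left untouched. Otherwise they are computed from x and written back as
 * outputs. */
void example_resolve_stats(const float *x, size_t n, int use_global_stats,
                           float *mean, float *variance) {
    assert(x != NULL && mean != NULL && variance != NULL && n > 0);
    if (use_global_stats) {
        return; /* statistics were supplied by the caller */
    }
    float sum = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        sum += x[i];
    }
    *mean = sum / (float)n;

    float sq = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        const float d = x[i] - *mean;
        sq += d * d;
    }
    *variance = sq / (float)n; /* biased (population) variance */
}
```

With the flag set, the statistics do not depend on the current input, which is why the backward pass can treat them as constants.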
dnnl_use_scaleshift
Use scale and shift parameters.
If specified:
- on forward propagation, use scale and shift (aka scale and bias) for
the normalization results
- on backward propagation (for prop_kind == #dnnl_backward), compute
diff with respect to scale and shift (hence one extra output is used)
If not specified:
- on backward propagation, prop_kind == #dnnl_backward_data has the
same behavior as prop_kind == #dnnl_backward
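As a rough illustration of the extra backward output, the sketch below computes the gradients for scale and shift using the textbook formulas for y = scale * x_hat + shift over a single channel. `example_diff_scale_shift` and its parameter names are hypothetical; this is not a description of how oneDNN implements the computation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of oneDNN.
 * diff_dst: gradient of the loss with respect to the output y
 * x_hat:    normalized input, (x - mean) / sqrt(variance + epsilon)
 * For y = scale * x_hat + shift:
 *   diff_scale = sum(diff_dst[i] * x_hat[i])
 *   diff_shift = sum(diff_dst[i]) */
void example_diff_scale_shift(const float *diff_dst, const float *x_hat,
                              size_t n, float *diff_scale,
                              float *diff_shift) {
    assert(diff_dst != NULL && x_hat != NULL);
    assert(diff_scale != NULL && diff_shift != NULL);
    float ds = 0.0f;
    float db = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        ds += diff_dst[i] * x_hat[i];
        db += diff_dst[i];
    }
    *diff_scale = ds;
    *diff_shift = db;
}
```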
dnnl_fuse_norm_relu
Fuse with ReLU.
The flag implies a negative slope of 0. For training, this is the only
supported configuration. For inference, to use a non-zero negative slope,
consider using post-ops (@ref dev_guide_attributes_post_ops).
If specified:
- on inference, this option behaves the same as if the primitive were
fused with ReLU using the post-ops API with zero negative slope
- on training, the primitive requires a workspace (needed to perform
the backward pass)
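To see why training needs a workspace, consider the sketch below: the forward pass records which elements survived the ReLU, and the backward pass uses that record to zero the matching gradients. The byte-per-element mask and the `example_` functions are assumptions made for illustration; the text above does not specify the actual workspace layout.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical forward step: ReLU with zero negative slope, applied in
 * place. mask[i] is set to 1 where the value was kept, 0 where it was
 * clamped to zero. */
void example_relu_forward(float *y, unsigned char *mask, size_t n) {
    assert(y != NULL && mask != NULL);
    for (size_t i = 0; i < n; ++i) {
        mask[i] = (unsigned char)(y[i] > 0.0f);
        if (!mask[i]) {
            y[i] = 0.0f;
        }
    }
}

/* Hypothetical backward step: the gradient passes through only where the
 * forward pass kept the value. */
void example_relu_backward(float *diff, const unsigned char *mask,
                           size_t n) {
    assert(diff != NULL && mask != NULL);
    for (size_t i = 0; i < n; ++i) {
        if (!mask[i]) {
            diff[i] = 0.0f;
        }
    }
}
```

Inference has no backward pass, so no such record is needed there.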
dnnl_use_scale
Use scale parameter.
If specified:
- on forward propagation, use scale for the normalization results
- on backward propagation (for prop_kind == #dnnl_backward), compute
diff with respect to scale (hence one extra output is used)
dnnl_use_shift
Use shift parameter.
If specified:
- on forward propagation, use shift (aka bias) for the normalization
results
- on backward propagation (for prop_kind == #dnnl_backward), compute
diff with respect to shift (hence one extra output is used)
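The separate scale and shift flags can be pictured as two independent switches on the forward result. The sketch below applies each parameter only when its bit is set. The `EXAMPLE_` macros are hypothetical local copies of the values listed for dnnl_use_scale and dnnl_use_shift, and `example_apply_scale_shift` is an illustrative helper rather than a oneDNN function.

```c
#include <assert.h>

/* Local stand-ins for the dnnl_use_scale and dnnl_use_shift values
 * (assumption: real code would use the names from dnnl_types.h). */
#define EXAMPLE_USE_SCALE 0x8U
#define EXAMPLE_USE_SHIFT 0x10U

/* Hypothetical helper: applies scale and/or shift to one normalized value
 * depending on which bits are set in `flags`. */
float example_apply_scale_shift(float x_hat, unsigned flags, float scale,
                                float shift) {
    /* This sketch only understands the two bits defined above. */
    assert((flags & ~(EXAMPLE_USE_SCALE | EXAMPLE_USE_SHIFT)) == 0U);
    float y = x_hat;
    if (flags & EXAMPLE_USE_SCALE) {
        y *= scale;
    }
    if (flags & EXAMPLE_USE_SHIFT) {
        y += shift;
    }
    return y;
}
```

Passing both bits yields scale * x_hat + shift, while passing neither returns the normalized value unchanged.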