Developer Guide and Reference

  • 2022.1
  • 04/11/2022

Shuffle

General

The shuffle primitive shuffles data along the shuffle axis (designated here as $C$) with the group parameter $G$. Namely, the shuffle axis is viewed as a 2D tensor of size $(\frac{C}{G} \times G)$, which is transposed to $(G \times \frac{C}{G})$. Variable names follow the standard Naming Conventions.
The formal definition is shown below:
Forward
$$\operatorname{dst}(\overline{ou}, c, \overline{in}) = \operatorname{src}(\overline{ou}, c', \overline{in})$$
where
  • the $c$ dimension is called the shuffle axis,
  • $G$ is the group_size,
  • $\overline{ou}$ denotes the outermost indices (to the left of the shuffle axis),
  • $\overline{in}$ denotes the innermost indices (to the right of the shuffle axis), and
  • $c'$ and $c$ relate to each other as defined by the system:
    $$\begin{cases} c = u + v \frac{C}{G}, \\ c' = u G + v. \end{cases}$$
Here, $0 \leq u < \frac{C}{G}$ and $0 \leq v < G$.
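The definition above can be illustrated with a small reference loop. This is a minimal sketch in plain C++, not the library's implementation: the function name shuffle_ref and the dense [outer][C][inner] view are assumptions made for illustration, where outer is the product of the dimensions to the left of the shuffle axis and inner is the product of those to the right.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reference shuffle over a dense buffer viewed as [outer][C][inner].
// Implements dst(ou, c, in) = src(ou, c', in) with
//   c  = u + v * (C / G)
//   c' = u * G + v
// for 0 <= u < C / G and 0 <= v < G.
std::vector<float> shuffle_ref(const std::vector<float>& src,
                               std::size_t outer, std::size_t C,
                               std::size_t inner, std::size_t G) {
    assert(G > 0 && C % G == 0);              // group size must divide the axis
    assert(src.size() == outer * C * inner);  // buffer must match the view
    const std::size_t K = C / G;              // rows of the (C/G x G) view
    std::vector<float> dst(src.size());
    for (std::size_t ou = 0; ou < outer; ++ou) {
        for (std::size_t u = 0; u < K; ++u) {
            for (std::size_t v = 0; v < G; ++v) {
                const std::size_t c = u + v * K;      // destination index on the axis
                const std::size_t c_src = u * G + v;  // source index c'
                for (std::size_t in = 0; in < inner; ++in) {
                    dst[(ou * C + c) * inner + in] =
                        src[(ou * C + c_src) * inner + in];
                }
            }
        }
    }
    return dst;
}
```

For example, with C = 6 and G = 3 on a single row, the input {0, 1, 2, 3, 4, 5} is read as a 2 x 3 matrix and written out as its 3 x 2 transpose, giving {0, 3, 1, 4, 2, 5}.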
Difference Between Forward Training and Forward Inference
There is no difference between the dnnl_forward_training and dnnl_forward_inference propagation kinds.
Backward
The backward propagation computes $\operatorname{diff\_src}(\overline{ou}, c, \overline{in})$, based on $\operatorname{diff\_dst}(\overline{ou}, c, \overline{in})$.
Essentially, backward propagation is the same as forward propagation with $G$ replaced by $C / G$.
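The relationship between the two directions can be checked on a one-dimensional channel vector. The sketch below is illustrative only: shuffle_channels and shuffle_channels_backward are hypothetical helper names, not library functions. The backward helper simply calls the forward one with the group size swapped to C / G, and a forward pass followed by a backward pass returns the original order.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical forward shuffle of a single channel vector of length C:
// dst[u + v * (C / G)] = src[u * G + v].
std::vector<int> shuffle_channels(const std::vector<int>& src, std::size_t G) {
    const std::size_t C = src.size();
    assert(G > 0 && C % G == 0);
    const std::size_t K = C / G;
    std::vector<int> dst(C);
    for (std::size_t u = 0; u < K; ++u)
        for (std::size_t v = 0; v < G; ++v)
            dst[u + v * K] = src[u * G + v];
    return dst;
}

// Backward pass: the same permutation with group size C / G,
// which undoes the forward shuffle with group size G.
std::vector<int> shuffle_channels_backward(const std::vector<int>& diff_dst,
                                           std::size_t G) {
    assert(G > 0 && diff_dst.size() % G == 0);
    return shuffle_channels(diff_dst, diff_dst.size() / G);
}
```

With C = 6 and G = 3, the forward shuffle maps {0, 1, 2, 3, 4, 5} to {0, 3, 1, 4, 2, 5}, and the backward helper maps that result back to {0, 1, 2, 3, 4, 5}.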

Execution Arguments

When executed, the inputs and outputs should be mapped to an execution argument index as specified by the following table.
| Primitive input/output | Execution argument index |
| --- | --- |
| src | DNNL_ARG_SRC |
| dst | DNNL_ARG_DST |
| diff_src | DNNL_ARG_DIFF_SRC |
| diff_dst | DNNL_ARG_DIFF_DST |

Implementation Details

General Notes
  1. The memory format and data type for src and dst are assumed to be the same, and in the API they are typically referred to as data (e.g., see data_desc in dnnl::shuffle_forward::desc::desc()). The same holds for diff_src and diff_dst. The corresponding memory descriptors are referred to as diff_data_desc.

Data Types

The shuffle primitive supports the following combinations of data types:
| Propagation | Source / Destination |
| --- | --- |
| forward / backward | f32, bf16 |
| forward | s32, s8, u8 |
There might be hardware and/or implementation specific restrictions. Check the Implementation Limitations section below.

Data Layouts

The shuffle primitive works with arbitrary data tensors. There is no special meaning associated with any logical dimensions. However, the shuffle axis is typically referred to as channels (hence the formulas use $C$).
Shuffle operations typically appear in CNN topologies. Hence, in the library the shuffle primitive is optimized for the corresponding memory formats:
| Spatial | Logical tensor | Shuffle Axis | Implementations optimized for memory formats |
| --- | --- | --- | --- |
| 2D | NCHW | 1 (C) | dnnl_nchw ( dnnl_abcd ), dnnl_nhwc ( dnnl_acdb ), optimized^ |
| 3D | NCDHW | 1 (C) | |
Here optimized^ means the format that comes out of any preceding compute-intensive primitive.
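To make the role of the memory format concrete, the sketch below maps a logical NCHW tensor with shuffle axis 1 onto a dense [outer][C][inner] view for the two plain formats listed above. It is an illustration under stated assumptions, not a description of the library's internals: ShuffleView, view_nchw, view_nhwc, and offset are hypothetical names, and both formats are assumed to be dense with no padding.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper type: a dense buffer viewed as [outer][C][inner].
struct ShuffleView {
    std::size_t outer, C, inner;
};

// nchw (abcd): channels are followed by H * W contiguous spatial elements,
// so a shuffle moves whole blocks of H * W elements.
ShuffleView view_nchw(std::size_t N, std::size_t C, std::size_t H, std::size_t W) {
    return {N, C, H * W};
}

// nhwc (acdb): channels are innermost, so a shuffle permutes elements
// within each contiguous run of C values.
ShuffleView view_nhwc(std::size_t N, std::size_t C, std::size_t H, std::size_t W) {
    return {N * H * W, C, 1};
}

// Physical offset of element (ou, c, in) in the dense view.
std::size_t offset(const ShuffleView& v, std::size_t ou, std::size_t c, std::size_t in) {
    assert(ou < v.outer && c < v.C && in < v.inner);
    return (ou * v.C + c) * v.inner + in;
}
```

For a 2 x 6 x 4 x 5 tensor, the nchw view has outer = 2 and inner = 20, while the nhwc view has outer = 40 and inner = 1. The logical shuffle is the same in both cases; only the physical stride of the channel index changes.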
Post-Ops and Attributes
The shuffle primitive does not support any post-ops or attributes.

Implementation Limitations

  1. Refer to Data Types for limitations related to data types support.

Performance Tips

N/A

Example

This C++ API example demonstrates how to create and execute a Shuffle primitive.
Key optimizations included in this example:
  • Shuffle along axis 1 (channels).
