Developer Guide and Reference

  • 2022.1
  • 04/11/2022

LogSoftmax

General

This functionality is deprecated and will be removed in future releases.
The logsoftmax primitive performs softmax along a particular axis on data with arbitrary dimensions, followed by the logarithm function. All other axes are treated as independent (batch).
Forward
In general form, the operation is defined by the following formulas (the variable names follow the standard Naming Conventions). The second form is used because it is more numerically stable:

\[ \mathrm{dst}(\overline{ou}, c, \overline{in}) = \ln\left( \frac{e^{\mathrm{src}(\overline{ou}, c, \overline{in})}}{\sum_{ic} e^{\mathrm{src}(\overline{ou}, ic, \overline{in})}} \right) = \left( \mathrm{src}(\overline{ou}, c, \overline{in}) - \nu(\overline{ou}, \overline{in}) \right) - \ln\left( \sum_{ic} e^{\mathrm{src}(\overline{ou}, ic, \overline{in}) - \nu(\overline{ou}, \overline{in})} \right) \]

where
  • \(c\) is the axis over which the logsoftmax computation is performed,
  • \(\overline{ou}\) is the outermost index (to the left of the logsoftmax axis),
  • \(\overline{in}\) is the innermost index (to the right of the logsoftmax axis), and
  • \(\nu(\overline{ou}, \overline{in})\) is used to produce more accurate results and is defined as:
    \(\nu(\overline{ou}, \overline{in}) = \max_{ic} \mathrm{src}(\overline{ou}, ic, \overline{in})\)
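As an illustrative aside (not part of the original text), a minimal scalar C++ sketch of the numerically stable form for a single slice along the logsoftmax axis looks as follows; the primitive applies the same computation independently for every \((\overline{ou}, \overline{in})\) pair:

    #include <algorithm>
    #include <cmath>
    #include <cstddef>
    #include <vector>

    // Numerically stable logsoftmax over one slice along the axis:
    // dst[c] = (src[c] - nu) - ln(sum_ic exp(src[ic] - nu)), nu = max(src)
    std::vector<float> logsoftmax_slice(const std::vector<float> &src) {
        const float nu = *std::max_element(src.begin(), src.end());
        float sum = 0.f;
        for (float x : src)
            sum += std::exp(x - nu); // shifted exponentials cannot overflow
        const float log_sum = std::log(sum);
        std::vector<float> dst(src.size());
        for (std::size_t c = 0; c < src.size(); ++c)
            dst[c] = (src[c] - nu) - log_sum;
        return dst;
    }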
Difference Between Forward Training and Forward Inference
There is no difference between the dnnl_forward_training and dnnl_forward_inference propagation kinds.
Backward
The backward propagation computes \(\mathrm{diff\_src}(\overline{ou}, c, \overline{in})\) based on \(\mathrm{diff\_dst}(\overline{ou}, c, \overline{in})\) and \(\mathrm{dst}(\overline{ou}, c, \overline{in})\).
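For reference, the standard logsoftmax gradient (this expression is not spelled out in the original text, but it is what the computation amounts to) is:

\[ \mathrm{diff\_src}(\overline{ou}, c, \overline{in}) = \mathrm{diff\_dst}(\overline{ou}, c, \overline{in}) - e^{\mathrm{dst}(\overline{ou}, c, \overline{in})} \sum_{ic} \mathrm{diff\_dst}(\overline{ou}, ic, \overline{in}) \]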

Execution Arguments

When executed, the inputs and outputs should be mapped to an execution argument index as specified by the following table.
Primitive input/output    Execution argument index
src                       DNNL_ARG_SRC
dst                       DNNL_ARG_DST
diff_src                  DNNL_ARG_DIFF_SRC
diff_dst                  DNNL_ARG_DIFF_DST
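As a sketch of how this mapping appears in the C++ API (the function and variable names here are placeholders, not from the original text):

    #include <unordered_map>
    #include "dnnl.hpp"

    // Sketch: executes a forward logsoftmax primitive, mapping each memory
    // object to its execution argument index.
    void execute_forward(dnnl::stream &strm, dnnl::logsoftmax_forward &prim,
            dnnl::memory &src_mem, dnnl::memory &dst_mem) {
        prim.execute(strm,
                {{DNNL_ARG_SRC, src_mem}, // source tensor
                 {DNNL_ARG_DST, dst_mem}}); // destination tensor
        strm.wait(); // block until the computation completes
    }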

Implementation Details

General Notes
  1. Both forward and backward propagation support in-place operations, meaning that src can be used as input and output for forward propagation, and diff_dst can be used as input and output for backward propagation. In the case of an in-place operation, the original data will be overwritten. An in-place backward call is sketched below.
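As an illustration (not part of the original text; all names are placeholders), an in-place backward execution passes the same memory object for both the incoming gradient (diff_dst) and the outgoing gradient (diff_src):

    #include "dnnl.hpp"

    // Sketch: in-place backward logsoftmax. diff_mem serves as both the
    // input gradient (DNNL_ARG_DIFF_DST) and the output gradient
    // (DNNL_ARG_DIFF_SRC), so the incoming values are overwritten.
    void backward_in_place(dnnl::stream &strm, dnnl::logsoftmax_backward &prim,
            dnnl::memory &dst_mem, dnnl::memory &diff_mem) {
        prim.execute(strm,
                {{DNNL_ARG_DST, dst_mem}, // dst saved from the forward pass
                 {DNNL_ARG_DIFF_DST, diff_mem},
                 {DNNL_ARG_DIFF_SRC, diff_mem}});
        strm.wait();
    }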
Post-Ops and Attributes
The logsoftmax primitive does not support any post-ops or attributes.
Data Type Support
The logsoftmax primitive supports the following combinations of data types:
Propagation           Source / Destination
forward / backward    bf16, f32
Data Representation
Source, Destination, and Their Gradients
The logsoftmax primitive works with arbitrary data tensors. There is no special meaning associated with any logical dimensions. However, the logsoftmax axis is typically referred to as channels (hence in the formulas we use \(c\)).

Implementation Limitations

  1. No primitive specific limitations. Refer to Data Types for limitations related to data type support.
  2. GPU
    • No support.

Performance Tips

  1. Use in-place operations whenever possible.
  2. Currently the logsoftmax primitive is optimized for the cases where the dimension of the softmax axis is physically dense (see the sketch after this list). For instance:
    • Optimized: 2D case, tensor \(A \times B\), softmax axis 1 (B), format tag dnnl_ab
    • Optimized: 4D case, tensor \(A \times B \times C \times D\), softmax axis 3 (D), format tag dnnl_abcd
    • Optimized: 4D case, tensor \(A \times B \times C \times D\), softmax axis 1 (B), format tag dnnl_abcd, and \(C = D = 1\)
    • Optimized: 4D case, tensor \(A \times B \times C \times D\), softmax axis 1 (B), format tag dnnl_acdb or dnnl_aBcd16b, and \(C \cdot D \ne 1\)
    • Non-optimized: 2D case, tensor \(A \times B\), softmax axis 0 (A), format tag dnnl_ab, and \(A \ne 1\)
    • Non-optimized: 2D case, tensor \(A \times B\), softmax axis 1 (B), format tag dnnl_ba, and \(B \ne 1\)
    • Non-optimized: 4D case, tensor \(A \times B \times C \times D\), softmax axis 2 (C), format tag dnnl_acdb, and \(C \ne 1\)
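For illustration (not part of the original text; the dimensions are arbitrary example values), the first optimized case above corresponds to a memory descriptor along these lines:

    #include "dnnl.hpp"

    // Sketch: a 2D A x B tensor in plain row-major layout (dnnl_ab), which
    // keeps softmax axis 1 (B) physically dense -- one of the optimized
    // cases listed above.
    dnnl::memory::desc make_dense_axis_md() {
        const dnnl::memory::dim A = 64, B = 1000;
        return dnnl::memory::desc({A, B}, dnnl::memory::data_type::f32,
                dnnl::memory::format_tag::ab);
    }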

Example

This C++ API example demonstrates how to create and execute a Logsoftmax primitive in forward training propagation mode.
Key optimizations included in this example:
  • In-place primitive execution;
  • Softmax along axis 1 (C) for 2D tensors.
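A condensed sketch of such an example is shown below. It follows the public oneDNN v2.x C++ API; the tensor sizes and fill values are illustrative only, not taken from the original example.

    #include <cmath>
    #include "dnnl.hpp"

    using namespace dnnl;

    int main() {
        // CPU engine and stream.
        engine eng(engine::kind::cpu, 0);
        stream strm(eng);

        // 2D source tensor N x C; logsoftmax along axis 1 (C).
        const memory::dim N = 3, C = 1000;
        const int axis = 1;
        auto src_md = memory::desc({N, C}, memory::data_type::f32,
                memory::format_tag::ab);
        auto src_mem = memory(src_md, eng);

        // Fill the source with sample data (CPU memory is host-accessible).
        float *src_data = static_cast<float *>(src_mem.get_data_handle());
        for (memory::dim i = 0; i < N * C; ++i)
            src_data[i] = std::cos(i / 10.f);

        // Create the logsoftmax primitive in forward training mode.
        auto lsm_desc = logsoftmax_forward::desc(
                prop_kind::forward_training, src_md, axis);
        auto lsm_pd = logsoftmax_forward::primitive_desc(lsm_desc, eng);
        auto lsm_prim = logsoftmax_forward(lsm_pd);

        // In-place execution: src_mem is passed as both SRC and DST, so the
        // source data is overwritten with the result.
        lsm_prim.execute(strm,
                {{DNNL_ARG_SRC, src_mem}, {DNNL_ARG_DST, src_mem}});
        strm.wait();

        return 0;
    }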
