Intel® oneAPI Deep Neural Network Developer Guide and Reference
LayerNormBackward
General
LayerNormBackward performs the backward pass of the LayerNorm operation.
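The backward propagation computes $\mathrm{diff\_src}(t, n, c)$, $\mathrm{diff\_\gamma}(c)^*$, and $\mathrm{diff\_\beta}(c)^*$ based on $\mathrm{diff\_dst}(t, n, c)$, $\mathrm{src}(t, n, c)$, $\mu(t, n)$, $\sigma^2(t, n)$, $\gamma(c)^*$, and $\beta(c)^*$.
The tensors marked with an asterisk are used only when the operation is configured to use $\gamma(c)$ and $\beta(c)$.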
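For orientation, the quantities named above can be related by the standard layer normalization gradient. The block below is a sketch under stated assumptions, not a statement of how the library computes them: normalization is over the last axis of size $C$, the variance is the biased (divide by $C$) estimate, $\mu$ and $\sigma^2$ are the `mean` and `variance` inputs, $\gamma$ and $\beta$ are the `gamma` and `beta` inputs, and $\varepsilon$ is the numerical-stability constant. The symbols $\hat{x}$, $g$, and $C$ are introduced here for illustration only.

```latex
\begin{aligned}
\hat{x}(t,n,c) &= \frac{\mathrm{src}(t,n,c) - \mu(t,n)}{\sqrt{\sigma^2(t,n) + \varepsilon}} \\
\mathrm{diff\_\beta}(c) &= \sum_{t,n} \mathrm{diff\_dst}(t,n,c) \\
\mathrm{diff\_\gamma}(c) &= \sum_{t,n} \mathrm{diff\_dst}(t,n,c)\,\hat{x}(t,n,c) \\
g(t,n,c) &= \gamma(c)\,\mathrm{diff\_dst}(t,n,c) \\
\mathrm{diff\_src}(t,n,c) &= \frac{1}{\sqrt{\sigma^2(t,n) + \varepsilon}}
  \left( g(t,n,c) - \frac{1}{C}\sum_{c'} g(t,n,c')
  - \hat{x}(t,n,c)\,\frac{1}{C}\sum_{c'} g(t,n,c')\,\hat{x}(t,n,c') \right)
\end{aligned}
```

When the affine parameters are not used, the same expressions apply with $\gamma(c) = 1$, and $\mathrm{diff\_\gamma}$ and $\mathrm{diff\_\beta}$ are not produced.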
Operation attributes
| Attribute Name | Description | Value Type | Supported Values | Required or Optional | 
|---|---|---|---|---|
| begin_norm_axis | Indicates the axis at which layer normalization starts. Normalization runs from begin_norm_axis to the last dimension. Negative values index from right to left. By default, this op normalizes over the last dimension only, e.g. C in TNC for 3D and LDNC for 4D. | s64 | [-r, r-1], where r = rank(src); -1 (default) | Optional |
| use_affine | When set to true, the operation has learnable per-element affine parameters. | bool | false, true (default) | Optional |
| epsilon | The constant used to improve numerical stability. | f32 | Arbitrary positive f32 value, 1e-5 (default) | Optional |
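
The begin_norm_axis rule in the table can be illustrated with a small sketch. The helper `normalized_axes` is a hypothetical name used only for this example and is not part of the library; it simply applies the documented range `[-r, r-1]` and the "from begin_norm_axis to last dimension" rule to a shape tuple.

```python
def normalized_axes(src_shape, begin_norm_axis=-1):
    """Return the list of axes that layer normalization reduces over.

    Follows the attribute description: valid values lie in [-r, r-1],
    negative values count from the right, and normalization runs from
    begin_norm_axis through the last dimension.
    """
    rank = len(src_shape)
    if not -rank <= begin_norm_axis <= rank - 1:
        raise ValueError(
            f"begin_norm_axis must be in [{-rank}, {rank - 1}], "
            f"got {begin_norm_axis}"
        )
    # Map a negative index to its non-negative equivalent.
    start = begin_norm_axis % rank
    return list(range(start, rank))


# Default (-1): only the last dimension, C in a TNC tensor.
print(normalized_axes((8, 4, 16)))      # [2]
# begin_norm_axis=1 on the same tensor normalizes over N and C.
print(normalized_axes((8, 4, 16), 1))   # [1, 2]
```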
Execution arguments
The inputs and outputs must be provided in the index order below when constructing an operation.
Inputs
| Index | Argument Name | Required or Optional | 
|---|---|---|
| 0 | src | Required | 
| 1 | diff_dst | Required | 
| 2 | mean | Required | 
| 3 | variance | Required | 
| 4 | gamma | Optional | 
| 5 | beta | Optional | 
Outputs
| Index | Argument Name | Required or Optional | 
|---|---|---|
| 0 | diff_src | Required | 
| 1 | diff_gamma | Optional | 
| 2 | diff_beta | Optional | 
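
To make the roles of the inputs and outputs concrete, here is a minimal pure-Python reference sketch of the backward computation. It is not the library's implementation or API: the function `layer_norm_backward` is a hypothetical name, it assumes a 2D `src` of shape (rows, C) normalized over the last axis with a biased variance, and it uses the standard layer normalization gradient. Arguments follow the input index order above, the return value follows the output index order, and the `epsilon` parameter stands for the numerical-stability constant (default 1e-5).

```python
import math


def layer_norm_backward(src, diff_dst, mean, variance,
                        gamma=None, beta=None, epsilon=1e-5):
    """Reference backward pass for a 2D src of shape (rows, C).

    mean and variance hold one value per row. Returns
    (diff_src, diff_gamma, diff_beta); the last two are None when gamma
    is not supplied (no affine parameters). beta is accepted only to
    mirror the input order; its value does not affect any gradient.
    """
    rows, channels = len(src), len(src[0])
    use_affine = gamma is not None
    diff_src = [[0.0] * channels for _ in range(rows)]
    diff_gamma = [0.0] * channels if use_affine else None
    diff_beta = [0.0] * channels if use_affine else None

    for r in range(rows):
        inv_std = 1.0 / math.sqrt(variance[r] + epsilon)
        x_hat = [(src[r][c] - mean[r]) * inv_std for c in range(channels)]
        # Gradient with respect to the normalized value x_hat.
        g = [diff_dst[r][c] * (gamma[c] if use_affine else 1.0)
             for c in range(channels)]
        g_mean = sum(g) / channels
        gx_mean = sum(gi * xi for gi, xi in zip(g, x_hat)) / channels
        for c in range(channels):
            diff_src[r][c] = inv_std * (g[c] - g_mean - x_hat[c] * gx_mean)
            if use_affine:
                diff_gamma[c] += diff_dst[r][c] * x_hat[c]
                diff_beta[c] += diff_dst[r][c]
    return diff_src, diff_gamma, diff_beta


# Example data (made up for illustration).
src = [[1.0, 2.0, 3.0, 4.0], [2.0, 0.0, -2.0, 4.0]]
mean = [sum(row) / len(row) for row in src]
variance = [sum((x - m) ** 2 for x in row) / len(row)
            for row, m in zip(src, mean)]
diff_dst = [[0.1, -0.2, 0.3, 0.4], [1.0, 0.5, -0.5, 0.0]]
gamma = [1.0, 2.0, 0.5, 1.5]
beta = [0.0, 0.0, 0.0, 0.0]

diff_src, diff_gamma, diff_beta = layer_norm_backward(
    src, diff_dst, mean, variance, gamma, beta)
```

Calling the function without `gamma` and `beta` returns `None` for the two optional outputs, mirroring the "Optional" entries in the tables.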
Supported data types
The LayerNormBackward operation supports the following data type combinations.
| Src / Diff_dst / Diff_src | Gamma / Beta / Mean / Variance / Diff_gamma / Diff_beta | 
|---|---|
| f32 | f32 | 
| bf16 | f32, bf16 | 
| f16 | f32 |
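
The table above can be encoded as a simple lookup, for example to validate a configuration before building a graph. This is an illustrative sketch only: `SUPPORTED_COMBINATIONS` and `is_supported` are hypothetical names, and the data types are represented as plain strings rather than library enums.

```python
# Allowed data types for gamma, beta, mean, variance, diff_gamma, and
# diff_beta, keyed by the data type shared by src, diff_dst, and diff_src.
SUPPORTED_COMBINATIONS = {
    "f32": {"f32"},
    "bf16": {"f32", "bf16"},
    "f16": {"f32"},
}


def is_supported(data_dtype, param_dtype):
    """Return True if the pair appears in the supported combinations table."""
    return param_dtype in SUPPORTED_COMBINATIONS.get(data_dtype, set())


print(is_supported("bf16", "bf16"))  # True
print(is_supported("f16", "f16"))    # False
```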