Table: oneDNN algorithm kind | Backward formula (from `src`) | Backward formula (from `dst`) — per-algorithm rows not recovered.
- The memory format and data type for `src` and `dst` are assumed to be the same, and in the API they are typically referred to as `data` (e.g., see `data_desc` in dnnl::eltwise_forward::desc::desc()). The same holds for `diff_src` and `diff_dst`; the corresponding memory descriptors are referred to as `diff_data_desc`.
- Both forward and backward propagation support in-place operations, meaning that `src` can be used as both input and output for forward propagation, and `diff_dst` can be used as both input and output for backward propagation. In the case of an in-place operation, the original data is overwritten. Note, however, that some algorithms require the original `src` for backward propagation, hence the corresponding forward propagation should not be performed in-place for those algorithms. Algorithms that use `dst` for backward propagation can safely be done in-place.
- For some operations it might be beneficial to compute backward propagation based on `dst` rather than on `src`, for improved performance.
- For logsigmoid the original formula `d = log(1 / (1 + e^{-s}))` was replaced by `d = -soft_relu(-s)` for numerical stability.
Data types table (rows not fully recovered): columns are Propagation, Source / Destination, and Intermediate data type; surviving entries include forward / backward propagation and s32 / s8 / u8 source/destination types.
- For backward propagation, use the same memory format for `src`, `diff_dst`, and `diff_src` (the formats of `diff_dst` and `diff_src` are always the same because of the API). Different formats are functionally supported but lead to highly suboptimal performance.
- Use in-place operations whenever possible (see caveats in General Notes).
- As mentioned above, for all operations that support destination memory as input, one can use the `dst` tensor instead of `src`. This enables the following potential optimizations for training:
- Such operations can be safely done in-place.
- Moreover, such operations can be fused as a post-op with the previous operation, provided that operation does not require its `dst` to compute the backward propagation (e.g., if the convolution operation satisfies these conditions).