Intel® oneAPI Deep Neural Network Library Developer Guide and Reference
DynamicDequantize
General
The DynamicDequantize operation converts a quantized (s4, u4, s8, or u8) tensor to a bf16, f16, or f32 tensor. It supports per-tensor, per-channel, and per-group asymmetric linear de-quantization. The rounding mode is defined by the library implementation. Unlike Dequantize, DynamicDequantize takes scales and zero-points as operator src tensors.
For per-tensor de-quantization:

\[ dst = (src - zps) \cdot scales \]

For per-channel de-quantization, taking channel axis = 1 as an example:

\[ dst_{\ldots,i,\ldots} = (src_{\ldots,i,\ldots} - zps_i) \cdot scales_i, \quad i \in [0, channelNum - 1] \]

For per-group de-quantization, take group shape = G x 1 as an example. It indicates that one scaling factor is adopted for every G elements in the src tensor. On the dimensions where group quantization is adopted, let channelNum equal the dimension of src and groupNum equal channelNum / group size:

\[ dst_i = (src_i - zps_{\lfloor i/G \rfloor}) \cdot scales_{\lfloor i/G \rfloor} \]

Where:

\[ i \in [0, channelNum - 1], \quad \lfloor i/G \rfloor \in [0, groupNum - 1] \]

On other dimensions:

\[ dst_i = (src_i - zps_i) \cdot scales_i \]
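To make the per-group case concrete, the following plain C++ loop sketches the computation for a 2D src of shape [K, N] with group shape G x 1. The row-major layout, the s8 src/zps types, and the function name are illustrative assumptions; this is not oneDNN library code.

```cpp
#include <cstdint>
#include <vector>

// Plain reference loop for per-group de-quantization of a 2D src of shape
// [K, N] with group shape G x 1: scales and zps then have shape [K / G, N].
// Illustrative sketch only; this is not part of the oneDNN API.
std::vector<float> dequantize_per_group(const std::vector<int8_t> &src,
        const std::vector<float> &scales, const std::vector<int8_t> &zps,
        int64_t K, int64_t N, int64_t G) {
    std::vector<float> dst(K * N);
    for (int64_t i = 0; i < K; ++i) {
        // One scale / zero-point row is shared by G consecutive rows of src.
        const int64_t g = i / G;
        for (int64_t j = 0; j < N; ++j) {
            // dst = (src - zps) * scales
            dst[i * N + j] = (static_cast<float>(src[i * N + j])
                                     - static_cast<float>(zps[g * N + j]))
                    * scales[g * N + j];
        }
    }
    return dst;
}
```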
Operation attributes
| Attribute Name | Description | Value Type | Supported Values | Required or Optional |
|---|---|---|---|---|
| qtype | Specifies which de-quantization type is used. | string | per_tensor (default), per_channel | Optional |
| axis | Specifies the dimension on which per-channel de-quantization is applied. | s64 | An s64 value in the range of [-r, r-1] where r = rank(src); 1 by default. Negative values mean counting the dimension backwards from the end. | Optional |
| group_shape | Specifies the group shape of the operation. | s64 | An s64 list indicating the group size on the dimensions where grouped quantization is adopted. | Optional |
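As an illustration of how group_shape relates the tensor shapes, the hypothetical helper below derives the expected scales/zps dimensions from the src dimensions. It assumes, for illustration only, that group_shape carries one entry per src dimension (with 1 on non-grouped dimensions); on each grouped dimension the group size must divide the src dimension, and the resulting scales/zps dimension equals groupNum = channelNum / group size. This is a sketch based on the per-group definition above, not a oneDNN API function.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: expected scales/zps dims for grouped quantization.
// For group_shape = {G, 1} and src dims = {K, N}, this returns {K / G, N}.
std::vector<int64_t> expected_scale_dims(const std::vector<int64_t> &src_dims,
        const std::vector<int64_t> &group_shape) {
    assert(src_dims.size() == group_shape.size());
    std::vector<int64_t> dims(src_dims.size());
    for (size_t d = 0; d < src_dims.size(); ++d) {
        // The group size must divide the corresponding src dimension.
        assert(src_dims[d] % group_shape[d] == 0);
        dims[d] = src_dims[d] / group_shape[d]; // groupNum = channelNum / group size
    }
    return dims;
}
```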
Execution arguments
The inputs and outputs must be provided according to the index order below when constructing an operation.
Inputs
| Index | Argument Name | Required or Optional |
|---|---|---|
| 0 | src | Required |
| 1 | scales | Required |
| 2 | zps | Optional |
Outputs
| Index | Argument Name | Required or Optional |
|---|---|---|
| 0 | dst | Required |
Supported data types
The DynamicDequantize operation supports the following data type combinations.
| Src | Dst | Scales | Zps |
|---|---|---|---|
| s8 | f16, bf16, f32 | f16, bf16, f32 | s8, u8, s32 |
| u8 | f16, bf16, f32 | f16, bf16, f32 | s8, u8, s32 |
| s4 | f16, bf16, f32 | f16, bf16, f32 | s4, u4, s32 |
| u4 | f16, bf16, f32 | f16, bf16, f32 | s4, u4, s32 |
The data types of scales and dst are expected to be the same.
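Below is a minimal sketch of constructing a DynamicDequantize operation with the oneDNN Graph C++ API, following the input index order above and one of the supported data type combinations (u8 src, f32 scales, s32 zps, f32 dst). The tensor shapes, tensor IDs, and the per-channel configuration are illustrative assumptions rather than requirements of the library.

```cpp
#include <string>
#include <vector>
#include "oneapi/dnnl/dnnl_graph.hpp"

using namespace dnnl::graph;

int main() {
    using dt = logical_tensor::data_type;
    using lt = logical_tensor::layout_type;

    // Illustrative shapes: per-channel de-quantization along axis 1
    // (16 channels), so scales and zps are 1D tensors of length 16.
    const std::vector<int64_t> src_dims {4, 16};
    const std::vector<int64_t> qparam_dims {16};

    // Logical tensors in the documented input index order: src, scales, zps.
    logical_tensor src {0, dt::u8, src_dims, lt::strided};
    logical_tensor scales {1, dt::f32, qparam_dims, lt::strided};
    logical_tensor zps {2, dt::s32, qparam_dims, lt::strided};
    logical_tensor dst {3, dt::f32, src_dims, lt::strided};

    // DynamicDequantize takes scales and zps as inputs rather than attributes.
    op deq {0, op::kind::DynamicDequantize, {src, scales, zps}, {dst},
            "dynamic_dequantize"};
    deq.set_attr<std::string>(op::attr::qtype, "per_channel");
    deq.set_attr<int64_t>(op::attr::axis, 1);

    graph g {dnnl::engine::kind::cpu};
    g.add_op(deq);
    g.finalize();

    // The resulting partitions are compiled and executed as usual.
    auto partitions = g.get_partitions();
    (void)partitions;
    return 0;
}
```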