Why Does the Quantized Model Format Remain FP32 Instead of INT8?
Content Type: Product Information & Documentation | Article ID: 000095064 | Last Reviewed: 06/13/2023
During quantization, only the operations that matter for performance are quantized. The remaining operations stay in FP32 in the output model, so a quantized model normally contains a mix of precisions rather than INT8 throughout.
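One way to see this mix is to count layer types and output port precisions in the model's IR .xml file. The following is a minimal sketch, not an official tool: it assumes the IR layout in which each `layer` element carries a `type` attribute and each output `port` carries a `precision` attribute. The embedded sample is a hypothetical, heavily simplified fragment, and the `summarize_ir` helper is an illustrative name.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical, heavily simplified IR fragment for illustration only.
# A real IR file has many more layers, attributes, and an <edges> section.
SAMPLE_IR = """
<net name="toy" version="11">
  <layers>
    <layer id="0" name="input" type="Parameter">
      <output><port id="0" precision="FP32"/></output>
    </layer>
    <layer id="1" name="fq_input" type="FakeQuantize">
      <output><port id="4" precision="FP32"/></output>
    </layer>
    <layer id="2" name="weights" type="Const">
      <output><port id="0" precision="I8"/></output>
    </layer>
    <layer id="3" name="conv" type="Convolution">
      <output><port id="2" precision="FP32"/></output>
    </layer>
    <layer id="4" name="softmax" type="SoftMax">
      <output><port id="1" precision="FP32"/></output>
    </layer>
  </layers>
</net>
"""


def summarize_ir(xml_text):
    """Count layer types and output port precisions in an IR XML string."""
    root = ET.fromstring(xml_text)
    type_counts = Counter()
    precision_counts = Counter()
    for layer in root.iter("layer"):
        type_counts[layer.get("type")] += 1
        # Each output port records the precision of the tensor it produces.
        for port in layer.findall("./output/port"):
            precision_counts[port.get("precision")] += 1
    return type_counts, precision_counts


types, precisions = summarize_ir(SAMPLE_IR)
print("layer types:", dict(types))
print("output precisions:", dict(precisions))
```

To inspect a real model, read the .xml file into a string and pass it to `summarize_ir`. FakeQuantize layers mark the places where quantization was applied, while FP32 ports on other layers are expected. The precision actually used at inference time is decided later, when the low precision transformations run during model compilation.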
For details, refer to OpenVINO™ Low Precision Transformations.