Summary
How quantization operates in the OpenVINO™ toolkit.
Description
- Quantized an ONNX model that was in FP32 precision.
- Ran the compress_model_weights function after Post-Training Quantization to reduce the size of the .bin file.
- Compiled the model and noticed that some operations in the output model remain in FP32 instead of INT8.
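The steps above can be sketched with the Post-Training Optimization Tool (POT) Python API. This is a minimal sketch, not the reporter's actual script: the model paths, the data loader, and the helper names (make_algo_config, quantize_and_compress) are assumptions for illustration. POT is imported inside the function so the configuration helper can be read and tested without OpenVINO installed.

```python
def make_algo_config(preset="performance", stat_subset_size=300):
    # Helper (for illustration only) building the DefaultQuantization
    # algorithm configuration used by POT.
    return [{
        "name": "DefaultQuantization",
        "params": {
            "target_device": "CPU",
            "preset": preset,
            "stat_subset_size": stat_subset_size,
        },
    }]

def quantize_and_compress(xml_path, bin_path, data_loader, out_dir):
    # Hypothetical end-to-end flow: quantize the FP32 IR, then call
    # compress_model_weights to shrink the .bin file, as described above.
    from openvino.tools.pot import (  # imported lazily; requires OpenVINO
        IEEngine, load_model, save_model,
        compress_model_weights, create_pipeline,
    )
    model = load_model({"model_name": "model",
                        "model": xml_path, "weights": bin_path})
    engine = IEEngine(config={"device": "CPU"}, data_loader=data_loader)
    pipeline = create_pipeline(make_algo_config(), engine)
    quantized = pipeline.run(model)
    # Stores quantized weights in INT8, reducing the .bin size on disk.
    compress_model_weights(quantized)
    return save_model(quantized, save_path=out_dir, model_name="model_int8")
```

The data_loader argument is a POT DataLoader over a calibration dataset; it is elided here because it depends on the model's inputs.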
Resolution
During quantization, only the operations that are relevant from a performance perspective are quantized to INT8. The remaining operations stay in FP32 in the output model.
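One way to see this mixed-precision behavior is to inspect the execution graph of a compiled model, where each node records the precision it actually ran in. This is a sketch under assumptions: it presumes a CompiledModel obtained via the OpenVINO Python API, and the helper names below are ours, not from the original report.

```python
from collections import Counter

def summarize_precisions(precisions):
    # Count how many executed operations ran in each runtime precision,
    # e.g. {"f32": 12, "i8": 40}.
    return dict(Counter(precisions))

def runtime_precisions(compiled_model):
    # Requires OpenVINO, so it is only called on a real CompiledModel.
    # get_runtime_model() exposes the execution graph; each node's rt_info
    # is assumed to carry a "runtimePrecision" entry with the precision
    # the operation actually executed in.
    return [str(node.get_rt_info()["runtimePrecision"])
            for node in compiled_model.get_runtime_model().get_ops()]
```

On a quantized model, the summary will typically show a mix of i8 and f32 entries: the f32 entries are the operations that were left unquantized because quantizing them would not improve performance.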
Additional information
Refer to the OpenVINO™ Low Precision Transformations documentation.