Article ID: 000095064 Content Type: Product Information & Documentation Last Reviewed: 06/13/2023

Why Does the Quantized Model Format Remain FP32 Instead of INT8?

Summary

How quantization operates in the OpenVINO™ toolkit.

Description
  • Quantized an ONNX model that was in FP32 precision format.
  • Ran the compress_model_weights function to reduce the size of the .bin file after performing Post-Training Quantization (a sketch of this flow follows the list).
  • Compiled the model and noticed that its output is in FP32 instead of INT8.
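
For context, here is a minimal sketch of the Post-Training Quantization flow described above, using the Post-Training Optimization Tool (POT) Python API from OpenVINO™ 2022.x. The file names, the random calibration data loader, and the algorithm parameters are illustrative assumptions, not taken from this article.

# Minimal Post-Training Quantization sketch (POT Python API, OpenVINO 2022.x).
# File names, the data loader, and algorithm parameters are assumptions.
import numpy as np
from openvino.tools.pot import (
    DataLoader, IEEngine, load_model, save_model,
    compress_model_weights, create_pipeline,
)

class RandomCalibrationLoader(DataLoader):
    """Feeds random tensors as calibration data (placeholder only)."""
    def __init__(self, num_samples=300, shape=(1, 3, 224, 224)):
        self._num_samples = num_samples
        self._shape = shape

    def __len__(self):
        return self._num_samples

    def __getitem__(self, index):
        # In 2022.x, POT expects (data, annotation); annotation is unused here.
        return np.random.rand(*self._shape).astype(np.float32), None

model_config = {
    "model_name": "model",
    "model": "model.xml",    # IR converted from the FP32 ONNX model
    "weights": "model.bin",
}
engine_config = {"device": "CPU"}
algorithms = [{
    "name": "DefaultQuantization",
    "params": {"target_device": "CPU", "stat_subset_size": 300},
}]

model = load_model(model_config)
engine = IEEngine(config=engine_config, data_loader=RandomCalibrationLoader())
pipeline = create_pipeline(algorithms, engine)

compressed_model = pipeline.run(model)
# Shrinks the .bin file by storing quantizable weights in INT8.
compress_model_weights(compressed_model)
save_model(compressed_model, save_path="optimized", model_name="model_int8")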
Resolution

During quantization, only the operations that matter from a performance perspective are quantized. The remaining operations stay in FP32 in the output model, which is why the model is still reported in FP32 format.
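
One way to observe this is to inspect the quantized IR with the OpenVINO™ Runtime API: the model inputs and outputs still report f32, while the INT8 computation is expressed through FakeQuantize operations inserted only before the performance-critical layers. The file name below is a placeholder assumption.

# Minimal inspection sketch: outputs stay f32, while FakeQuantize ops
# mark the layers that will execute in INT8 at runtime.
from openvino.runtime import Core

core = Core()
model = core.read_model("model_int8.xml")  # placeholder file name

# Model boundary precision: reported as f32 even for a quantized model.
for output in model.outputs:
    print(output.get_any_name(), output.get_element_type())

# Count FakeQuantize operations; only performance-critical layers get them.
fq_count = sum(1 for op in model.get_ops()
               if op.get_type_name() == "FakeQuantize")
print(f"FakeQuantize operations: {fq_count}")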

Additional information

Refer to OpenVINO™ Low Precision Transformations.

Related Products

This article applies to 1 product.