
Conversion of INT8 Models to Intermediate Representation (IR)

Content Type: Troubleshooting   |   Article ID: 000058759   |   Last Reviewed: 03/05/2026

Description

The Model Optimization documentation mentions quantization‑aware training (QAT), stating that QAT allows a user to obtain an accurate optimized model that can be converted to OpenVINO™ Intermediate Representation (IR). However, no additional details on the workflow are provided there.

Resolution

Quantization‑Aware Training (QAT) with OpenVINO™‑compatible training frameworks is supported through the Neural Network Compression Framework (NNCF) for:

  • TensorFlow* 2 / Keras* models (via NNCF QAT workflow)
  • PyTorch* models (via NNCF QAT workflow) 

NNCF is a framework that provides post‑training and training‑time model compression methods (including QAT) and is used to optimize models for OpenVINO inference.
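During QAT, the training graph simulates INT8 rounding error with quantize‑dequantize ("fake quantization") operations so the model learns weights that remain accurate after quantization. The following is a minimal pure‑Python sketch of that quantize‑dequantize step, simplified to symmetric, per‑tensor quantization for illustration; NNCF's actual implementation differs in detail (per‑channel scales, learned ranges, framework tensors).

```python
def fake_quantize(values, num_bits=8):
    """Simulate INT8 rounding error on float values.

    Simplified symmetric, per-tensor scheme for illustration only.
    """
    qmax = 2 ** (num_bits - 1) - 1          # 127 for INT8
    max_abs = max(abs(v) for v in values) or 1.0
    scale = max_abs / qmax                   # one scale for the whole tensor
    # Round each value to the integer grid, clamp to the INT8 range,
    # then map back to floating point (quantize-dequantize).
    quantized = [max(-qmax - 1, min(qmax, round(v / scale))) for v in values]
    return [q * scale for q in quantized]

weights = [0.52, -1.27, 0.003, 0.9]
print(fake_quantize(weights))
```

Because the rounding error flows through the forward pass during fine‑tuning, the optimizer compensates for it, which is why QAT typically preserves accuracy better than post‑training quantization alone.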

After QAT fine‑tuning is complete, the optimized model can be exported (commonly to ONNX*) and then converted to OpenVINO™ IR for deployment.

Note: The transition to INT8 precision and the corresponding footprint benefits occur after converting the model to OpenVINO IR.
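The footprint benefit can be sanity‑checked with simple arithmetic: FP32 weights occupy 4 bytes each, while INT8 weights occupy 1 byte (plus a small per‑tensor overhead for scales and zero points, ignored here), giving roughly a 4x reduction in weight storage. The parameter count below is a hypothetical example.

```python
def weight_footprint_bytes(num_params, bytes_per_param):
    """Rough weight-storage size; ignores metadata and activation memory."""
    return num_params * bytes_per_param

params = 25_000_000                          # hypothetical 25M-parameter model
fp32 = weight_footprint_bytes(params, 4)     # FP32: 4 bytes per weight
int8 = weight_footprint_bytes(params, 1)     # INT8: 1 byte per weight
print(fp32 // int8)                          # prints 4
```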

Related Products

This article applies to 1 product.