
Unable To Get Output With INT8 and INT4 Quantized Model on GPU

Content Type: Troubleshooting   |   Article ID: 000100589   |   Last Reviewed: 05/21/2025

Environment

OpenVINO 2024.0

Description

  • Installed OpenVINO™ 2024.0.
  • Used the optimum-intel package to convert the whisper-large-v3 model to INT4 and INT8, then ran inference with OpenVINO™ on GPU.
  • No output is produced.
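The conversion step described above can be reproduced with the `optimum-cli export openvino` command provided by the optimum-intel package. This is a sketch of typical invocations; the output directory names are illustrative:

```shell
# Export whisper-large-v3 to OpenVINO IR with INT8 weight compression
optimum-cli export openvino --model openai/whisper-large-v3 \
    --weight-format int8 whisper-large-v3-int8-ov

# Same export with INT4 weight compression
optimum-cli export openvino --model openai/whisper-large-v3 \
    --weight-format int4 whisper-large-v3-int4-ov
```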

Resolution

Use OpenVINO™ version 2024.5.0 or higher to run the INT8 and INT4 quantized models on GPU.
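Before running the quantized models, you can verify that the installed OpenVINO™ version meets the 2024.5.0 minimum. The `meets_minimum` helper below is a hypothetical sketch using only the standard library; in practice the installed version string comes from `openvino.get_version()`, which may append a build suffix after the dotted release number:

```python
def meets_minimum(version: str, minimum: str = "2024.5.0") -> bool:
    """Return True if `version` is at least `minimum` (hypothetical helper)."""
    # openvino.get_version() may return something like "2024.5.0-<build info>",
    # so keep only the leading dotted-number part before comparing.
    core = version.split("-")[0]
    parts = tuple(int(p) for p in core.split("."))
    min_parts = tuple(int(p) for p in minimum.split("."))
    return parts >= min_parts

print(meets_minimum("2024.0.0"))  # False: too old for these quantized models
print(meets_minimum("2024.5.0"))  # True
```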

Related Products

This article applies to 3 products:

  • Intel® Xeon Phi™ Processor Software
  • OpenVINO™ toolkit
  • Performance Libraries