
Inference Time Increases After Converting Whisper Large v3 Fp32 Model to Lower Precisions

Content Type: Compatibility   |   Article ID: 000100845   |   Last Reviewed: 05/22/2025

Description

  • Converted the openai/whisper-large-v3 FP32 model to FP16, INT8, and INT4.
  • Inference with the lower-precision models took longer than with the original FP32 model.

Resolution

The Whisper Large V3 model is not enabled for the following precisions:

  • INT8
  • INT4
  • INT2
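FP16 remains a usable lower precision for this model. As a minimal sketch (assuming the Optimum Intel integration for OpenVINO is installed via `pip install optimum[openvino]`; the output directory name is illustrative), the model can be exported to OpenVINO IR with FP16 weights instead of INT8/INT4:

```shell
# Illustrative example: export openai/whisper-large-v3 to OpenVINO IR
# with FP16 weights using the Optimum Intel CLI.
# The output directory "whisper-large-v3-ov-fp16" is a placeholder.
optimum-cli export openvino \
  --model openai/whisper-large-v3 \
  --weight-format fp16 \
  whisper-large-v3-ov-fp16
```

The exported IR can then be loaded with `OVModelForSpeechSeq2Seq.from_pretrained()` for inference.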

Related Products

This article applies to 3 products.

  • Intel® Xeon Phi™ Processor Software
  • OpenVINO™ toolkit
  • Performance Libraries