OpenVINO™ toolkit: An open source AI toolkit that makes it easier to write once, deploy anywhere.
What's New in Version 2025.3
OpenVINO™ 2025.3 takes your AI deployments to the next level with new features and performance enhancements. This release delivers continued improvements for large language models (LLMs), optimized runtimes for Intel® hardware, and expanded capabilities for efficient AI deployment across edge, cloud, and local environments. Explore the latest updates and unlock new possibilities for your AI projects.
Latest Features
Easier Model Access and Conversion
We’ve made model conversion easier; a short conversion sketch follows the table below.
| Product | Details |
|---|---|
| New Model Support | New models supported: Phi-4-mini-reasoning, AFM-4.5B, Gemma-3-1B, Gemma-3-4B, and Gemma-3-12B |
| NPU Support | |
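For illustration, here is a minimal conversion sketch using the `openvino` Python package. The ONNX file name is a placeholder; PyTorch and TensorFlow model objects can be passed to the same call.

```python
import openvino as ov

# Convert a model from a framework format (ONNX here) into an
# in-memory OpenVINO model.
ov_model = ov.convert_model("model.onnx")  # "model.onnx" is a placeholder path

# Save to OpenVINO IR (.xml + .bin) for later deployment.
ov.save_model(ov_model, "model.xml")
```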
GenAI and LLM Enhancements
Expanded model support and accelerated inference; a generation sketch follows the table below.
| Feature | Details |
|---|---|
| NPU Plug-in Support | |
| GenAI Enhancements | |
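As a minimal, illustrative sketch of running one of the newly supported models with the OpenVINO GenAI API, assuming the model has already been exported to OpenVINO format in a local directory. The directory name, device, and prompt below are placeholders.

```python
import openvino_genai as ov_genai

# Load a converted LLM (the directory path is a placeholder) onto a
# device; "CPU" can be swapped for "GPU" or "NPU" where supported.
pipe = ov_genai.LLMPipeline("gemma-3-1b-it-ov", "CPU")

# Generate a completion for a prompt.
print(pipe.generate("What is OpenVINO?", max_new_tokens=100))
```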
More Portability and Performance
Develop once, deploy anywhere. OpenVINO toolkit enables developers to run AI at the edge, in the cloud, or locally.
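For example, here is a minimal sketch of device-portable inference with the AUTO device plugin; the IR path and input shape are illustrative.

```python
import numpy as np
import openvino as ov

core = ov.Core()
model = core.read_model("model.xml")  # placeholder IR path

# "AUTO" lets the runtime pick the best available device (CPU, GPU, NPU);
# an explicit device name can be used instead for a fixed target.
compiled = core.compile_model(model, "AUTO")

# Run a single inference; the input shape here is illustrative.
result = compiled(np.zeros((1, 3, 224, 224), dtype=np.float32))
```

Because AUTO defers device selection to the runtime, the same script runs unchanged on CPU-, GPU-, or NPU-equipped machines.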
| Product | Details |
|---|---|
| Intel® Hardware Support | |
| Model Server Updates | |
| NNCF Updates | INT4 data-aware weight compression, now supported in the Neural Network Compression Framework (NNCF) for ONNX models, reduces memory footprint while maintaining accuracy and enables efficient deployment in resource-constrained environments. |
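As an illustrative sketch of data-aware INT4 weight compression for an ONNX model with NNCF: the model path, calibration item, and tuning values below are placeholders, and the exact format of calibration items for ONNX models follows the NNCF documentation.

```python
import numpy as np
import onnx
import nncf

# Load the source ONNX model (the path is a placeholder).
model = onnx.load("llm.onnx")

# Data-aware compression needs a small calibration dataset. The dummy
# item below is purely illustrative; real items must match the model's
# actual input names and shapes.
calibration_items = [{"input_ids": np.zeros((1, 8), dtype=np.int64)}]
dataset = nncf.Dataset(calibration_items)

# Compress weights to INT4; ratio and group_size are tunable knobs.
compressed = nncf.compress_weights(
    model,
    mode=nncf.CompressWeightsMode.INT4_SYM,
    ratio=0.8,        # fraction of weights compressed to INT4
    group_size=128,   # quantization group size
    dataset=dataset,  # supplying data makes the compression data-aware
)

onnx.save(compressed, "llm_int4.onnx")
```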
Sign Up for Exclusive News, Tips & Releases
Be among the first to learn about everything new with the Intel® Distribution of OpenVINO™ toolkit. By signing up, you get early access to product updates and releases, exclusive invitations to webinars and events, training and tutorial resources, and other breaking news.
Resources
Community and Support
Explore ways to get involved and stay up-to-date with the latest announcements.
Get Started
Optimize, fine-tune, and run comprehensive AI inference using the included model optimization, runtime, and development tools.