OpenVINO™ toolkit: An open source AI toolkit that makes it easier to write once, deploy anywhere.
What's New in Version 2025.3
OpenVINO™ 2025.3 takes your AI deployments to the next level with new features and performance enhancements. This release delivers continued improvements for large language models (LLMs), optimized runtimes for Intel® hardware, and expanded capabilities for efficient AI deployment across edge, cloud, and local environments. Explore the latest updates and unlock new possibilities for your AI projects.
Latest Features
Easier Model Access and Conversion
We’ve made model conversion easier; a short conversion sketch follows the table below.
| Product | Details |
|---|---|
| New Model Support | New models supported: Phi-4-mini-reasoning, AFM-4.5B, Gemma-3-1B, Gemma-3-4B, and Gemma-3-12B |
| NPU Support | |
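For illustration, here is a minimal conversion sketch using the `openvino` Python package. The ONNX file name is a placeholder; PyTorch and TensorFlow model objects can be passed to the same call.

```python
import openvino as ov

# Convert a model from a framework format (ONNX here) into an
# in-memory OpenVINO model.
ov_model = ov.convert_model("model.onnx")  # "model.onnx" is a placeholder path

# Save to OpenVINO IR (.xml + .bin) for later deployment.
ov.save_model(ov_model, "model.xml")
```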
GenAI and LLM Enhancements
Expanded model support and accelerated inference; a generation sketch follows the table below.
| Feature | Details |
|---|---|
| NPU Plug-in Support | |
| GenAI Enhancements | |
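As a minimal, illustrative sketch of running one of the newly supported models with the OpenVINO GenAI API, assuming the model has already been exported to OpenVINO format in a local directory. The directory name, device, and prompt below are placeholders.

```python
import openvino_genai as ov_genai

# Load a converted LLM (the directory path is a placeholder) onto a
# device; "CPU" can be swapped for "GPU" or "NPU" where supported.
pipe = ov_genai.LLMPipeline("gemma-3-1b-it-ov", "CPU")

# Generate a completion for a prompt.
print(pipe.generate("What is OpenVINO?", max_new_tokens=100))
```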
More Portability and Performance
Develop once, deploy anywhere. OpenVINO toolkit enables developers to run AI at the edge, in the cloud, or locally.
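For example, here is a minimal sketch of device-portable inference with the AUTO device plugin; the IR path and input shape are illustrative.

```python
import numpy as np
import openvino as ov

core = ov.Core()
model = core.read_model("model.xml")  # placeholder IR path

# "AUTO" lets the runtime pick the best available device (CPU, GPU, NPU);
# an explicit device name can be used instead for a fixed target.
compiled = core.compile_model(model, "AUTO")

# Run a single inference; the input shape here is illustrative.
result = compiled(np.zeros((1, 3, 224, 224), dtype=np.float32))
```

Because AUTO defers device selection to the runtime, the same script runs unchanged on CPU-, GPU-, or NPU-equipped machines.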
| Product | Details |
|---|---|
| Intel® Hardware Support | |
| Model Server Updates | |
| NNCF Updates | INT4 data-aware weight compression, now supported in the Neural Network Compression Framework (NNCF) for ONNX models, reduces memory footprint while maintaining accuracy and enables efficient deployment in resource-constrained environments. |
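As an illustrative sketch of data-aware INT4 weight compression for an ONNX model with NNCF: the model path, calibration item, and tuning values below are placeholders, and the exact format of calibration items for ONNX models follows the NNCF documentation.

```python
import numpy as np
import onnx
import nncf

# Load the source ONNX model (the path is a placeholder).
model = onnx.load("llm.onnx")

# Data-aware compression needs a small calibration dataset. The dummy
# item below is purely illustrative; real items must match the model's
# actual input names and shapes.
calibration_items = [{"input_ids": np.zeros((1, 8), dtype=np.int64)}]
dataset = nncf.Dataset(calibration_items)

# Compress weights to INT4; ratio and group_size are tunable knobs.
compressed = nncf.compress_weights(
    model,
    mode=nncf.CompressWeightsMode.INT4_SYM,
    ratio=0.8,        # fraction of weights compressed to INT4
    group_size=128,   # quantization group size
    dataset=dataset,  # supplying data makes the compression data-aware
)

onnx.save(compressed, "llm_int4.onnx")
```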
Sign Up for Exclusive News, Tips & Releases
Be among the first to learn about everything new with the Intel® Distribution of OpenVINO™ toolkit. By signing up, you get early access to product updates and releases, exclusive invitations to webinars and events, training and tutorial resources, and other breaking news.
Resources
Community and Support
Explore ways to get involved and stay up-to-date with the latest announcements.
Get Started
Optimize, fine-tune, and run comprehensive AI inference using the included model optimization, runtime, and development tools.