Introduction
This package contains the Intel® Distribution of OpenVINO™ Toolkit software version 2025.4 for Linux*, Windows*, and macOS*.
Available Downloads
- macOS*
  - Size: 43.2 MB
  - SHA256: 918A20A6613AA2C690A69CF753403D84970D1787986B1F01C5B951AF9EEFD8EC
- macOS*
  - Size: 52.4 MB
  - SHA256: 8A5209B22006B1320D28212E825182A4CB5D2B12F318111181044273F91F0471
- Linux*
  - Size: 33 MB
  - SHA256: 76C1361A31BC0972AB5A2BAB6EFC2CD449D49B728DE4AA7F33EBC0F5895764A9
- Linux*
  - Size: 57.5 MB
  - SHA256: AD8D73B4B0776761633D8FE40167BB3373605AB59998EFC2C2DA428D3DE028DB
- Linux*
  - Size: 70.5 MB
  - SHA256: D14B59C695312663764903277BC445CBBC0EBBD2789B85905B9273CA269942BE
- Linux*
  - Size: 36.7 MB
  - SHA256: D51ED6EB38933AD5E6424DEBE734568B758D0FD9B9DF1B21098CFB3F20D20CD3
- Linux*
  - Size: 63.8 MB
  - SHA256: 0C30763175A0906CDA365751C1B7A986A313FB10A0CEF4E7637FF5034D954034
- Linux*
  - Size: 58.6 MB
  - SHA256: C57BC759A04BF316D66DC42D644433CF3BD590AD640933630A0616EFB380A630
- Microsoft Windows*
  - Size: 125.6 MB
  - SHA256: 995A88DC1E34DF841CC4DB5AB118A87147608C3E4B67F7D9D86BEF1B311A273E
- Microsoft Windows*
  - Size: 682.1 MB
  - SHA256: 74B072736A803846EFF819BE353709ADBEA60020DC5D41AAF4B509C3697FEF5A
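After downloading, you can check an archive against the SHA256 value listed for it above. A minimal sketch in Python using only the standard library (the archive filename below is a placeholder; substitute the file you actually downloaded and its listed checksum):

```python
import hashlib

def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so large archives are not loaded into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest().upper()

# Placeholder names -- replace with your downloaded file and its listed SHA256.
# expected = "76C1361A31BC0972AB5A2BAB6EFC2CD449D49B728DE4AA7F33EBC0F5895764A9"
# assert sha256_of_file("openvino_archive.tgz") == expected, "checksum mismatch"
```

On Linux or macOS, `shasum -a 256 <file>` gives the same result from the command line.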
Detailed Description
What’s new
- More Gen AI coverage and framework integrations to minimize code changes
  - New models supported:
    - On CPUs & GPUs: Qwen3-Embedding-0.6B, Qwen3-Reranker-0.6B, and Mistral-Small-24B-Instruct-2501.
    - On NPUs: Gemma-3-4b-it and Qwen2.5-VL-3B-Instruct.
  - Preview: Mixture of Experts (MoE) models optimized for CPUs and GPUs, validated for Qwen3-30B-A3B.
  - GenAI pipeline integrations: Qwen3-Embedding-0.6B and Qwen3-Reranker-0.6B for enhanced retrieval/ranking, and Qwen2.5-VL-7B for video pipelines.
- Broader LLM model support and more model compression techniques
  - Gold support for Windows ML* enables developers to deploy AI models and applications effortlessly across CPUs, GPUs, and NPUs on Intel® Core™ Ultra processor-powered AI PCs.
  - The Neural Network Compression Framework (NNCF) ONNX backend now supports INT8 static post-training quantization (PTQ) and INT8/INT4 weight-only compression to ensure accuracy parity with OpenVINO IR format models. SmoothQuant algorithm support added for INT8 quantization.
  - Accelerated multi-token generation for GenAI, leveraging optimized GPU kernels to deliver faster inference, smarter KV-cache reuse, and scalable LLM performance.
  - GPU plugin updates include improved performance with prefix caching for chat-history scenarios and enhanced LLM accuracy with dynamic quantization support for INT8.
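For intuition about the weight-only compression mentioned above: INT8 compression maps each float weight onto one of 256 integer levels and stores a scale factor used to recover approximate float values at inference time. A toy sketch of the symmetric per-tensor scheme; this is an illustration of the general technique, not NNCF's actual implementation (real tools use per-channel scales and calibrated algorithms):

```python
def quantize_int8_symmetric(weights):
    """Map float weights to int8 values in [-128, 127] with one shared scale.

    Toy illustration of symmetric weight-only quantization.
    """
    scale = max(abs(w) for w in weights) / 127.0
    scale = scale if scale > 0 else 1.0  # guard against an all-zero tensor
    quantized = [max(-128, min(127, round(w / scale))) for w in weights]
    return quantized, scale

def dequantize(quantized, scale):
    """Recover approximate float weights: w is roughly q * scale."""
    return [q * scale for q in quantized]
```

The per-weight reconstruction error is at most about half the scale, which is why accuracy parity against the original-precision model still needs to be validated after compression.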
- More portability and performance to run AI at the edge, in the cloud, or locally
  - Announcing support for Intel® Core™ Ultra Processor Series 3.
  - Encrypted blob format support added for secure model deployment with OpenVINO™ GenAI. Model weights and artifacts are stored and transmitted in an encrypted format, reducing the risk of IP theft during deployment. Developers can deploy with minimal code changes using OpenVINO GenAI pipelines.
  - OpenVINO™ Model Server and OpenVINO™ GenAI now extend support for Agentic AI scenarios with new features such as output parsing and improved chat templates for reliable multi-turn interactions, and preview functionality for the Qwen3-30B-A3B model. OVMS also introduces a preview for audio endpoints.
  - NPU deployment is simplified with batch support, enabling seamless model execution across Intel® Core™ Ultra processors while eliminating driver dependencies. Models are reshaped to batch_size=1 before compilation.
  - The improved NVIDIA Triton Server* integration with the OpenVINO backend now enables developers to use Intel GPUs or NPUs for deployment.
For all the details, see the 2025.4 release notes.
Installation instructions
You can choose how to install OpenVINO™ Runtime from Archive* according to your operating system:
- Install OpenVINO Runtime on Linux*
- Install OpenVINO Runtime on Windows*
- Install OpenVINO Runtime on macOS*
What's included in the download package (Archive File)
- Offers both C/C++ and Python APIs
- Includes code samples
Helpful Links
Disclaimers
Product and Performance Information
Intel is in the process of removing non-inclusive language from our current documentation, user interfaces, and code. Please note that retroactive changes are not always possible, and some non-inclusive language may remain in older documentation, user interfaces, and code.