Intel® Distribution of OpenVINO™ Toolkit Release Notes

ID 780177
Updated 4/9/2024




New and Changed in 2023.0.2 

Summary of major features and improvements  

  • GNA library version upgraded from to 

Implemented fixes for known issues from 2023.0.1: 


The problem was triggered by an earlier code refactoring that accidentally removed support for the GNA 1.0 generation. 
As a consequence, the GNA device would not work on Gemini Lake (GLK) platforms. 
It was fixed by re-adding the missing GNA 1.0 support to the code. 

PR: #18653 


A memory leak occurred during the HLK test. 
The issue was triggered by faulty code in the OpenVINO codebase that releases previously allocated resources. 
It was fixed by refactoring that code. 

PR: #19257 


Issues occurred in Multi-Threading 2.0 when getting CPU mapping details on Windows 7 platforms. The problem was caused by a difference in the format of the OS logical processor information between Windows 7 and Windows 10/11. This resulted in errors in CPU inference on Windows 7. Corresponding fixes were implemented in Multi-Threading 2.0 of OpenVINO Core to support both the Windows 7 format and the Windows 10/11 format. Note that Windows 7 support ended on January 14, 2020. 

PR: #19110 


Issues occurred when compiling a PyTorch model with the unfold op. The problem was caused by incorrectly transforming a Gather op into a Squeeze op. This resulted in compilation failure for similar models. Corresponding fixes were implemented in the OpenVINO ngraph transformation. 

PR: #19094 

New and Changed in 2023.0.1

Implemented fixes for two known issues from 2023.0:


Issues occurred in POT usage on Windows platforms. The problem was caused by accessing the same files in read and write modes. This resulted in errors on Windows due to the default usage of the MMap allocator (enabled in 2023.0). Corresponding fixes were implemented in POT and the accuracy checker to disable the MMap in OpenVINO Core. However, users may still encounter similar problems in their applications when simultaneously reading/writing to the same file. 


Follow this guide to disable the default switch to the MMap allocator. The issue is fixed for POT quantization on Windows; see the PR in the OpenVINO repository for more information.

113384: Due to OS specifics, it is not possible to properly handle the directory in read_model() on Windows. This results in a failure to directly import a TensorFlow saved_model directory using the read_model() call. 

Component: OpenVINO Core

The issue is fixed in the following PR in the OpenVINO repository.

New and Changed in 2023.0

Summary of major features and improvements 

  • More integrations, minimizing code changes 
    • Now you can load TensorFlow and TensorFlow Lite models directly in OpenVINO Runtime and OpenVINO Model Server. Models are converted automatically. For maximum performance, it is still recommended to convert to the OpenVINO Intermediate Representation (IR) format before loading the model. Additionally, similar functionality has been introduced for PyTorch models as a preview feature, so you can convert PyTorch models directly without first converting to ONNX. 
    • Support for Python 3.11  
    • NEW: C++ developers can now install OpenVINO runtime from Conda Forge 
    • NEW: ARM processors are now supported in CPU plug-in, including dynamic shapes, full processor performance, and broad sample code/notebook coverage. Officially validated for Raspberry Pi 4 and Apple® Mac M1/M2 
    • Preview: A new Python API has been introduced to allow developers to convert and optimize models directly from Python scripts 
  • Broader model support and optimizations 
    • Expanded model support for generative AI: CLIP, BLIP, Stable Diffusion 2.0, text processing models, and transformer models (e.g., S-BERT and GPT-J). Other models of note include Detectron2, Paddle Slim, RNN-T, Segment Anything Model (SAM), Whisper, and YOLOv8, to name a few. 
    • Initial support for dynamic shapes on GPU - you no longer need to change to static shapes when leveraging the GPU which is especially important for NLP models. 
    • Neural Network Compression Framework (NNCF) is now the main quantization solution. You can use it for both post-training optimization and quantization-aware training. Try it out: pip install nncf 
  • Portability and performance
    • CPU plugin now offers thread scheduling on 12th gen Intel® Core and up. You can choose to run inference on E-cores, P-cores, or both, depending on your application’s configurations. It is now possible to optimize for performance or for power savings as needed.
    • NEW: Default Inference Precision - no matter which device you use, OpenVINO will default to the format that enables its optimal performance. For example, FP16 for GPU or BF16 for 4th Generation Intel® Xeon®. You no longer need to convert the model beforehand to specific IR precision, and you still have the option of running in accuracy mode if needed.
    • Model caching on GPU is now improved with more efficient model loading/compiling.

Support Change and Deprecation Notices

  • The following OpenVINO C++/C/Python 1.0 APIs are deprecated and will be removed in the 2023.1 release: 
    • NV12 and I420 color formats for legacy API  
    • Methods and functions that allow creating nv12 and i420 blobs 
    • InferRequest::SetBatch() method
    • InferRequest::SetBlob() method that allows setting pre-processing for a specific tensor in Infer request 
    • Legacy properties: DYN_BATCH_LIMIT and DYN_BATCH_ENABLED
  • Python 3.7 support is now deprecated and will be removed in the 2023.2 release. 
  • CPU property ov::affinity will no longer be available beginning in the 2024.0 release. During thread scheduling, the type of core used in the CPU will be identified by OpenVINO Runtime automatically. New properties are provided in the 2023.0 release to set CPU thread configuration, including: 
    • ov::hint::enable_cpu_pinning for thread pinning
    • ov::hint::scheduling_core_type for the type of core you are selecting (P core or E core) 
    • ov::hint::enable_hyper_threading for hyper-threading control 
  • Post-training Optimization Tool (POT) is being deprecated as part of the 2023.0 release. Neural Network Compression Framework (NNCF) is now recommended for post-training optimization and quantization-aware training. 
  • Starting in 2024, using models trained in Caffe, Kaldi or MXNet will require conversion to ONNX or OpenVINO IR using version 2022.3 or 2023.2 LTS. 
  • All ONNX Frontend legacy API (decorated by ONNX_IMPORTER_API) will be deprecated in the 2023.1 release. The following symbols will no longer be available as part of the 2024.0 release: 
    • ngraph::onnx_import::import_onnx_model
    • ngraph::onnx_import::is_operator_supported and ngraph::onnx_import::get_supported_operators
    • ngraph::onnx_import::register_operator and ngraph::onnx_import::unregister_operator
    • ngraph::onnx_import::Node and ngraph::onnx_import::NullNode
    • ngraph::op::is_null
    • ov::onnx_editor::ONNXModelEditor
  • OpenVINO integration with TensorFlow will no longer be supported as part of this release. If you prefer to continue using the native framework API, you can consider using the Intel Extension for TensorFlow (ITEX). Another option is to utilize the OpenVINO Model Optimizer, which enables the conversion of standard TensorFlow models. Finally, in 2023.0 TensorFlow models are now auto-converted in OpenVINO during runtime, so you no longer need to convert your model offline. 
  • OpenVINO integration with TorchORT will no longer be supported as part of this release. If you prefer to continue using the native framework API, you can consider using the Intel Extension for PyTorch (IPEX). If you are looking for performance gains on Intel hardware using OpenVINO, use the new preview PyTorch auto-conversion capability. 
  • The 'PerformanceMode.UNDEFINED' property of the OpenVINO Python API is deprecated and will be removed in the 2024.0 release. 
  • OpenVINO Runtime no longer supports the following devices: 
    • Intel® Neural Compute Stick 2 
    • Intel® Vision Accelerator Design with Intel® Movidius™ Vision Processing Units (HDDL) 
    • AI Edge Computing Board with Intel® Movidius™ Myriad™ X C0 VPU, MYDX x 1 
    • NOTE: OpenVINO 2022.3.1 LTS (coming soon) includes support for the devices listed above. Long Term Support (LTS) for OpenVINO 2022.3 ends in December 2024.  

  • The compile tool has been removed as part of this release. The recommended approach is to manually add preprocessing steps to the model and save it to an IR file. 
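
The CPU thread configuration properties listed above (ov::hint::enable_cpu_pinning, ov::hint::scheduling_core_type, and ov::hint::enable_hyper_threading) can be passed as a configuration dictionary when a model is compiled. The following is a minimal sketch, not a reference implementation. It assumes that the string keys ENABLE_CPU_PINNING, SCHEDULING_CORE_TYPE, and ENABLE_HYPER_THREADING are the names behind those properties and that ANY_CORE, PCORE_ONLY, and ECORE_ONLY are the accepted core-type values. The helper names are hypothetical.

```python
def build_cpu_thread_config(core_type="PCORE_ONLY", pinning=True, hyper_threading=False):
    """Build a CPU config dict; keys are assumed to mirror the ov::hint properties."""
    allowed = {"ANY_CORE", "PCORE_ONLY", "ECORE_ONLY"}  # assumed value names
    if core_type not in allowed:
        raise ValueError(f"core_type must be one of {sorted(allowed)}")
    return {
        "ENABLE_CPU_PINNING": pinning,
        "SCHEDULING_CORE_TYPE": core_type,
        "ENABLE_HYPER_THREADING": hyper_threading,
    }


def compile_for_cpu(model_path, config):
    """Read a model and compile it on the CPU device with the given config."""
    # Imported lazily so the config helper above works without OpenVINO installed.
    from openvino.runtime import Core

    core = Core()
    model = core.read_model(model_path)
    return core.compile_model(model, "CPU", config)
```

A call such as compile_for_cpu("model.xml", build_cpu_thread_config("ECORE_ONLY")) would then request E-core-only scheduling, provided the installed runtime accepts these keys.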

OpenVINO™ Development Tools

  • Included list of components and their changes: 
    • Conversion API (Model Optimizer) 
      • Model Optimizer functionality is now referred to as Conversion API. It now automatically imports TensorFlow models as the default path for conversion to Intermediate Representation (IR). You will see a greatly improved conversion time for TensorFlow models. Known limitations are: TensorFlow 1 Loop, Complex types, models requiring config files, and old Python extensions. Model Optimizer detects unsupported functionalities and provides the fallback solution automatically. You can still specify the --use_legacy_frontend parameter to force the legacy frontend experience if needed. 
      • Model Optimizer now supports out-of-the-box conversion of TensorFlow 2 Object Detection models, so you can run them without config files. At this point, similar performance experience is seen only on CPU.  
      • Model Optimizer now supports:
        • Conversion directly from memory for PyTorch models 
        • Using the "input" parameter without node names for new Frontends 
        • Passing a saved model directory as an unnamed parameter 
      • Conversion API no longer outputs the mapping file. All original input/output names of a model are preserved in the Intermediate Representation (IR) file and can be retrieved via both API 1.0 and API 2.0. 
    • Neural Network Compression Framework (NNCF)
      • NNCF now enables you to perform post-training optimization, as well as quantization-aware training. Try it out: pip install nncf. 
      • There are two main flows for the 8-bit post-training quantization:
        • with nncf.quantize() API for the simplest and fastest quantization 
        • with nncf.quantize_with_accuracy_control() API for advanced quantization with accuracy control. Using this mechanism, you can define the accuracy-drop criteria for NNCF to consider during quantization. 
      • Automatic Joint Pruning Quantization and Distillation (JPQD) algorithm support is now implemented for BERT, Swin, ViT, DistilBERT, CLIP, and MobileBERT models. 
      • Initial support for post-training quantization of PyTorch models with the default quantization method (MinMax plus Fast Bias Correction algorithms) is now available. 
      • Post-Training Optimization Tool (POT)  has been deprecated and will be removed in 2024.0 
    • Benchmark Tool enables you to estimate deep-learning inference performance on supported devices for both synchronous and asynchronous modes. 
    • Accuracy Checker is a deep learning accuracy validation tool that enables you to collect accuracy metrics against popular datasets.
    • Annotation Converter is a utility that prepares datasets for evaluation with Accuracy Checker. 
    • Model Downloader and other Open Model Zoo tools have been moved to maintenance mode, as a source of models. 
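
The simplest of the two post-training flows above, nncf.quantize(), can be sketched as follows. This is a minimal illustration under stated assumptions: the calibration data is assumed to be an iterable of (input, label) pairs, the model is assumed to be already loaded in a framework that NNCF supports, and the helper names transform_fn and quantize_post_training are hypothetical.

```python
def transform_fn(data_item):
    """Extract the model input from one (input, label) dataset item."""
    model_input, _label = data_item
    return model_input


def quantize_post_training(model, data_source):
    """Run 8-bit post-training quantization with nncf.quantize().

    `model` is an already-loaded model object and `data_source` is any
    iterable of (input, label) pairs used for calibration.
    """
    # Imported lazily; requires `pip install nncf`.
    import nncf

    calibration_dataset = nncf.Dataset(data_source, transform_fn)
    return nncf.quantize(model, calibration_dataset)
```

The transform function is the only part that depends on the dataset layout, so it is kept separate from the quantization call.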

OpenVINO™ Runtime (previously known as Inference Engine)

  • Overall updates 
    • Resizing input images in the preprocessor module now supports two modes from the Pillow library - BILINEAR and BICUBIC. 
    • OpenVINO's graph representation has been upgraded to opset11, introducing a new set of operations that offer enhanced functionality and optimizations. See the full list of operations here along with supported plugins.
  • OpenVINO Python API
    • Enhanced inference functionality with the 'shared_memory' attribute has been added as part of inference-related functions of CompiledModel, InferRequest, and AsyncInferQueue. When using the 'shared_memory' mode, it facilitates data sharing between the host and OpenVINO via the Tensor API, enabling "zero-copy" inputs that meet the necessary requirements.  Additionally, the 'shared_memory' attribute has been introduced to the OpenVINO Constant class, allowing for zero-copy memory operations. 
    • Direct creation of OpenVINO Constant and Tensor classes from numpy array scalars is now possible. 
    • Improved Model class by adding support to the deepcopy interface, providing greater flexibility in managing model instances. 
    • The 'Core.read_model' function now accepts new input types, including 'pathlib.Path' and 'io.BytesIO', enabling more convenient model loading. 
    • The 'runtime.passes.Serialize' function has been extended to accept new argument types, enhancing serialization capabilities. 
    • The description of Python dependencies has been centralized and improved for easier maintenance. 
    • ONNX, the OpenVINO third-party dependency, has been upgraded to version 1.13.1, ensuring compatibility and leveraging the latest features. 
    • Additional operators, Interpolate and TopK, have been added to the Python API, expanding the range of supported operations.
  • OpenVINO C API
    • C API 2.0 now includes support for the remote tensor, which can be found in graph memory for example.  
  • AUTO device plug-in (AUTO)
    • Now AUTO can leverage ov::device::properties, to configure the hardware device used in execution. For example, set the number of streams used by the CPU. 
    • AUTO uses the CPU as the initial device for inference when a different accelerator is selected as the target device. A new configuration option can be used to turn off the feature, to speed up first-inference latency: ov::intel_auto::enable_startup_fallback(false) 
    • AUTO now features automatic device selection and falls back on the CPU to allow continuous execution without additional errors.  
    • The behavior of MULTI is now fully aligned with AUTO using cumulative throughput performance hint. This removes support for two configuration options for MULTI: 1) Changing device priorities via CompiledModel::set_property() on the fly; 2) defining the number of requests to allocate for each device, such as  "MULTI:CPU(2),GPU(2)". The latter method is not performance-portable, and as such, not recommended. 
    • ov::hint::inference_precision enables running network inference independently of the IR precision. 
  • Intel® CPU 
    • The CPU device is enabled with BF16 data types, so that quantized models (INT8) can be run with BF16 plus INT8 mixed precision, taking full advantage of the AMX capability of 4th Generation Intel® Xeon® Scalable Processors (formerly Sapphire Rapids). The user sees the BF16/INT8 advantage, by default. 
    • Performance for Transformer models for NLP pipelines on CPU is now increased, especially for int8 models. 
    • Pre-built oneTBB 2021.2.2 is now included in OpenVINO for all platforms without a system-default TBB library, including Windows, CentOS7, and macOS. The default TBB library is used when available, for example, in Ubuntu. The goal is to minimize disruption to the user’s system and maintain the highest backward compatibility. 
    • ARM processor support has expanded: 
      • Increased model coverage 
      • Dynamic shapes have been enabled 
      • Performance boosted for many models including BERT
      • Validated for Raspberry Pi 4 and Apple® Mac M1/M2 
  • Intel® GPU 
    • Dynamic shapes are now supported for GPU inference. This means that Natural Language Processing models now work on all the iGPU and dGPU platforms. More models will be supported in the future OpenVINO releases.
    • Model caching has been improved and its support expanded for inference on GPU. Additionally, it uses the same interface as for other devices, and First Inference Latency (FIL) has been significantly reduced.
    • First Ever Inference Latency (FEIL) has been reduced for most models for GPU inference.
    • Peak memory consumption has been optimized, allowing significant memory usage reduction. 
    • Inference performance of transformer models, such as Bert-like or Vision-transformer, has been improved on dGPU platforms. 
    • Up to 8D tensors are now supported. 
  • Intel® Gaussian & Neural Accelerator (Intel® GNA) 
    • A bug that led to inaccurate results has been fixed for certain models produced by Post Training Optimization (POT).
  • Model Import Updates  
    • Intermediate Representation (IR) uses mmap reading by default, which enables reading only the necessary parts of the model when needed. This saves memory when working with large models. However, if the model files are not easily accessible, it may slow down the first use of a model. In such a case, you can disable mmap through the ov::enable_mmap core property to fall back on the original way of reading the model. 
    • FrontEndManager register_front_end (name, lib_path) interface is added, to remove “OV_FRONTEND_PATH” env var (a way to load non-default frontends). 
    • PyTorch users can now call the convert_model Python API directly from their code, without the need to export to the ONNX format, as part of a new preview feature. 
    • TensorFlow - Now standard TensorFlow models are imported directly to OpenVINO Runtime and OpenVINO Model Server. Models are converted automatically. 
      • New model formats supported for read_model case: SavedModel (.pb), MetaGraph (.meta), and text Protobuf format (.pbtxt) for frozen models. 
      • TensorFlow 2 body-graph operations (While, If, PartitionedCall), TensorList* operations (TensorListFromTensor, TensorListGetItem, TensorListSetItem, TensorListStack, TensorListReserve), and RNN operations are now supported. 
      • Auto-pruning functionality for Iterator, IteratorGetNext, Queue, and Lookup operations has been implemented. 
      • Conversion extensions with named output ports have been introduced (mainly needed for TensorFlow 2 models conversion). 
      • In the SavedModel format, output tensor names are fixed, and single names are used for inputs and outputs of the model. 
      • universal-sentence-encoder-multilingual model using extensions published in openvino_contrib is now supported. 
    • TensorFlow Lite users are now able to load models directly via “read_model” into OpenVINO Runtime and export to the OpenVINO format or Intermediate Representation using model conversion API (Model Optimizer) or “convert_model.” 
    • ONNX model support updates include the following: 
      • AdaptivePool2d (from custom PyTorch opset)  
      • DFT-17  
      • Unique-11 
      • STFT-17
      • TopK-11 support has been extended to align with its specification by the ONNX opset.
      • Models that are not topologically sorted are now supported. 
      • ONNX (the OpenVINO third-party dependency) has been upgraded to version 1.13.1 
    • Paddle support updates include:
      • Paddle framework support has been moved to version 2.4. 
      • Paddle quantized models from Paddle Slim, such as mobilenet-v1, resnet-50, yolov-5s, and ernie 3.0 are now supported. 
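
The mmap fallback and the new pathlib.Path input type described in this section can be combined in a short sketch. It is only an illustration: it assumes that "ENABLE_MMAP" is the string key behind the ov::enable_mmap core property, and the helper names are hypothetical.

```python
from pathlib import Path


def mmap_config(enable):
    """Return a core property dict; "ENABLE_MMAP" is assumed to name ov::enable_mmap."""
    return {"ENABLE_MMAP": bool(enable)}


def read_model_without_mmap(model_xml):
    """Read an IR model with mmap reading turned off."""
    # Imported lazily so the helper above works without OpenVINO installed.
    from openvino.runtime import Core

    core = Core()
    # Fall back on the original (non-mmap) way of reading model files.
    core.set_property(mmap_config(False))
    # read_model accepts a pathlib.Path as of this release.
    return core.read_model(Path(model_xml))
```

Disabling mmap trades the memory savings for more predictable first-use timing when model files sit on slow or shared storage.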

Distribution (where to download the release)

The OpenVINO product selector tool provides the easiest access to the right packages that match your desired tools, OS, version, and distribution options. 

OpenVINO Model Server

OpenVINO Model Server (OVMS) is a solution for serving models. The tool uses the same API endpoints as TensorFlow Serving and KServe while leveraging OpenVINO for inference execution. See the full release notes on GitHub.  

  • Now you can submit inference requests and read the response in the form of strings. You can  do that using custom nodes and OpenVINO models with a CPU extension handling string data: 
    • Using a custom node in a DAG pipeline that can perform string tokenization before passing it to the OpenVINO model - this may be beneficial for models without the tokenization layer to fully delegate that preprocessing to the model server. 
    • Using a custom node in a DAG pipeline that can perform string detokenization of the model response to convert it to the string format - this may be beneficial for models without the detokenization layer to fully delegate that postprocessing to the model server. 
    • For models with the tokenization layer, like MUSE, there is a CPU extension added that implements the sentence_piece_tokenization layer. Users can pass to the model a string which is automatically converted to the format required by the CPU extension. 
    • The first two options here are demonstrated with a GPT model for text generation, while the third option is demonstrated in the MUSE model usage demo. 
  • A preview version of OVMS with the MediaPipe framework lets you use OVMS calls to perform MediaPipe graph processing. Calculators that perform OpenVINO inference via CAPI calls from OVMS, as well as calculators that convert the OV::Tensor input format to the MediaPipe image format, have been created, building a foundation for creating arbitrary graphs. 
  • CAPI interface has been extended with ApiVersion and Metadata calls. 
  • The TensorFlow saved_model format is now supported. 
  • KServe API unification with Triton implementation for handling string and encoded image formats has been introduced. 
  • The default performance hint is now LATENCY. 
  • Memory handling after unloading the models has been improved.  
  • Relative paths to model files are now supported (relative to the config.json location).  
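
The relative-path rule in the last item can be illustrated with a small sketch. This is not OVMS code. It only shows how a relative base_path would be anchored at the directory that holds config.json. It assumes the model_config_list / config / name / base_path layout of an OVMS configuration file and handles local filesystem paths only. The helper name and example paths are hypothetical.

```python
import json
from pathlib import PurePosixPath


def resolve_base_paths(config_path, config_text):
    """Map each model name to its base_path, resolved against the config.json directory."""
    config_dir = PurePosixPath(config_path).parent
    resolved = {}
    for entry in json.loads(config_text).get("model_config_list", []):
        cfg = entry["config"]
        base = PurePosixPath(cfg["base_path"])
        # Absolute paths are kept; relative ones are anchored at the config.json directory.
        resolved[cfg["name"]] = str(base if base.is_absolute() else config_dir / base)
    return resolved


example = '{"model_config_list": [{"config": {"name": "resnet", "base_path": "models/resnet"}}]}'
print(resolve_base_paths("/opt/ovms/config.json", example))
```

Anchoring at the configuration file rather than the working directory means the same config.json works no matter where the server process is started.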


OpenVINO Ecosystem


  • Improve performance of sparse Transformer models
  • Deploy a model server and request predictions from a client application
  • Improve performance of image preprocessing step
  • Partition an audio stream containing human speech into homogeneous segments according to the identity of each speaker.
  • Perform grammatical error correction using OpenVINO
  • Perform Zero-shot Image Classification with CLIP and OpenVINO
  • Sequence Classification with OpenVINO
  • Optimize YOLOv8 using OpenVINO and NNCF Post-training Quantization API
  • Image editing with InstructPix2Pix using OpenVINO
  • Language-Visual Saliency with CLIP and OpenVINO
  • Visual Question Answering and Image Captioning using BLIP and OpenVINO
  • Audio compression with EnCodec and OpenVINO
  • Use ControlNet and Stable Diffusion for Image Generation with OpenVINO
  • Text-to-Image Generation and Infinite Zoom with Stable Diffusion v2 and OpenVINO
  • Prompt based object segmentation mask generation using Segment Anything and OpenVINO
  • Person tracking with a webcam or video file

Known Issues


1. Description: Performance and memory consumption may not be optimal if layers are not 64-byte aligned.
   Component: GNA plugin
   Workaround: Avoid layers that are not 64-byte aligned to make the model GNA-friendly.

2. Description: Bug in passing numpy arrays constructed with frombuffer.
   Component: IE Python
   Example: arr1 = np.frombuffer(bytearray(b'\x01\x02\x03\x04'), np.uint8)

3. Description: On 3rd Generation Intel Xeon platforms (code name ICX) in dual-socket hosts, using all CPU cores for single-network inference with the "latency" hint may show higher latency than the OpenVINO 2022.3 release on Ubuntu 22.04 systems with the default oneTBB libraries.
   Component: CPU Plugin
   Workaround: Ubuntu 22.04 is missing the TBBBind library in the system-default oneTBB installation. This results in different underlying inference streams (nstreams) and threads per stream (nthreads) between the 2022.3 and 2023.0 releases when using the "latency" performance hint. If the desired use case is to leverage all the CPU cores in the dual-socket host for running inference on one network, manually set "-hint none -nstreams 1" to restore the latency performance.

4. Description: On 12th Generation Intel Core platforms (code name ADL), some quantized (INT8) networks experience higher latency when using the "latency" performance hint.
   Component: CPU Plugin
   Details: When fusing compute ops together to improve efficiency, a performance issue was observed in the underlying library (oneDNN) when fusing per-tensor quantization as the destination scale on AVX2-VNNI platforms. A solution will be provided once the oneDNN dependency is resolved.

5. Jira ID: 76190
   Description: ChannelAlignment support request
   Component: NNCF
   Workaround: Use POT to get more accurate results for MobileNet models on non-VNNI hardware.

6. Jira ID: 112503
   Description: The following models will not work with PyTorch 2.0.0+: nfnet-f0, pspnet-pytorch, text-recognition-resnet-fc.
   Component: Open Model Zoo
   Workaround: You need to lower the PyTorch version to convert these specific models.

7. Jira ID: 112712
   Description: Issues occurred in POT usage on Windows platforms. The problem was caused by accessing the same files in read and write modes. This resulted in errors on Windows due to the default usage of the MMap allocator (enabled in 2023.0). Corresponding fixes were implemented in POT and the accuracy checker to disable the MMap in OpenVINO Core. However, users may still encounter similar problems in their applications when simultaneously reading/writing to the same file.
   Workaround: Follow this guide to disable the default switch to the MMap allocator. The issue is fixed for POT quantization on Windows; see the PR in the OpenVINO repository for more information.

8. Jira ID: 113384
   Description: Due to OS specifics, it is not possible to properly handle a directory in read_model() on Windows. This results in a failure to directly import a TensorFlow saved_model directory using the read_model() call.
   Component: OpenVINO Core
   Workaround: The issue is fixed in the following PR in the OpenVINO repository.
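
The "-hint none -nstreams 1" setting from the dual-socket latency issue above can be assembled into a command line as sketched below. The release notes do not name the tool, so this sketch assumes the flags are meant for benchmark_app and that -m and -d select the model and device. The helper name and model path are hypothetical.

```python
import shlex


def latency_workaround_cmd(model_path, device="CPU"):
    """Build a benchmark_app argument list using the '-hint none -nstreams 1' workaround."""
    return [
        "benchmark_app",
        "-m", model_path,
        "-d", device,
        "-hint", "none",    # turn off the performance hint
        "-nstreams", "1",   # run a single inference stream across all cores
    ]


cmd = latency_workaround_cmd("model.xml")
print(shlex.join(cmd))
```

The list form can be handed to subprocess.run(cmd, check=True) on a machine where the tool is installed.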

System Requirements

Disclaimer. Certain hardware (including but not limited to GPU and GNA) requires manual installation of specific drivers to work correctly. Drivers might require updates to your operating system, including the Linux kernel; please refer to their documentation. Operating system updates should be handled by the user and are not part of OpenVINO installation. 

Intel CPU processors with corresponding operating systems 

Intel Atom® processor with Intel® SSE4.2 support 

Intel® Pentium® processor N4200/5, N3350/5, N3450/5 with Intel® HD Graphics 

6th - 13th generation Intel® Core™ processors 

Intel® Xeon® Scalable Processors (code name Skylake) 

2nd Generation Intel® Xeon® Scalable Processors (code name Cascade Lake) 

3rd Generation Intel® Xeon® Scalable Processors (code name Cooper Lake and Ice Lake) 

4th Generation Intel® Xeon® Scalable Processors (code name Sapphire Rapids) 

Operating Systems:

  • Ubuntu 22.04 long-term support (LTS), 64-bit (Kernel 5.15+)
  • Ubuntu 20.04 long-term support (LTS), 64-bit (Kernel 5.15+)
  • Ubuntu 18.04 long-term support (LTS) with limitations, 64-bit (Kernel 5.4+)
  • Windows* 10 
  • Windows* 11 
  • macOS* 10.15 and above, 64-bit 
  • Red Hat Enterprise Linux* 8, 64-bit

Intel® Processor Graphics with corresponding operating systems (GEN Graphics) 

Intel® HD Graphics 

Intel® UHD Graphics 

Intel® Iris® Pro Graphics 

Intel® Iris® Xe Graphics 

Intel® Iris® Xe Max Graphics 

Intel® Arc™ GPU Series 

Intel® Data Center GPU Flex Series  

Operating Systems:

  • Ubuntu* 22.04 long-term support (LTS), 64-bit
  • Ubuntu* 20.04 long-term support (LTS), 64-bit
  • Windows* 10, 64-bit 
  • Windows* 11, 64-bit
  • Red Hat Enterprise Linux* 8, 64-bit


  • This installation requires drivers that are not included in the Intel® Distribution of OpenVINO™ toolkit package. 
  • A chipset that supports processor graphics is required for Intel® Xeon® processors. Processor graphics are not included in all processors. See  Product Specifications for information about your processor. 
  • Although this release works with Ubuntu 20.04 for discrete graphic cards, Ubuntu 20.04 is not POR for discrete graphics drivers, so OpenVINO support is limited. 
  • Recommended OpenCL™ driver versions: 22.43 for Ubuntu* 22.04, 22.41 for Ubuntu* 20.04, and 22.28 for Red Hat Enterprise Linux* 8 

Intel® Gaussian & Neural Accelerator 

Operating Systems:

  • Ubuntu* 22.04 long-term support (LTS), 64-bit
  • Ubuntu* 20.04 long-term support (LTS), 64-bit
  • Windows* 10, 64-bit 
  • Windows* 11, 64-bit


Operating system and developer environment requirements:

  • Linux* OS
    • Ubuntu 22.04 with Linux kernel 5.15+
    • Ubuntu 20.04 with Linux kernel 5.15+
    • RHEL 8 with Linux kernel 5.4
    • A Linux OS build environment needs these components:
    • Higher versions of kernel might be required for 10th Gen Intel® Core™ Processor, 11th Gen Intel® Core™ Processors, 11th Gen Intel® Core™ Processors S-Series Processors, 12th Gen Intel® Core™ Processors, 13th Gen Intel® Core™ Processors,  or 4th Gen Intel® Xeon® Scalable Processors to support CPU, GPU, GNA or hybrid-cores CPU capabilities 
  • Windows* 10 and 11 
  • macOS* 10.15 and above 
  • DL frameworks versions:
    • TensorFlow* 1.15, 2.12
    • MXNet* 1.9
    • ONNX* 1.13
    • PaddlePaddle* 2.4
    • Note: This package can be installed on other versions of DL frameworks, but only the versions specified above are fully validated.

NOTE: OpenVINO Python binaries and binaries on Windows, CentOS7, and macOS (x86) are built with oneTBB libraries. Other binaries on Ubuntu and Red Hat systems are built with the legacy TBB that is released with the OS distribution. OpenVINO can be built by a user with either oneTBB or legacy TBB on all of the operating systems above. System compatibility and performance were improved on hybrid CPUs such as 12th Gen Intel Core and above.

Included in This Release

The Intel® Distribution of OpenVINO™ toolkit is available for download for three operating systems: Windows*, Linux*, and macOS*. 

  • OpenVINO (Inference Engine) C++ Runtime
    • Unified API to integrate the inference with application logic
    • OpenVINO (Inference Engine) Headers
    • License: Dual licensing: Intel® OpenVINO™ Distribution License (Version May 2021) and Apache 2.0
  • OpenVINO (Inference Engine) Python API
    • License: Apache 2.0
  • OpenVINO (Inference Engine) Samples
    • Samples that illustrate OpenVINO C++/Python API usage
    • License: Apache 2.0
  • Deployment manager
    • The Deployment Manager is a Python* command-line tool that creates a deployment package by assembling the model, IR files, your application, and associated dependencies into a runtime package for your target device.
    • License: Apache 2.0

Helpful Links


Home Page

Featured Documentation

All Documentation, Guides, and Resources

Community Forum

Legal Information 

You may not use or facilitate the use of this document in connection with any infringement or other legal analysis concerning Intel products described herein. You agree to grant Intel a non-exclusive, royalty-free license to any patent claim thereafter drafted which includes subject matter disclosed herein. 

No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document. 

All information provided here is subject to change without notice. Contact your Intel representative to obtain the latest Intel product specifications and roadmaps. 

The products described may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request. 

Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Learn more at or from the OEM or retailer. 

No computer system can be absolutely secure. 

Intel, Atom, Arria, Core, Movidius, Xeon, OpenVINO, and the Intel logo are trademarks of Intel Corporation in the U.S. and/or other countries. 

OpenCL and the OpenCL logo are trademarks of Apple Inc. used by permission by Khronos. 

*Other names and brands may be claimed as the property of others. 

Copyright © 2023, Intel Corporation. All rights reserved. 

For more complete information about compiler optimizations, see our Optimization Notice.