Intel® Distribution of OpenVINO™ Toolkit Release Notes

ID 780177
Updated 4/9/2024
Version: Public


New and Changed in 2023.1 

Summary of major features and improvements  

More Generative AI options with Hugging Face and improved PyTorch model support. 

  • NEW: Your PyTorch solutions are now further enhanced with OpenVINO. You have more options and no longer need to convert to ONNX for deployment. Developers can use their API of choice, PyTorch or OpenVINO, for added performance benefits. Additionally, PyTorch models can be automatically imported and converted for quicker deployment, while OpenVINO tools remain available for advanced model compression and deployment advantages, ensuring flexibility and a range of options. 

  • torch.compile (preview) – OpenVINO is now available as a backend through PyTorch torch.compile, empowering developers to utilize the OpenVINO toolkit through PyTorch APIs (see the sketch after this list). This feature has also been integrated into the Automatic1111 Stable Diffusion Web UI, helping developers achieve accelerated performance for Stable Diffusion 1.5 and 2.1 on Intel CPUs and GPUs on both native Linux and Windows platforms. 

  • Optimum Intel – Hugging Face and Intel continue to enhance top generative AI models by optimizing execution, making your models run faster and more efficiently on both CPU and GPU. OpenVINO serves as a runtime for inference execution. New PyTorch auto import and conversion capabilities have been enabled, along with support for weights compression to achieve further performance gains. 
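
The snippet below is a minimal sketch of the torch.compile integration described above; the torchvision ResNet-50 model and input shape are illustrative assumptions, and any torch.nn.Module can be compiled the same way.

    import torch
    import torchvision
    import openvino.torch  # registers the "openvino" backend for torch.compile

    # Illustrative model; replace with your own torch.nn.Module
    model = torchvision.models.resnet50(weights=None).eval()

    # Route execution through OpenVINO; the first call triggers graph capture and compilation
    compiled_model = torch.compile(model, backend="openvino")

    with torch.no_grad():
        output = compiled_model(torch.rand(1, 3, 224, 224))

Device selection and other backend options are described in the OpenVINO torch.compile documentation.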

Broader LLM model support and more model compression techniques 

  • Enhanced performance and accessibility for Generative AI:  Runtime performance and memory usage have been significantly optimized, especially for Large Language models (LLMs). Models used for chatbots, instruction following, code generation, and many more, including prominent models like BLOOM, Dolly, Llama 2, GPT-J, GPTNeoX, ChatGLM, and Open-Llama have been enabled. 

  • Improved LLMs on GPU – Model coverage for dynamic shapes support has been expanded, further helping the performance of generative AI workloads on both integrated and discrete GPUs. Furthermore, memory reuse and weight memory consumption for dynamic shapes have been improved.   

  • Neural Network Compression Framework (NNCF) now includes an 8-bit weights compression method, making it easier to compress and optimize LLMs (see the sketch below). The SmoothQuant method has been added for more accurate and efficient post-training quantization of Transformer-based models. 
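
As a quick illustration of the 8-bit weights compression workflow, the sketch below assumes a hypothetical ONNX export of an LLM; any openvino.Model obtained through conversion, or read from an existing IR, can be compressed the same way.

    import nncf
    import openvino as ov

    # Hypothetical ONNX export of an LLM; the file name is illustrative
    model = ov.convert_model("llm.onnx")

    # 8-bit weights compression: weights are stored in INT8,
    # activations keep their original precision
    compressed_model = nncf.compress_weights(model)

    ov.save_model(compressed_model, "llm_int8.xml")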

More portability and performance to run ​AI at the edge, in the cloud or locally. 

  • NEW: Support for Intel® Core™ Ultra (codename Meteor Lake). This new generation of Intel CPUs is tailored to excel in AI workloads with a built-in inference accelerator. 

  • Integration with MediaPipe – Developers now have direct access to this framework for building multipurpose AI pipelines. Easily integrate with OpenVINO Runtime and OpenVINO Model Server to enhance performance for faster AI model execution. You also benefit from seamless model management and version control, as well as custom logic integration with additional calculators and graphs for tailored AI solutions. Lastly, you can scale faster by delegating deployment to remote hosts via gRPC/REST interfaces for distributed processing. 

Support Change and Deprecation Notices 

  • OpenVINO™ Development Tools (pip install openvino-dev) are currently being deprecated and will be removed from installation options and distribution channels with 2025.0. 

  • Runtime:
    •  Intel® Gaussian & Neural Accelerator (Intel® GNA) is being deprecated; the GNA plugin will be discontinued with 2024.0. 
    • The shared_memory argument for Python API inference methods is deprecated and replaced by a new share_inputs argument (see the sketch after this list). 
    • OpenVINO C++/C/Python 1.0 APIs will be discontinued with 2024.0. 
    • Python 3.7 will be discontinued with 2023.2 release.
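
For reference, the sketch below shows the new share_inputs argument that replaces the deprecated shared_memory argument; the model path, device, and input shape are illustrative assumptions.

    import numpy as np
    from openvino.runtime import Core

    core = Core()
    compiled = core.compile_model("model.xml", "CPU")  # illustrative IR path and device
    request = compiled.create_infer_request()

    data = np.zeros((1, 3, 224, 224), dtype=np.float32)  # illustrative input shape

    # With share_inputs=True the input array is passed to the plugin
    # without an extra copy ("zero-copy" mode)
    results = request.infer({0: data}, share_inputs=True)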

OpenVINO™ Development Tools

  • List of components and their changes: 
    • A preview of the new OpenVINO converter tool (OVC) has been introduced. This tool offers functionality similar to Model Optimizer and is designed as its lightweight version, with the following differences:  
      • Pre-processing options such as layout, channel reversal, mean, and scale should be applied through the preprocessing API; they are not supported in OVC.
      • The model file is specified without the input_model parameter, and the source framework is detected automatically. 
    • Conversion API (Model Optimizer)   
      • convert_model Python API is now available in the openvino namespace.
      • The Model Optimizer tool generates an Intermediate Representation (IR) file with compressed weights by default; the --compress_to_fp16 option can be used to control this behavior. convert_model keeps the original weights for the generated OpenVINO Model object (a conversion sketch follows this list). 
    • Neural Network Compression Framework (NNCF)
      • Added SmoothQuant method for more accurate Post-training Quantization of Transformer-based models.  
      • Introduced new API nncf.compress_weights() and preliminary support for 8-bit weights compression method for OpenVINO and PyTorch LLMs.
      • Added a hyperparameter tuning method to Post-training Quantization. When enabled, it automatically finds hyperparameters for the most efficient quantization results. 
      • Extended Post-training Quantization for OpenVINO by ChannelAlignment algorithm for more accurate quantization results. 
      • Extended Post-training Quantization for PyTorch by Fast Bias Correction algorithm for more accurate quantization results. For more details, refer to NNCF Release Notes.  
    • Benchmark Tool enables you to estimate deep-learning inference performance on supported devices for both synchronous and asynchronous modes.  
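
A minimal sketch of the conversion flow described above is shown below; the ONNX file name is an illustrative assumption, and the source framework is detected automatically.

    import openvino as ov

    # convert_model is available directly in the openvino namespace;
    # the source framework (ONNX here) is detected from the file
    ov_model = ov.convert_model("model.onnx")  # illustrative model file

    # save_model writes the IR; FP16 weight compression is the default and can be disabled
    ov.save_model(ov_model, "model.xml", compress_to_fp16=True)

The resulting IR can then be profiled with the Benchmark Tool, for example: benchmark_app -m model.xml -d CPU.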

OpenVINO™ Runtime (previously known as Inference Engine)

  • Overall updates 
    • Proxy & hetero plugins have been migrated to API 2.0, providing enhanced compatibility and stability. 
    • Symbolic shape inference preview is now available, leading to improved performance for LLMs. 
    • OpenVINO's graph representation has been upgraded to opset12, introducing a new set of operations that offer enhanced functionality and optimizations. See the opset12 specification for the full list of operations.
  • OpenVINO Python API
    • Python Conversion API is now the primary conversion path, making it easier for Python developers to work with OpenVINO. 
    • Python API inference methods (such as InferRequest.infer and CompiledModel.__call__) have new parameters, share_inputs and share_outputs, which allow you to control memory sharing on inference inputs and outputs. Enabling shared memory modes results in a "zero-copy" approach, which reduces memory consumption and compute overhead by reducing the number of copies. 
    • The torchvision.transforms object has been added to OpenVINO pre-processing, which allows users to embed torchvision pre-processing into the IR. 
    • All Python tools related to OpenVINO have been moved into a single namespace, improving the user experience with better API readability. 
  • OpenVINO C API
    • The following C API 1.0 is removed from the 2023.1 release, as communicated in the 2023.0 release: 
      • NV12 and I420 color formats for legacy API  
      • Methods and functions enabling nv12 and i420 blobs creation 
      • InferRequest::SetBatch() method 
      • InferRequest::SetBlob() method, which allows setting pre-processing for a specific tensor in the infer request
      • Legacy properties: DYN_MATCH_LIMIT and DYN_BATCH_ENABLED 
    • The legacy C API is deprecated and will be removed in the 2024.0 release. Refer to the documentation for instructions on transitioning to the new C API.  
  • AUTO device plug-in (AUTO)
    • Improved support for configuring the execution hardware devices, such as CPU or GPU, through AUTO by leveraging ov::device::properties. 
  • Intel® CPU 
    • Enabled weights decompression support for LLMs. This implementation supports AVX2 and AVX-512 hardware targets on Intel® Core™ processors, improving latency mode (FP32 vs. FP32+INT8 weights). For 4th Generation Intel® Xeon® Scalable Processors (codename Sapphire Rapids), this INT8 decompression feature provides a performance improvement compared to pure BF16 inference. 
    • Improved performance of LLMs, particularly Transformer-based models, in the CPU plugin. Optimizing memory efficiency for output data between the CPU plugin and the inference request improves the performance of the matrix multiplication operator and of other operators such as LayerNorm and MVN. 
    • Reduced memory consumption for LLMs, especially for models with 7 billion parameters or above.  
    • Reduced overall model load and compile time, particularly for LLMs.  
    • Added a new capability that gives developers full control over tuning CPU core usage for inference, including P-cores, E-cores, and hyper-threading, on supported CPU platforms.  
  • Intel® GPU 
    • Improved performance of generative AI models. Inference performance of LLMs is significantly improved on both iGPU and dGPU. Stable Diffusion performance is also improved on iGPU. 
    • Dynamic shape support has been expanded to convolutional models by enabling more operators to support dynamism. Performance of dynamic shapes has also been improved for NLP models. 
    • Improved performance for Stable Diffusion models. 
    • Performance of transformer models, such as BERT-like models and vision transformer models, is improved on dGPU with the integration of oneDNN 3.2. 
  • Intel® Gaussian & Neural Accelerator (Intel® GNA) 
    • Introduced support for GNA 3.5, the new version of GNA hardware included in Intel® Core™ Ultra (codename Meteor Lake). 

    • C++ and Python automatic speech recognition samples are being deprecated and will be removed with 2024.0.  

    • Optimized data pre-processing using AVX instructions on CPU has improved the runtime performance. 

    • Introduced support for automatic layout conversion, which enables more TensorFlow models to work out of the box.  

  • Model Import Updates  
    • TensorFlow Framework Support
      • Added support for Switch/Merge operations, bringing TensorFlow 1.x control flow support closer to full compatibility and enabling more models. 
      • Added support for the TensorFlow 1 Checkpoint format. All native TensorFlow formats are now enabled. 
      • Added support for 12 new operations:  
        • UnsortedSegmentSum 

        • FakeQuantWithMinMaxArgs 

        • MaxPoolWithArgmax 

        • UnravelIndex 

        • AdjustContrastv2 

        • InvertPermutation 

        • CheckNumerics 

        • DivNoNan 

        • EnsureShape 

        • ShapeN 

        • Switch 

        • Merge

    • PyTorch Framework Support
      • OpenVINO now supports PyTorch models quantized in the source framework, including models produced with the Neural Network Compression Framework (NNCF). 
      • Added support for in-place operations on tensor aliases, improving accuracy for detection models. 
      • Added support for 43 new operations. To learn about PyTorch model conversion, see the OpenVINO documentation; a conversion sketch follows the operation list below.   
        • aten::concat 

        • aten::masked_scatter 

        • aten::linspace 

        • aten::view_as 

        • aten::std 

        • aten::outer 

        • aten::broadcast_to 

        • aten::all 

        • aten::embedding_bag 

        • aten::argmax 

        • aten::argmin 

        • aten::unflatten 

        • aten::item 

        • aten::frobenius_norm 

        • aten::__range_length 

        • aten::__derive_index 

        • aten::cdist 

        • aten::pairwise_distance 

        • aten::squeeze 

        • aten::LogSoftmax 

        • aten::_native_multi_head_attention 

        • aten::_shape_as_tensor 

        • aten::t 

        • aten::fft_rfftn 

        • aten::fft_irfft 

        • aten::pad 

        • aten::reflection_pad2d 

        • aten::fake_quantize_per_tensor_affine 

        • aten::fake_quantize_per_channel_affine 

        • aten::scatter 

        • aten::quantize_per_tensor 

        • aten::quantize_per_channel 

        • aten::dequantize 

        • aten::rand 

        • aten::randn 

        • aten::rand_like 

        • aten::randn_like 

        • aten::broadcast_to 

        • aten::_upsample_bilinear2d_aa 

        • aten::_upsample_bicubic2d_aa 

        • aten::randint 

        • aten::index_put_ 

        • aten::tensor_split 
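
As referenced in the PyTorch framework support notes above, the sketch below shows direct conversion of a PyTorch module without an intermediate ONNX export; the torchvision ResNet-50 model and input shape are illustrative assumptions.

    import torch
    import torchvision
    import openvino as ov

    # Illustrative torchvision model; any torch.nn.Module can be passed directly
    pt_model = torchvision.models.resnet50(weights=None).eval()
    example_input = torch.rand(1, 3, 224, 224)

    # Direct PyTorch conversion: no intermediate ONNX export is required
    ov_model = ov.convert_model(pt_model, example_input=example_input)
    ov.save_model(ov_model, "resnet50.xml")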

Distribution (where to download the release) 

The OpenVINO product selector tool (available at www.openvino.ai) provides easy access to the right packages that match your desired needs: OS, version, and distribution options.  

OpenVINO Ecosystem 

OpenVINO Model Server 

OpenVINO Model Server (OVMS) is a solution for serving models. The tool uses the same API endpoints as TensorFlow Serving and KServe while leveraging OpenVINO for inference execution. See the full release notes at https://github.com/openvinotoolkit/model_server/releases 

Open Model Zoo 

The following public models are deprecated and will be removed in the 2023.2 release: 

  • All Caffe models  
    • alexnet 

    • caffenet 

    • densenet-121 

    • face-detection-retail-0044 

    • googlenet-v1 

    • googlenet-v2 

    • mobilenet-ssd 

    • mobilenet-v1-1.0-224 

    • mobilenet-v2 

    • mtcnn 

    • pelee-coco 

    • se-inception 

    • se-resnet-50 

    • se-resnext-50 

    • shufflenet-v2-x0.5 

    • Sphereface 

    • squeezenet1.0 

    • squeezenet1.1 

    • ssd300 

    • ssd512 

    • vgg16 

    • vgg19 

  • All MXNet models  
    • brain-tumor-segmentation-0001 

    • mobilefacedet-v1-mxnet 

    • octave-resnet-26-0.25 

  • All Paddle models  
    • mobilenet-v3-large-1.0-224-paddle 

    • mobilenet-v3-small-1.0-224-paddle 

    • ocrnet-hrnet-w48-paddle 

  • DeblurGAN-v2

Jupyter Notebook Tutorials 

Since the 2023.0 release, a number of new notebooks have been added, along with 8-bit quantization tutorials for existing notebooks. 

Known Issues 

  

Jira ID: 118179 
Description: When input byte sizes match, inference methods accept incorrectly shaped inputs in copy mode (share_inputs=False). Example: [1, 4, 512, 512] is allowed when [1, 512, 512, 4] is required by the model. 
Component: Python API, Plugins 
Workaround: Pass inputs whose shape and layout match those of the model. 

Jira ID: 119142 
Description: Reading a TensorFlow model directly from memory using convert_model causes unpredictable behavior, such as inference result mismatches, wrong output names, and degraded performance. 
Component: Conversion API, TensorFlow FE 
Workaround: Serialize the TensorFlow model to a file and then pass the file to convert_model (see the sketch below). 
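
The sketch below illustrates the workaround for issue 119142; the Keras model and file paths are illustrative assumptions.

    import tensorflow as tf
    import openvino as ov

    # Illustrative Keras model; the workaround is to serialize it to disk first
    model = tf.keras.applications.MobileNet()
    model.save("saved_model_dir")  # writes a TensorFlow SavedModel directory

    # Convert from the serialized path instead of the in-memory object
    ov_model = ov.convert_model("saved_model_dir")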

 

System Requirements 

Disclaimer. Certain hardware (including but not limited to GPU, GNA, and the latest CPUs) requires manual installation of specific drivers and/or other software components to work correctly and/or to utilize hardware capabilities at their best. This might require updates to the operating system, including but not limited to the Linux kernel; please refer to its documentation for details. These modifications should be handled by the user and are not part of the OpenVINO installation.  

Intel CPU processors with corresponding operating systems  

Intel Atom® processor with Intel® SSE4.2 support  

Intel® Pentium® processor N4200/5, N3350/5, N3450/5 with Intel® HD Graphics  

6th - 13th generation Intel® Core™ processors 

Intel® Core™ Ultra (codename Meteor Lake) 

Intel® Xeon® Scalable Processors (code name Skylake)  

2nd Generation Intel® Xeon® Scalable Processors (code name Cascade Lake)  

3rd Generation Intel® Xeon® Scalable Processors (code name Cooper Lake and Ice Lake)  

4th Generation Intel® Xeon® Scalable Processors (code name Sapphire Rapids)  

Operating Systems: 

  • Ubuntu 22.04 long-term support (LTS), 64-bit (Kernel 5.15+) 

  • Ubuntu 20.04 long-term support (LTS), 64-bit (Kernel 5.15+) 

  • Ubuntu 18.04 long-term support (LTS) with limitations, 64-bit (Kernel 5.4+) 

  • Windows* 10  

  • Windows* 11  

  • macOS* 10.15 and above, 64-bit  

  • Red Hat Enterprise Linux* 8, 64-bit 

  • CentOS 7 

Intel® Processor Graphics with corresponding operating systems (GEN Graphics)  

Intel® HD Graphics  

Intel® UHD Graphics  

Intel® Iris® Pro Graphics  

Intel® Iris® Xe Graphics  

Intel® Iris® Xe Max Graphics  

Intel® Arc ™ GPU Series  

Intel® Data Center GPU Flex Series   

Operating Systems: 

  • Ubuntu* 22.04 long-term support (LTS), 64-bit 

  • Ubuntu* 20.04 long-term support (LTS), 64-bit 

  • Windows* 10, 64-bit  

  • Windows* 11, 64-bit 

  • Red Hat Enterprise Linux* 8, 64-bit 

NOTES: 

  • This installation requires drivers that are not included in the Intel® Distribution of OpenVINO™ toolkit package.  

  • A chipset that supports processor graphics is required for Intel® Xeon® processors. Processor graphics are not included in all processors. See  Product Specifications for information about your processor.  

  • Although this release works with Ubuntu 20.04 for discrete graphic cards, Ubuntu 20.04 is not POR for discrete graphics drivers, so OpenVINO support is limited.  

  • The following minimum (i.e., used for old hardware) OpenCL™ driver versions were used during OpenVINO internal validation: 22.43 for Ubuntu* 22.04, 21.48 for Ubuntu* 20.04, and 21.49 for Red Hat Enterprise Linux* 8.  

Intel® Gaussian & Neural Accelerator  

Operating Systems: 

  • Ubuntu* 22.04 long-term support (LTS), 64-bit 

  • Ubuntu* 20.04 long-term support (LTS), 64-bit 

  • Windows* 10, 64-bit  

  • Windows* 11, 64-bit 

Operating system and developer environment requirements: 

  • Linux* OS  
    • Ubuntu 22.04 with Linux kernel 5.15+ 

    • Ubuntu 20.04 with Linux kernel 5.15+ 

    • RHEL 8 with Linux kernel 5.4 

    • A Linux OS build environment needs these components:  

    • Higher kernel versions might be required for 10th Gen Intel® Core™ Processors, 11th Gen Intel® Core™ Processors, 11th Gen Intel® Core™ S-Series Processors, 12th Gen Intel® Core™ Processors, 13th Gen Intel® Core™ Processors, Intel® Core™ Ultra Processors, or 4th Gen Intel® Xeon® Scalable Processors to support CPU, GPU, GNA, or hybrid-core CPU capabilities. 

  • Windows* 10 and 11

  • macOS* 10.15 and above  

  • DL frameworks versions: 

    • TensorFlow* 1.15, 2.12 

    • MxNet* 1.9 

    • ONNX* 1.14 

    • PaddlePaddle* 2.4 

    • Note: This package can be installed on other versions of the DL frameworks, but only the versions specified above are fully validated. 

NOTE: OpenVINO Python binaries and binaries for Windows, CentOS 7, and macOS (x86) are built with oneTBB libraries, while binaries for Ubuntu and Red Hat OS systems are built with the legacy TBB released by the OS distribution. OpenVINO can be built from source with either oneTBB or legacy TBB on all of the above OS systems. System compatibility and performance were improved on hybrid CPUs such as 12th Gen Intel Core and newer. 

Included in This Release 

The Intel® Distribution of OpenVINO™ toolkit is available for download for three operating systems: Windows*, Linux*, and macOS*. All components listed below are included in the Windows, Linux, and macOS packages. 

Component: OpenVINO (Inference Engine) C++ Runtime – unified API to integrate the inference with application logic – and OpenVINO (Inference Engine) Headers 
License: Dual licensing: Intel® OpenVINO™ Distribution License (Version May 2021) and Apache 2.0 
Location: <install_root>/runtime/*, <install_root>/runtime/include/* 

Component: OpenVINO (Inference Engine) Python API 
License: Apache 2.0 
Location: <install_root>/python/* 

Component: OpenVINO (Inference Engine) Samples – samples that illustrate OpenVINO C++/Python API usage 
License: Apache 2.0 
Location: <install_root>/samples/* 

Component: Deployment Manager – a Python* command-line tool that creates a deployment package by assembling the model, IR files, your application, and associated dependencies into a runtime package for your target device 
License: Apache 2.0 
Location: <install_root>/tools/deployment_manager/* 

Helpful Links 

NOTE: Links open in a new window. 

Home Page 

Featured Documentation 

All Documentation, Guides, and Resources 

Community Forum 

Legal Information  

You may not use or facilitate the use of this document in connection with any infringement or other legal analysis concerning Intel products described herein. You agree to grant Intel a non-exclusive, royalty-free license to any patent claim thereafter drafted which includes subject matter disclosed herein.  

No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document.  

All information provided here is subject to change without notice. Contact your Intel representative to obtain the latest Intel product specifications and roadmaps.  

The products described may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request.  

Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Learn more at http://www.intel.com/ or from the OEM or retailer.  

No computer system can be absolutely secure.  

Intel, Atom, Arria, Core, Movidius, Xeon, OpenVINO, and the Intel logo are trademarks of Intel Corporation in the U.S. and/or other countries.  

OpenCL and the OpenCL logo are trademarks of Apple Inc. used by permission by Khronos  

*Other names and brands may be claimed as the property of others.  

Copyright © 2023, Intel Corporation. All rights reserved.  

For more complete information about compiler optimizations, see our Optimization Notice.