Deep-learning deployment on the edge for real-time inference can significantly reduce the cost of communicating with the cloud in terms of network bandwidth, network latency, and power consumption.
But there’s a flip side: edge devices have limited memory, compute, and power. As a result, traditional 32-bit floating-point precision is often too computationally heavy for embedded deep-learning inference workloads.
The Intel® Distribution of OpenVINO™ toolkit offers a solution via int8 quantization—deep learning inference with 8-bit multipliers.
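To make the core idea concrete, here is a minimal sketch of symmetric per-tensor int8 quantization in NumPy. This is an illustration of the general technique, not the OpenVINO calibration tool's actual implementation: float32 values are mapped onto the signed 8-bit range [-127, 127] with a single scale factor, and approximate floats are recovered by multiplying back.

```python
import numpy as np

def quantize(x, num_bits=8):
    """Symmetric per-tensor quantization: float32 -> int8 plus a scale."""
    qmax = 2 ** (num_bits - 1) - 1          # 127 for int8
    scale = np.abs(x).max() / qmax          # one scale for the whole tensor
    q = np.clip(np.round(x / scale), -qmax, qmax).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float32 values from the int8 tensor."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 4)).astype(np.float32)
q, scale = quantize(x)
x_hat = dequantize(q, scale)
# Round-to-nearest bounds the per-element error by half a quantization step.
print("max abs error:", np.abs(x - x_hat).max())
```

The payoff is that the heavy multiply-accumulate work can then run on int8 values (8-bit multipliers), with only the small per-tensor scales handled in floating point.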
Join deep-learning expert Alex Kozlov for a closer look at achieving better performance with less overhead on Intel® CPUs, GPUs, and VPUs using the latest int8 calibration tool and runtime in the OpenVINO toolkit. He covers:
- New features such as asymmetric quantization, bias correction, and weight equalization that improve the accuracy of lower-precision inference workloads
- How to make best use of enhanced capabilities in the OpenVINO toolkit for your AI applications
- Using int8 to accelerate computation performance, save memory bandwidth and power, and provide better cache locality
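Of the features above, asymmetric quantization is easy to sketch. The hedged example below (my own illustration, not OpenVINO code) adds a zero-point so the unsigned 8-bit grid can cover a skewed value range, such as non-negative ReLU activations, without wasting half the codes the way a symmetric scheme would.

```python
import numpy as np

def quantize_asym(x, num_bits=8):
    """Asymmetric quantization: float32 -> uint8 with a scale and zero-point."""
    qmin, qmax = 0, 2 ** num_bits - 1        # 0..255 for uint8
    lo, hi = float(x.min()), float(x.max())
    scale = (hi - lo) / (qmax - qmin)        # step size spanning [lo, hi]
    zero_point = int(round(qmin - lo / scale))  # integer code representing 0.0
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.uint8)
    return q, scale, zero_point

def dequantize_asym(q, scale, zero_point):
    """Recover approximate float32 values: (q - zero_point) * scale."""
    return (q.astype(np.float32) - zero_point) * scale

rng = np.random.default_rng(0)
# ReLU-like data: non-negative, so a symmetric signed grid would waste codes.
x = np.maximum(rng.standard_normal((4, 4)), 0).astype(np.float32)
q, scale, zp = quantize_asym(x)
x_hat = dequantize_asym(q, scale, zp)
print("max abs error:", np.abs(x - x_hat).max())
```

For this all-non-negative input the zero-point lands at code 0 and every one of the 256 codes maps onto the observed range; techniques like bias correction and weight equalization then compensate for the systematic errors such rounding introduces.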
Get the Software
Download the latest version of the Intel® Distribution of OpenVINO™ toolkit so you can follow along during the webinar.
- Webinar Slides
- Introducing Int8 Quantization for Fast CPU Inference Using the OpenVINO Toolkit
- Using Low-Precision, 8-bit Integer Inference
- Inference Flow with the Intel Distribution of OpenVINO Toolkit
- OpenVINO Toolkit: Example of an Int8 Full Inference Flow
Alex Kozlov, machine-learning and deep-learning R&D engineer, Intel Corporation
Alexander has expertise in deep-learning object-detection architectures, human action recognition approaches, and neural network compression techniques. Before joining Intel, he was a senior software engineer and researcher at Itseez* (since acquired by Intel), where he worked on computer-vision algorithms for advanced driver-assistance systems (ADAS). He now focuses on deep-learning neural network compression methods and tools that produce more lightweight, hardware-friendly models. Alex holds a master’s degree from the University of Nizhny Novgorod.