Deploying deep learning on the edge for real-time inference can significantly reduce the cost of communicating with the cloud in terms of network bandwidth, latency, and power consumption.

But edge devices also have limited memory, compute, and power. As a result, traditional 32-bit floating-point precision is often too computationally heavy for embedded deep learning inference workloads.

The Intel® Distribution of OpenVINO™ toolkit offers a solution via int8 quantization—deep learning inference with 8-bit multipliers.
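
To make the idea concrete, here is a minimal NumPy sketch of symmetric int8 quantization; it is an illustration only, not the toolkit's actual calibration code. Each float32 tensor is mapped onto 8-bit integers through a single scale factor, and the original values are recovered, up to a small rounding error, by rescaling.

    import numpy as np

    # Illustrative symmetric int8 quantization of a float32 tensor
    # (not the OpenVINO API; names are for demonstration only).
    w = np.random.randn(4, 4).astype(np.float32)
    scale = np.abs(w).max() / 127.0                    # per-tensor scale factor
    w_int8 = np.clip(np.round(w / scale), -128, 127).astype(np.int8)

    # Inference multiplies int8 values; rescaling recovers an approximation
    # of the original float32 tensor, at the cost of a small rounding error.
    w_approx = w_int8.astype(np.float32) * scale
    print("max abs error:", np.abs(w - w_approx).max())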

Join deep-learning expert Alex Kozlov for a closer look at achieving better performance with less overhead on Intel® CPUs, GPUs, and VPUs using the latest int8 calibration tool and runtime in the Intel Distribution of OpenVINO toolkit. He covers:

  • New features such as asymmetric quantization, bias correction, and weight equalization that improve the accuracy of lower-precision inference workloads (asymmetric quantization is sketched below, after this list)
  • How to make the best use of the enhanced capabilities in the Intel Distribution of OpenVINO toolkit for your AI applications
  • Using int8 to accelerate computation, save memory bandwidth and power, and provide better cache locality
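
For the first bullet, the following is a hypothetical sketch of asymmetric (affine) quantization, assuming nothing about the toolkit's internals: a zero point shifts the integer grid so that a one-sided value range, such as post-ReLU activations, uses all 256 uint8 levels instead of wasting half of them on negative values that never occur.

    import numpy as np

    # Illustrative asymmetric (affine) uint8 quantization of an activation
    # tensor; the names below are for illustration, not the OpenVINO API.
    x = np.maximum(np.random.randn(1000).astype(np.float32), 0.0)  # ReLU output
    lo, hi = float(x.min()), float(x.max())
    scale = (hi - lo) / 255.0
    zero_point = int(round(-lo / scale))               # integer mapped to 0.0
    x_u8 = np.clip(np.round(x / scale) + zero_point, 0, 255).astype(np.uint8)

    # Dequantize to check the reconstruction error.
    x_approx = (x_u8.astype(np.float32) - zero_point) * scale
    print("max abs error:", np.abs(x - x_approx).max())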

Get the Software

Download the latest version of the Intel® Distribution of OpenVINO™ toolkit so you can follow along during the webinar.


Alexander Kozlov

Machine-learning and deep-learning R&D engineer, Intel Corporation

Alexander has expertise in deep-learning object detection architectures, human action recognition approaches, and neural network compression techniques. Before joining Intel, he was a senior software engineer and researcher at Itseez* (since acquired by Intel), where he worked on computer-vision algorithms for advanced driver-assistance systems (ADAS). He now focuses on neural network compression methods and tools that produce more lightweight, hardware-friendly models. Alexander holds a master’s degree from the University of Nizhny Novgorod.


Intel® Distribution of OpenVINO™ Toolkit

Deploy deep learning inference with unified programming models and broad support for trained neural networks from popular deep learning frameworks.
