Goodbye, Slow Inference Workloads. Hello, Improved Quantization Techniques.

@IntelDevTools

Deploying deep learning at the edge for real-time inference can significantly reduce the cost of communicating with the cloud in terms of network bandwidth, latency, and power consumption.

But there’s a flip side: edge devices have limited memory, compute, and power. As a result, traditional 32-bit floating-point precision is often too computationally heavy for embedded deep-learning inference workloads.

The Intel® Distribution of OpenVINO™ toolkit offers a solution via int8 quantization—deep learning inference with 8-bit multipliers.
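To make the idea concrete, here is a minimal NumPy sketch of the affine (asymmetric) quantization scheme that int8 inference builds on. The helper names and the toy tensor are illustrative, not the toolkit's API; in a real deployment the scale and zero point come from a calibration dataset.

```python
import numpy as np

rng = np.random.default_rng(0)

def quantize_int8(x, scale, zero_point):
    """Affine (asymmetric) quantization: float32 -> int8."""
    q = np.round(x / scale) + zero_point
    return np.clip(q, -128, 127).astype(np.int8)

def dequantize_int8(q, scale, zero_point):
    """Approximate reconstruction: int8 -> float32."""
    return (q.astype(np.float32) - zero_point) * scale

# Toy activation tensor; in practice scale and zero point come from calibration.
x = rng.uniform(-1.0, 3.0, size=8).astype(np.float32)
scale = (x.max() - x.min()) / 255.0                 # spread the observed range over 256 levels
zero_point = int(np.round(-128 - x.min() / scale))  # align x.min() with -128

q = quantize_int8(x, scale, zero_point)
print(x)
print(dequantize_int8(q, scale, zero_point))  # close to x, at a quarter of the storage
```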

Join deep-learning expert Alex Kozlov for a closer look at achieving better performance with less overhead on Intel® CPUs, GPUs, and VPUs using the latest int8 calibration tool and runtime in the OpenVINO toolkit. He covers:

  • New features such as asymmetric quantization, bias correction, and weight equalization that improve the accuracy of low-precision inference workloads (bias correction is sketched after this list)
  • How to make the best use of the enhanced capabilities in the OpenVINO toolkit for your AI applications
  • Using int8 to accelerate computation, save memory bandwidth and power, and improve cache locality
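To give a flavor of one of these features, the sketch below shows the idea behind bias correction for a single fully connected layer, again in plain NumPy rather than the toolkit's calibration tool: quantizing the weights introduces a systematic error (W - W_q), and folding that error, applied to the mean calibration input, back into the bias keeps the layer's output mean unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy fully connected layer: y = W @ x + b (shapes are illustrative).
W = rng.normal(size=(8, 16)).astype(np.float32)
b = rng.normal(size=(8,)).astype(np.float32)

# Symmetric per-tensor int8 quantization of the weights.
scale = np.abs(W).max() / 127.0
W_q = np.clip(np.round(W / scale), -127, 127) * scale   # quantize, then dequantize

# Mean input estimated from a small calibration set.
calib = rng.normal(size=(256, 16)).astype(np.float32)
x_mean = calib.mean(axis=0)

# Bias correction: fold the expected quantization error into the bias so that
# E[W_q @ x + b_corr] matches E[W @ x + b] over the calibration data.
b_corr = b + (W - W_q) @ x_mean

orig = calib @ W.T + b
quant = calib @ W_q.T + b_corr
print(np.abs(orig.mean(axis=0) - quant.mean(axis=0)).max())  # ~0: output means preserved
```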

Get the Software

Download the latest version of the Intel® Distribution of OpenVINO™ toolkit so you can follow along during the webinar.

Alexander Kozlov

Machine-learning and deep-learning R&D engineer, Intel Corporation

Alexander has expertise in deep-learning object detection architectures, human action recognition approaches, and neural network compression techniques. Before Intel, he was a senior software engineer and researcher at Itseez* (since acquired by Intel), where he worked on computer-vision algorithms for advanced driver-assistance systems (ADAS). He now focuses on neural network compression methods and tools that produce more lightweight, hardware-friendly models. Alexander holds a master’s degree from the University of Nizhny Novgorod.

Intel® Distribution of OpenVINO™ Toolkit

Deploy deep learning inference with unified programming models and broad support for trained neural networks from popular deep learning frameworks.

Get It Now

See All Tools