Intel® Deep Learning Boost (Intel® DL Boost)

The second generation of Intel® Xeon® Scalable processors introduced a collection of features for deep learning, packaged together as Intel® Deep Learning Boost. These features include the Vector Neural Network Instructions (VNNI), which increase throughput for inference applications by supporting INT8 convolutions, fusing multiple machine instructions from previous processor generations into a single instruction.
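As a rough illustration, the core VNNI instruction (VPDPBUSD) multiplies four unsigned 8-bit values by four signed 8-bit values and accumulates the products into a signed 32-bit sum in one step, where earlier AVX-512 generations needed a three-instruction sequence (VPMADDUBSW, VPMADDWD, VPADDD). The following is a pure-Python sketch of what one 32-bit lane computes, not actual hardware intrinsics; the function name is illustrative.

```python
def vpdpbusd_lane(acc, a_bytes, b_bytes):
    """Emulate one 32-bit lane of the VNNI VPDPBUSD dot product.

    acc:     existing signed 32-bit accumulator value
    a_bytes: four unsigned 8-bit activations (0..255)
    b_bytes: four signed 8-bit weights (-128..127)
    """
    for a, b in zip(a_bytes, b_bytes):
        acc += a * b  # multiply u8 x s8, accumulate into s32
    return acc

# Accumulate a 4-element INT8 dot product into an existing partial sum:
result = vpdpbusd_lane(10, [1, 2, 3, 4], [5, -6, 7, -8])
# result == 10 + (5 - 12 + 21 - 32) == -8
```

A 512-bit register holds sixteen such lanes, so one instruction performs sixteen of these 4-element dot products in parallel.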

First MLPerf Inference Results

Technical Description of VNNI

Frameworks and Tools

The following frameworks and tools support Intel DL Boost on second- and third-generation Intel Xeon Scalable processors.

Model Quantization

Most deep learning models are built with 32-bit floating-point precision (FP32). Quantization is the process of representing a model with lower-precision numbers, reducing its memory footprint with minimal accuracy loss. In this context, the main focus is INT8 representation.
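A minimal sketch of the idea, assuming symmetric per-tensor quantization (one of several common schemes; function names are illustrative): each FP32 value is mapped to an 8-bit integer via a single scale factor, and dequantized back by multiplying with that scale.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization of FP32 values to INT8."""
    scale = max(abs(v) for v in values) / 127.0  # largest value maps to +/-127
    q = [max(-128, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize(q, scale):
    """Recover approximate FP32 values from INT8 codes."""
    return [v * scale for v in q]

x = [0.5, -1.25, 2.0, -0.1]
q, scale = quantize_int8(x)
x_hat = dequantize(q, scale)
# x_hat approximates x; the rounding error per element is at most scale / 2.
```

The accuracy cost comes from the rounding step: a coarser scale (larger dynamic range) means larger per-element error, which is why calibration of the scale against representative data matters in practice.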