With advances in hardware acceleration and support for low-precision arithmetic, deep learning inference can deliver higher throughput and lower latency. However, data scientists and AI developers often need to trade off accuracy against performance. Deployment is also challenging because of the high computational complexity of quantizing models for inference. This webinar covers techniques and strategies for overcoming these challenges, such as automatic accuracy-driven tuning for post-training quantization and quantization-aware training.
Join us to learn about Intel’s new low-precision optimization tool and how it helped CERN openlab reduce inference time while maintaining the same level of accuracy on convolutional generative adversarial networks (GANs). The webinar offers insight into how to handle the strict precision constraints that are inevitable when applying low-precision computing to generative models.
AI and quantum researcher, CERN openlab
Sofia Vallecorsa is an accomplished physicist who specializes in scientific computing, with deep expertise in machine learning and deep learning architectures, frameworks, and methods for distributed training and hyperparameter optimization. Since joining CERN in 2015, she has been responsible for several projects in machine learning and deep learning, quantum computing, and quantum machine learning, and she also supervises master's and doctoral thesis students in these fields. Sofia holds a PhD in high-energy physics from the University of Geneva.
Machine learning engineer, Intel Corporation
Feng is a senior deep learning engineer on the machine learning performance team in the IAGS (Intel Architecture, Graphics and Software) group. He leads the development of the Intel® Low Precision Optimization Tool and contributes to Intel-optimized deep learning frameworks, such as TensorFlow* and PyTorch*. He has 14 years of experience in software optimization and low-level driver development on Intel® architecture platforms.