Accelerate Deep Learning with Intel® Extension for TensorFlow*
Overview
Intel and Google* have been collaborating to deliver optimized machine learning implementations of compute-intensive TensorFlow* operations, such as convolution filters that require large matrix multiplications.
In this session, Penporn Koanantakool of Google delivers an overview of the Intel and Google collaboration, which includes the Intel® Extension for TensorFlow* and other key AI developer tools: Intel® oneAPI Deep Neural Network Library (oneDNN) and Intel® Neural Compressor.
This session covers:
- Optimizations that have been implemented, such as operation fusion, primitive caching, and vectorized support for the int8 and bfloat16 data types.
- A live demonstration of the Intel Neural Compressor automatically quantizing a network to improve performance by 4x with a 0.06% accuracy loss.
- An overview of the PluggableDevice mechanism in TensorFlow, co-architected by Intel and Google to deliver a scalable way for developers to add new device support as plug-in packages.
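As a rough illustration of what the PluggableDevice mechanism looks like from user code, the sketch below assumes the Intel plug-in is installed from pip and registers its devices under the "XPU" device type; the package name and device type can vary by version, so treat both as placeholders.

```python
# After installing a PluggableDevice plug-in package, for example:
#   pip install tensorflow intel-extension-for-tensorflow[cpu]
# the plug-in's devices are discovered automatically when TensorFlow is imported.
import tensorflow as tf

# Built-in and plug-in devices are listed through the same API.
print(tf.config.list_physical_devices())

# "XPU" is the device type the Intel plug-in is assumed to register here;
# check the plug-in's documentation for the exact name in your version.
if tf.config.list_physical_devices("XPU"):
    with tf.device("/XPU:0"):
        a = tf.random.normal([1024, 1024])
        b = tf.random.normal([1024, 1024])
        c = tf.matmul(a, b)
    print(c.device)
```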
Note: This presentation was current as of TensorFlow v2.8. Starting with TensorFlow v2.9, the oneDNN optimizations are on by default and no longer require setting the TF_ENABLE_ONEDNN_OPTS=1 environment variable.
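For TensorFlow v2.8, the optimizations can be turned on by setting that environment variable before TensorFlow is imported. A minimal sketch (the convolution is just an arbitrary compute-intensive op):

```python
import os

# TensorFlow v2.8: opt in to the oneDNN optimizations before importing TensorFlow.
# On TensorFlow v2.9 and later this is the default and the variable is not needed.
os.environ["TF_ENABLE_ONEDNN_OPTS"] = "1"

import tensorflow as tf

print(tf.__version__)

# Compute-intensive ops such as this convolution now run through the
# oneDNN-optimized CPU kernels where available.
x = tf.random.normal([1, 224, 224, 3])
conv = tf.keras.layers.Conv2D(filters=64, kernel_size=3)
print(conv(x).shape)
```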
Featured Software
Get all of the following as stand-alone products or as part of AI Tools, which accelerates data science and AI pipelines, from preprocessing through machine learning, and provides interoperability for efficient model development:
- oneDNN: An open source, cross-platform library that improves deep learning application and framework performance on CPUs and GPUs with highly optimized implementations of deep learning building blocks that share the same API.
- Intel Extension for TensorFlow: A plugin, built on TensorFlow's PluggableDevice interface, that extends the open source, end-to-end TensorFlow machine learning platform with optimizations for Intel hardware.
- Intel Neural Compressor: An open source Python* library that provides a unified, low-precision inference interface across multiple deep learning frameworks and automates popular model compression techniques to speed up inference without sacrificing accuracy.
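As a sketch of the kind of post-training quantization shown in the session demo, the snippet below uses Intel Neural Compressor's Python API in its 2.x form; the exact entry points have changed across releases, and the model path and calibration data loader are placeholders to adapt to your own setup.

```python
# Sketch: post-training int8 quantization of a TensorFlow SavedModel with
# Intel Neural Compressor. Uses the 2.x-style API; names differ across releases.
from neural_compressor.config import PostTrainingQuantConfig
from neural_compressor.quantization import fit

fp32_model_path = "./fp32_saved_model"  # hypothetical path to a float32 SavedModel
calib_dataloader = ...                  # placeholder: a loader yielding (input, label) batches

config = PostTrainingQuantConfig(approach="static")

# fit() calibrates on the provided data, quantizes the graph to int8, and
# returns a quantized model object that can be saved for deployment.
q_model = fit(model=fp32_model_path, conf=config, calib_dataloader=calib_dataloader)
q_model.save("./int8_saved_model")
```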