TensorFlow* Optimizations from Intel
Production Performance for AI and Machine Learning
Accelerate TensorFlow Training and Inference on Intel® Hardware
TensorFlow* is an open source AI and machine learning platform used widely for production AI development and deployment. These applications often require deep neural networks and extremely large datasets, which can create compute bottlenecks.
Intel releases its newest optimizations and features in Intel® Extension for TensorFlow* before upstreaming them into open source TensorFlow.
With a few lines of code (see the sketch below), you can extend TensorFlow to:
- Take advantage of the most up-to-date Intel software and hardware optimizations for TensorFlow.
- Speed up TensorFlow-based training and inference turnaround times on Intel hardware.
- Further accelerate performance on Intel CPU and GPU hardware.
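After installing the extension, a quick check that the plug-in was discovered is to list TensorFlow's registered devices. A minimal sketch; the "XPU" device type is what the extension registers for Intel GPUs according to its documentation, so verify against your installed release:

```python
# Minimal sketch: confirm the Intel Extension for TensorFlow plug-in
# was discovered. On Intel GPUs the extension registers an "XPU"
# device type (per its documentation; verify for your release).
import tensorflow as tf

print(tf.__version__)
print(tf.config.list_physical_devices())  # expect an XPU entry on Intel GPUs
```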
Intel also works closely with the open source TensorFlow project to optimize the TensorFlow framework for Intel hardware. These optimizations for TensorFlow, along with the extension, are part of the end-to-end suite of Intel® AI and machine learning development tools and resources.
Download the AI Tools
TensorFlow and Intel Extension for TensorFlow are available in the AI Tools Selector, which provides accelerated machine learning and data analytics pipelines with optimized deep learning frameworks and high-performing Python* libraries.
Download the Stand-Alone Version
Stand-alone versions of TensorFlow and Intel Extension for TensorFlow are available. You can install them using a package manager or build them from source.
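A hedged install-and-verify sketch follows; the pip package extras vary by release and hardware, so treat them as assumptions and check the installation guide:

```python
# Hedged sketch. The pip specs below are assumptions based on the
# extension's install guide and vary by release and hardware:
#   pip install tensorflow
#   pip install --upgrade intel-extension-for-tensorflow[cpu]  # or [xpu]
import tensorflow as tf

print("TensorFlow:", tf.__version__)  # confirm the install succeeded
```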
Develop in the Cloud
Build and optimize oneAPI multiarchitecture applications using the latest Intel-optimized oneAPI and AI tools, and test your workloads across Intel® CPUs and GPUs. No hardware installations, software downloads, or configuration necessary.
Features
Open Source TensorFlow Powered by Optimizations from Intel
- Accelerate AI performance with Intel® oneAPI Deep Neural Network Library (oneDNN) features such as graph optimizations and memory pool allocation.
- Automatically use Intel® Deep Learning Boost instruction set features to parallelize and accelerate AI workloads.
- Reduce inference latency for models deployed using TensorFlow Serving.
- Starting with TensorFlow 2.9, take advantage of oneDNN optimizations automatically.
- In TensorFlow 2.5 through 2.8, enable the optimizations by setting the environment variable TF_ENABLE_ONEDNN_OPTS=1 (see the sketch after this list).
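Because TensorFlow reads the flag at import time, set it before the import. A minimal sketch:

```python
# Sketch for TensorFlow 2.5 through 2.8: set the flag before importing
# TensorFlow, or it has no effect. TensorFlow 2.9 and later enable the
# oneDNN optimizations by default.
import os
os.environ["TF_ENABLE_ONEDNN_OPTS"] = "1"

import tensorflow as tf

print(tf.__version__)
```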
Intel Extension for TensorFlow
- Plug into TensorFlow 2.10 or later to accelerate training and inference on Intel GPU hardware with no code changes.
- Automatically mix precision using bfloat16 or float16 data types to reduce memory footprint and improve performance (sketched after this list).
- Use TensorFloat-32 (TF32) math mode on Intel GPU hardware.
- Optimize CPU performance settings for latency or throughput using an autotuned CPU launcher.
- Perform more aggressive fusion through the oneDNN Graph API.
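The auto mixed precision noted above is typically switched on through environment variables. A minimal sketch, with the caveat that the variable names below follow the Intel Extension for TensorFlow documentation and should be verified against your installed release:

```python
# Hedged sketch: enable the extension's auto mixed precision with
# bfloat16. The variable names are taken from the extension's
# documentation; verify them for your release, and set them before
# importing TensorFlow.
import os
os.environ["ITEX_AUTO_MIXED_PRECISION"] = "1"
os.environ["ITEX_AUTO_MIXED_PRECISION_DATA_TYPE"] = "BFLOAT16"  # or "FLOAT16"

import tensorflow as tf  # the extension loads automatically as a plug-in

# Standard Keras/TensorFlow code runs unchanged from here; eligible ops
# are rewritten to the lower-precision type by the graph optimizer.
```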
Optimized Deployment with OpenVINO™ Toolkit
- Import your TensorFlow model into OpenVINO™ Runtime and use the Neural Network Compression Framework (NNCF) to compress model size and increase inference speed (see the sketch after this list).
- Deploy with OpenVINO model server for optimized inference, accessed via the same API as TensorFlow Serving.
- Target a mix of Intel CPUs, GPUs (integrated or discrete), VPUs, or FPGAs.
- Deploy on-premise and on-device, in the browser, or in the cloud.
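The import step is brief. A minimal sketch, assuming the openvino Python package (2023.1 or later) and a hypothetical SavedModel directory named my_saved_model:

```python
# Minimal sketch, assuming openvino 2023.1+; "my_saved_model" is a
# hypothetical TensorFlow SavedModel directory.
import numpy as np
import openvino as ov

core = ov.Core()
ov_model = ov.convert_model("my_saved_model")    # import the TF model
compiled = core.compile_model(ov_model, "CPU")   # or "GPU"

# Run inference on dummy input shaped like the model's expected input.
dummy = np.zeros((1, 224, 224, 3), dtype=np.float32)
result = compiled(dummy)
```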
Benchmarks
Documentation & Code Samples
- TensorFlow Documentation
- Intel Extension for TensorFlow
- Get Started with TensorFlow in Docker* Containers
Code Samples
- Get Started with TensorFlow
- Optimize a Pretrained Model for Inference
- Analyze TensorFlow Performance
- Train a BERT Model for Text Classification
- Speed Up Inception v4 Inference with Advanced Automatic Mixed Precision
- Quantize Inception v3 with Intel Extension for TensorFlow on Intel® Xeon® Processors
- Perform Stable Diffusion Inference on Intel GPUs
- Accelerate ResNet-50 Training with XPUAutoShard on Intel GPUs
More Intel Extension for TensorFlow Samples
Demonstrations
Use Default Optimizations from Intel for TensorFlow
Use TensorFlow to apply transfer learning to an image classification model. Then see a demo highlighting improved AI performance with the new Intel® Advanced Matrix Extensions (Intel® AMX) instructions on 4th generation Intel Xeon Scalable processors.
Accelerate TensorFlow with oneDNN
See the latest from the collaboration efforts between Google* and Intel to accelerate TensorFlow performance. This collaboration includes support for features such as int8 and bfloat16 vector and matrix extensions.
Get Better TensorFlow Performance on CPUs and GPUs
Learn about Intel Extension for TensorFlow, including the built-in optimizations and how to get started. Analyze performance bottlenecks by examining GPU kernel and data type usage profiles.
Improve TensorFlow Performance on AWS* Instances
Review inference benchmark results for several popular TensorFlow models (with and without oneDNN optimizations) on Amazon Web Services (AWS)* C6i instance types powered by 3rd generation Intel Xeon Scalable processors.
How to Accelerate TensorFlow on Intel Hardware
Accelerate deep learning inference by applying default optimizations in TensorFlow for Intel hardware and quantizing to int8.
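The video's exact workflow is not reproduced here, but as a generic stand-in illustration of post-training int8 quantization in stock TensorFlow, here is a TensorFlow Lite sketch (the SavedModel path and input shape are placeholders):

```python
# Generic post-training int8 quantization sketch using TensorFlow Lite,
# a stand-in illustration rather than the video's exact workflow. The
# SavedModel path and input shape are placeholders.
import numpy as np
import tensorflow as tf

def representative_data():
    # Yield a handful of calibration samples shaped like the model input.
    for _ in range(100):
        yield [np.random.rand(1, 224, 224, 3).astype(np.float32)]

converter = tf.lite.TFLiteConverter.from_saved_model("my_saved_model")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data
tflite_int8_model = converter.convert()  # quantized flatbuffer bytes
```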
News
Accelerate TensorFlow on Intel® Data Center GPU Flex Series
Google* and Intel coarchitected PluggableDevice, a mechanism that lets hardware vendors add device support by using plug-in packages that can be installed alongside TensorFlow. Intel Extension for TensorFlow is the newest PluggableDevice.
oneDNN AI Optimizations Enabled by Default in TensorFlow
Intel and Google teamed up to make this library the default CPU optimization back end for TensorFlow 2.9.
Meituan* Optimizes TensorFlow
China's leading e-commerce platform for lifestyle services boosted distributed scalability more than tenfold in its recommendation system scenarios.
Specifications
Processor:
- Intel Xeon Scalable processor
- Intel® Core™ processor
- Intel GPU
Operating systems:
- Linux*
- Windows*
Languages:
- Python
- C++
Deploy TensorFlow models to a variety of devices and operating systems with Intel® Distribution of OpenVINO™ Toolkit.
Get Help
Your success is our success. Access these support resources when you need assistance.

Stay Up to Date on AI Workload Optimizations
Sign up to receive hand-curated technical articles, tutorials, developer tools, training opportunities, and more to help you accelerate and optimize your end-to-end AI and data science workflows.
Take a chance and subscribe. You can change your mind at any time.