Intel® Optimization for PyTorch*
Speed Up AI from Research to Production Deployment
Maximize PyTorch* Performance on Intel® Hardware
PyTorch* is an AI and machine learning framework popular for both research and production usage. This open source library is often used for deep learning applications whose compute-intensive training and inference test the limits of available hardware resources.
Speed up model development and deployment performance on Intel hardware with software optimizations built into open source PyTorch.
With a few lines of code, Intel® Extension for PyTorch* enables the most up-to-date Intel software and hardware optimizations for AI.
Using this framework with Intel optimizations, you can:
- Develop, train, and deploy AI models using a Python* or C++ API.
- Automatically accelerate PyTorch-based training and inference performance on Intel hardware.
- Extend PyTorch to further accelerate performance on Intel hardware with minimal code changes (see the sketch below).
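To make the "few lines of code" claim concrete, here is a minimal sketch built around the extension's documented `ipex.optimize` entry point. The `torch.nn.Linear` module is only a stand-in for any model you would actually train or deploy:

```python
import torch
import intel_extension_for_pytorch as ipex

model = torch.nn.Linear(128, 10)          # stand-in for any torch.nn.Module
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# Training: one extra call applies the extension's CPU optimizations to
# both the model and the optimizer.
model.train()
model, optimizer = ipex.optimize(model, optimizer=optimizer)

# Inference: after switching to eval mode, optimize the model alone.
model.eval()
model = ipex.optimize(model)
with torch.no_grad():
    output = model(torch.randn(1, 128))
```

The same pattern extends to lower precision by passing `dtype=torch.bfloat16` to `ipex.optimize`.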
Download as Part of the Toolkit
PyTorch and Intel Extension for PyTorch are available in the Intel® AI Analytics Toolkit, which provides accelerated machine learning and data analytics pipelines with optimized deep learning frameworks and high-performing Python libraries.
Develop in the Free Intel® Developer Cloud
Get what you need to build and optimize your oneAPI projects for free. With an Intel® Developer Cloud account, you get 120 days of access to the latest Intel® hardware—CPUs, GPUs, FPGAs—and Intel® oneAPI tools and frameworks. No software downloads. No configuration steps. No installations.
Download the Stand-Alone Versions
Stand-alone versions of PyTorch and Intel Extension for PyTorch are available. You can install them using a package manager or build them from source.
Help Intel Extension for PyTorch Evolve
This open source component has an active developer community. We welcome you to participate.
Open Source Version (GitHub*)
Features
PyTorch Machine Learning Framework
- Create, train, and deploy deep learning models using a Python or C++ API.
- Transition from interactive development in eager mode to fast batch runtimes with graph mode (see the sketch after this list).
- Speed up model development with built-in support for distributed training on a variety of platforms.
- Deploy PyTorch models to production servers with TorchServe.
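As an illustration of the eager-to-graph transition, here is a minimal sketch using TorchScript tracing; the torchvision ResNet-50 is only a stand-in for any eager-mode module:

```python
import torch
import torchvision.models as models  # stand-in model source; any nn.Module works

model = models.resnet50(weights=None).eval()
example = torch.randn(1, 3, 224, 224)

# Trace the eager-mode model into a TorchScript graph, then freeze it so
# graph-level optimizations can be applied ahead of batch inference.
with torch.no_grad():
    graph_model = torch.jit.trace(model, example)
    graph_model = torch.jit.freeze(graph_model)
    output = graph_model(example)
```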
Intel® Optimizations
- Accelerate PyTorch model performance with Intel® oneAPI Deep Neural Network Library features such as graph and node optimizations.
- Automatically use Intel® Deep Learning Boost instruction set features to parallelize and accelerate PyTorch workloads.
- Reduce inference latency for models deployed with TorchServe.
- Perform distributed training with oneAPI Collective Communications Library (oneCCL) Bindings for PyTorch* (see the sketch below).
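A minimal sketch of switching distributed training to the oneCCL back end, assuming the oneCCL bindings package is installed; the rendezvous environment values are illustrative single-process defaults that a launcher such as `torchrun` or `mpirun` would normally supply:

```python
import os
import torch
import torch.distributed as dist
import oneccl_bindings_for_pytorch  # noqa: F401  (importing registers the "ccl" backend)
from torch.nn.parallel import DistributedDataParallel as DDP

# Illustrative single-process rendezvous settings; a launcher would
# normally provide these for each rank.
os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29500")
os.environ.setdefault("RANK", "0")
os.environ.setdefault("WORLD_SIZE", "1")

dist.init_process_group(backend="ccl")  # collectives now run through oneCCL
model = DDP(torch.nn.Linear(16, 4))     # wrap any module for distributed training
```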
Intel® Extension for PyTorch* Optimizations and Features
- Apply the newest performance optimizations not yet in PyTorch using Python API commands.
- Vectorize operations to take advantage of larger register sizes in Intel® Advanced Vector Extensions 2, Intel® Advanced Vector Extensions 512, and Intel® Advanced Matrix Extensions instruction sets.
- Parallelize operations without having to analyze task dependencies.
- Further improve vectorization with lower-precision data types such as bfloat16 (BF16) and INT8.
- Use built-in recipes to gain quantization efficiency with minimal accuracy loss (see the sketch after this list).
- Fuse common FP32 and BF16 graph operations such as Conv2D+ReLU or Linear+ReLU.
- Fold mathematical graph operations, such as batch normalization, into a preceding convolution.
- Control aspects of the thread runtime such as multistream inference and asynchronous task spawning.
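A minimal sketch of the extension's built-in post-training static quantization recipe. The model and the random calibration tensors are hypothetical stand-ins, and API names such as `ipex.quantization.default_static_qconfig` reflect recent extension releases, so verify them against the documentation for your version:

```python
import torch
import intel_extension_for_pytorch as ipex
from intel_extension_for_pytorch.quantization import prepare, convert

model = torch.nn.Sequential(              # stand-in for a real FP32 model
    torch.nn.Conv2d(3, 8, kernel_size=3),
    torch.nn.ReLU(),
).eval()
example_input = torch.randn(1, 3, 224, 224)

# Built-in recipe: default static (calibration-based) quantization config.
qconfig = ipex.quantization.default_static_qconfig
prepared = prepare(model, qconfig, example_inputs=example_input)

# Calibrate with representative data (random tensors stand in here).
with torch.no_grad():
    for _ in range(4):
        prepared(torch.randn(1, 3, 224, 224))

# Convert to INT8, then trace and freeze the graph for deployed inference.
quantized = convert(prepared)
with torch.no_grad():
    traced = torch.jit.trace(quantized, example_input)
    traced = torch.jit.freeze(traced)
```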
Benchmarks
Documentation & Code Samples
Documentation
- PyTorch Documentation
- PyTorch Performance Tuning Guide
- Intel Extension for PyTorch
- Intel® oneAPI Collective Communications Library (oneCCL) Bindings for PyTorch
Demonstrations
Achieve Up to 1.77x Boost Ratio for Your AI Workloads
Learn how stock PyTorch differs from Intel Extension for PyTorch, followed by an in-depth explanation of the key techniques that power the extension.
Increase PyTorch Inference Throughput by 4x
See how to accelerate PyTorch-based inferencing by applying optimizations from the Intel Extension for PyTorch and quantizing to INT8.
Accelerate MedMNIST Training and Inference with Intel Extension for PyTorch
See how to use Intel Extension for PyTorch for training and inference on the MedMNIST datasets, with a comparison against stock PyTorch that shows the performance gain the extension offers.
Speed Training 8x Using PyTorch with a oneCCL Back End
Compare the performance of distributed training of the deep learning recommendation model (DLRM) using oneCCL and other leading back ends.
Case Studies
AI-Based Customer Service Automation—Conversations in the Cloud
MindTitan* and Intel worked together to optimize MindTitan's TitanCS solution using Intel Extension for PyTorch, improving inference performance on Intel® CPUs and enabling better real-time call analysis.
KT Optimizes Performance for Personalized Text-to-Speech
Technologists from KT (formerly Korea Telecom) and Intel worked together to optimize performance of the company's P-TTS service. The optimized CPU-based solution improved real-time factor (RTF) performance by 22 percent while maintaining voice quality and the number of connections.
News
Accelerate Deep Learning with Intel Extension for PyTorch
See how to use Intel Extension for PyTorch to take advantage of optimizations before they become part of a stock PyTorch release. Apply the newest developments to optimize your PyTorch models running on Intel® hardware.
Accelerate bfloat16 PyTorch Models
Get an introduction to Intel Extension for PyTorch and Intel® oneAPI Deep Neural Network Library (oneDNN) with a close look into the technology behind the Intel Extension for PyTorch API and graph fusion optimizations.
Intel and Facebook* Accelerate PyTorch Performance
Facebook* and Intel collaborated to improve PyTorch performance on 3rd generation Intel® Xeon® Scalable processors by harnessing the new bfloat16 capability in Intel® Deep Learning Boost, delivering training and inference performance boosts for a variety of model and data types.
Intel and Facebook Collaborate to Boost PyTorch CPU Performance
Learn how Intel software optimizations accelerate PyTorch on Intel CPU hardware.
Specifications
Processor:
- Intel Xeon Scalable processor
Operating systems:
- Linux* (Intel Extension for PyTorch is for Linux only)
- Windows*
Languages:
- Python
- C++
Get Help
Your success is our success. Access these support resources when you need assistance.
Stay in the Know with All Things CODE
Sign up to receive the latest trends, tutorials, tools, training, and more to help you write better code optimized for CPUs, GPUs, FPGAs, and other accelerators—stand-alone or in any combination.