Intel® Optimization for PyTorch*
Speed Up AI from Research to Production Deployment
Maximize PyTorch Performance on Intel Hardware
PyTorch* is an AI and machine learning framework popular for both research and production usage. This open source library is often used for deep learning applications whose compute-intensive training and inference test the limits of available hardware resources.
Intel releases its newest optimizations and features in Intel® Extension for PyTorch* before upstreaming them into open source PyTorch.
With a few lines of code, you can use Intel Extension for PyTorch to:
- Take advantage of the most up-to-date Intel software and hardware optimizations for PyTorch.
- Automatically mix different precision data types to reduce the model size and computational workload for inference.
- Add your own performance customizations using APIs.
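As a rough illustration of what those few lines might look like, here is a minimal sketch that assumes PyTorch is installed. The toy model and input shape are hypothetical and chosen only for this example. The extension is applied through `ipex.optimize` only if it can be imported; otherwise the sketch falls back to stock PyTorch.

```python
import torch
import torch.nn as nn

try:
    # Optional dependency: only present where the extension is installed.
    import intel_extension_for_pytorch as ipex
except ImportError:
    ipex = None  # fall back to stock PyTorch so the sketch still runs

# Hypothetical toy model, for illustration only.
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))
model.eval()  # inference mode

if ipex is not None:
    # Apply the extension's optimizations to the model (float32 by default).
    model = ipex.optimize(model)

example_input = torch.randn(8, 16)  # batch of 8 made-up samples
with torch.no_grad():
    output = model(example_input)

print(output.shape)
```

The rest of the inference code stays the same whether or not the extension is present, which is the point of the "few lines of code" claim.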
Intel also works closely with the open source PyTorch project to optimize the PyTorch framework for Intel hardware. All of these optimizations are collectively referred to as Intel Optimization for PyTorch.
Intel Optimization for PyTorch is part of the end-to-end suite of Intel® AI and machine learning development tools and resources.
Download as Part of the Toolkit
PyTorch and Intel Extension for PyTorch are available in the Intel® AI Analytics Toolkit, which provides accelerated machine learning and data analytics pipelines with optimized deep learning frameworks and high-performing Python libraries.
Develop in the Cloud
Build and optimize oneAPI multiarchitecture applications using the latest optimized Intel® oneAPI and AI tools, and test your workloads across Intel® CPUs and GPUs. No hardware installations, software downloads, or configuration necessary. Free for 120 days with extensions possible.
Download the Stand-Alone Versions
Stand-alone versions of PyTorch and Intel Extension for PyTorch are available. You can install them using a package manager or build them from source.
Help Intel® Extension for PyTorch* Evolve
This open source component has an active developer community. We welcome you to participate.
Open Source Version (GitHub*)
Features
Open Source PyTorch Powered by Intel® Optimization
- Accelerate PyTorch training and inference with Intel® oneAPI Deep Neural Network Library (oneDNN) features such as graph and node optimizations.
- Take advantage of Intel® Deep Learning Boost, Intel® Advanced Vector Extensions (Intel® AVX-512), and Intel® Advanced Matrix Extensions (Intel® AMX) instruction set features to parallelize and accelerate PyTorch workloads.
- Speed up turnaround time on Intel hardware from interactive development to batch training and inference.
- Perform distributed training with oneAPI Collective Communications Library (oneCCL) bindings for PyTorch.
- Reduce inference latency for models deployed to production servers with TorchServe.
Intel Extension for PyTorch Optimizations and Features
- Apply the newest performance optimizations not yet in PyTorch using Python API commands.
- Parallelize operations without having to analyze task dependencies.
- Automatically mix operator data type precision between float32 and bfloat16 to reduce computational workload and model size.
- Fuse and optimize frequently used convolution operations.
- Convert to a channels-last memory format for faster image-based deep learning performance.
- Control aspects of the thread runtime such as multistream inference and asynchronous task spawning.
- Run PyTorch on Intel GPU hardware.
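To make the float32 and bfloat16 trade-off mentioned above concrete, the following sketch uses only the Python standard library. It is not how the extension performs mixed precision. It only illustrates the number format: bfloat16 keeps the sign bit and 8 exponent bits of float32 but only 7 mantissa bits, so each value takes 2 bytes instead of 4 at the cost of precision. The helper names are hypothetical, and the sketch assumes finite inputs (NaN handling is omitted).

```python
import struct


def float32_to_bfloat16_bits(value: float) -> int:
    """Return the 16-bit bfloat16 pattern for a finite value (round to nearest even)."""
    (bits,) = struct.unpack("<I", struct.pack("<f", value))
    # Add a bias so that dropping the low 16 bits rounds to nearest, ties to even.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + rounding_bias) >> 16) & 0xFFFF


def bfloat16_bits_to_float(bits16: int) -> float:
    """Expand a bfloat16 bit pattern back to a Python float."""
    (value,) = struct.unpack("<f", struct.pack("<I", bits16 << 16))
    return value


def to_bfloat16(value: float) -> float:
    """Round-trip a value through bfloat16 to see what precision survives."""
    return bfloat16_bits_to_float(float32_to_bfloat16_bits(value))


# Storage per value: 4 bytes for float32, 2 bytes for bfloat16.
float32_bytes = struct.calcsize("<f")
bfloat16_bytes = struct.calcsize("<H")

print(to_bfloat16(1.0 + 2**-7))  # still distinguishable from 1.0
print(to_bfloat16(1.0 + 2**-9))  # too small a step for bfloat16, rounds to 1.0
print(float32_bytes, bfloat16_bytes)
```

Because the exponent width is unchanged, bfloat16 covers roughly the same range as float32. That is why operators that tolerate reduced precision can use it while others stay in float32.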
Documentation & Code Samples
- PyTorch Documentation
- PyTorch Performance Tuning Guide
- Intel Extension for PyTorch
- TorchServe with Intel Extension for PyTorch
- oneCCL Bindings for PyTorch
Intel Extension for PyTorch Code Samples
- Single-Instance Training
- bfloat16 Inference—Imperative Mode
- bfloat16 Inference—TorchScript Mode
- int8 Deployment—Graph Mode
- C++ Dynamic Library
- GPU Single-Instance Training
- GPU Inference
Training & Tutorials
Visual Quality Inspection for the Pharmaceutical Industry
Get Started with Intel Extension for PyTorch
Optimize PyTorch* Performance on the Latest Intel® CPUs and GPUs
Hands-On Workshop: Accelerate PyTorch Applications Using Intel® oneAPI Toolkit
Optimize the Latest Deep Learning Workloads Using Intel Optimization for PyTorch
How to Improve TorchServe Inference Performance with Intel Extension for PyTorch
Demonstrations
Achieve Up to 1.77x Boost Ratio for Your AI Workloads
Learn the difference between stock PyTorch and the Intel Extension for PyTorch, followed by in-depth explanations of the key techniques that power this extension.
Increase PyTorch Inference Throughput by 4x
See how to accelerate PyTorch-based inferencing by applying optimizations from the Intel Extension for PyTorch and quantizing to int8.
Accelerate MedMNIST Training and Inference with Intel Extension for PyTorch
See how to use Intel Extension for PyTorch for training and inference on the MedMNIST datasets. The demonstration compares the results against stock PyTorch to show the performance gain that Intel Extension for PyTorch offers.
Speed Training 8x Using PyTorch with a oneCCL Back End
Compare the distributed training performance of the deep learning recommendation model (DLRM) using oneCCL and other leading back ends.
Case Studies
AI-Based Customer Service Automation—Conversations in the Cloud
MindTitan* and Intel worked together to optimize MindTitan’s TitanCS solution using Intel Extension for PyTorch, improving inference performance on Intel CPUs and enabling better real-time call analysis.
KT Optimizes Performance for Personalized Text-to-Speech
Technologists from KT (formerly Korea Telecom) and Intel worked together to optimize the performance of the company’s P-TTS service. The optimized CPU-based solution improved real-time factor (RTF) performance by 22 percent while maintaining voice quality and the number of connections.
News
Introducing Intel Extension for PyTorch for GPUs
This extension now supports Intel GPUs. Learn which features are supported in this release, how to install it, and how to get started running PyTorch on Intel GPUs.
Empower PyTorch on Intel® Xeon® Scalable processors with bfloat16
Intel and Meta continue to collaborate to improve PyTorch bfloat16 performance by taking advantage of Intel AVX-512 and Intel AMX instruction set extensions.
What Is New in Intel Extension for PyTorch
This presentation from the PyTorch Conference 2022 provides insight into the software optimizations and features (such as GPU support) that are introduced in Intel Extension for PyTorch and upstreamed to open source PyTorch over time.
PyTorch v1.13: A New Potential to Enhance Model Performance and Accuracy
Monitor and improve application performance with new Intel Optimizations and features in the open source framework and in Intel Extension for PyTorch.
Accelerate Deep Learning with Intel Extension for PyTorch
See how to use Intel Extension for PyTorch to take advantage of optimizations before they become part of a stock PyTorch release. Apply the newest developments to optimize your PyTorch models running on Intel hardware.
Intel and Facebook* Accelerate PyTorch Performance
Facebook* and Intel collaborated to improve PyTorch performance on 3rd generation Intel® Xeon® Scalable processors by harnessing the new bfloat16 capability in Intel® Deep Learning Boost, delivering training and inference performance boosts for a variety of model and data types.
Specifications
Processors:
- Intel Xeon processor
- Intel® Core™ processor
- Intel® Data Center GPU Flex Series
Operating systems:
- Linux* (Intel Extension for PyTorch is for Linux only)
- Windows*
Languages:
- Python
- C++