Software is essential to delivering on the promise of AI. Whether they are shipping production models or doing research, developers need optimizations to accelerate machine learning and deep learning algorithm performance. Having the latest software tools to speed development and deployment for heterogeneous hardware architectures enables developers to create new and compelling data-centric AI solutions wherever data lives.
Intel has long believed in democratizing technology and making it available to all developers, hence our long-standing commitment to open-source technologies. We are taking a similar approach with AI on well-known, highly adaptable Intel® architecture by working with a large community of open-source developers to deliver frameworks and optimizations for the next generation of AI solutions. Further, we’re cultivating the AI ecosystem to support developers through any rough patches.
One of the many communities we are excited to support drives the development of PyTorch*. On the occasion of the first PyTorch Developer Conference, we’d like to highlight some of Intel’s latest contributions, including Intel direct optimizations for PyTorch, nGraph Compiler with ONNXIFI* support, and Neural Network Distiller, a PyTorch environment for neural network compression research.
What is PyTorch*?
Developed primarily by Facebook Artificial Intelligence, PyTorch is an open-source deep learning framework that is popular with AI researchers for its flexibility and ease-of-use. With the release of PyTorch 1.0, developers can now seamlessly move from research prototyping to production deployment in a single platform through the unification of the fast, native Caffe2* codebase and the existing PyTorch framework.
Intel Contributions to PyTorch
Intel continues to accelerate and streamline PyTorch on Intel architecture, most notably Intel® Xeon® Scalable processors, given their prevalence and their capacity for the training workloads many users run today. We are doing this both by using Intel® Math Kernel Library for Deep Neural Networks (Intel® MKL-DNN) directly and by making sure, through the nGraph Compiler, that PyTorch is ready for our next generation of performance improvements in both software and hardware. In addition, we are extending PyTorch in new ways through our compression project, Distiller.
Intel Direct Optimizations for PyTorch Provide Inference Throughput Increases
Intel has optimized deep learning operators in the PyTorch and Caffe2 backends using Intel MKL-DNN. Common model optimization techniques such as constant folding and operator fusion are also supported to speed up computation further.
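To make these two graph optimizations concrete, here is a toy, framework-free sketch of constant folding and operator fusion on a tiny expression graph. The graph representation, op names, and helper functions below are illustrative assumptions, not Intel's MKL-DNN implementation:

```python
# Toy graph: a list of (op, input_names, output_name) tuples.
# Constants known at compile time live in a dict.

def constant_fold(graph, constants):
    """Evaluate ops whose inputs are all compile-time constants."""
    remaining = []
    for op, inputs, output in graph:
        if all(name in constants for name in inputs):
            vals = [constants[name] for name in inputs]
            if op == "add":
                constants[output] = vals[0] + vals[1]
            elif op == "mul":
                constants[output] = vals[0] * vals[1]
        else:
            remaining.append((op, inputs, output))
    return remaining

def fuse_mul_add(graph):
    """Fuse a mul immediately feeding an add into one 'muladd' op."""
    fused, i = [], 0
    while i < len(graph):
        if (i + 1 < len(graph)
                and graph[i][0] == "mul"
                and graph[i + 1][0] == "add"
                and graph[i][2] in graph[i + 1][1]):
            mul_node, add_node = graph[i], graph[i + 1]
            other = [n for n in add_node[1] if n != mul_node[2]]
            fused.append(("muladd", mul_node[1] + other, add_node[2]))
            i += 2
        else:
            fused.append(graph[i])
            i += 1
    return fused

# y = x * w + b, plus a node (w * b) that is fully constant:
graph = [("mul", ["x", "w"], "t"),
         ("add", ["t", "b"], "y"),
         ("mul", ["w", "b"], "c")]
constants = {"w": 2.0, "b": 1.0}

graph = constant_fold(graph, constants)  # precomputes c = w * b
graph = fuse_mul_add(graph)              # collapses mul+add into one kernel
```

After both passes, the runtime graph is a single fused `muladd` node, which is the shape of win these optimizations deliver: fewer kernel launches and fewer intermediate tensors.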
These optimization techniques leverage the multiple cores and vectorization capabilities of Intel Xeon Scalable processors, boosting the performance of convolutional neural networks (CNNs) and recurrent neural networks (RNNs) compared with the vanilla PyTorch/Caffe2 CPU backend. In addition, Intel has filed a pull request to add INT8 low-precision inference to the Caffe2 native code paths, which are now part of PyTorch 1.0, in order to leverage hardware acceleration support in Intel Xeon Scalable processors. Low-precision inference enables more operations per second, reduces memory access pressure, and better utilizes the cache to deliver higher inference throughput with lower latency. The feature will appear in PyTorch master shortly.
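The core idea behind INT8 low-precision inference can be shown in a few lines: map floating-point values onto 8-bit integers with a single scale factor, then dequantize when needed. This is a minimal generic sketch of symmetric linear quantization, not the actual Caffe2/MKL-DNN code path:

```python
def quantize_int8(values):
    """Symmetric linear quantization into [-127, 127] with one scale."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs else 1.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize(q, scale):
    """Map INT8 values back to approximate floats."""
    return [v * scale for v in q]

weights = [0.5, -1.27, 0.03, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_err = max(abs(a - b) for a, b in zip(weights, restored))
```

Each INT8 value occupies a quarter of the memory of an FP32 value, which is where the throughput, bandwidth, and cache benefits described above come from; the cost is the small rounding error `max_err` measures.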
Enhancing PyTorch Using ONNXIFI* and the nGraph Compiler
PyTorch is well received in the deep learning community for its ease of use and flexibility. At Intel, we’ve been working hard to ensure that PyTorch is ready for the next generation of Intel hardware and performance optimizations by contributing to the Open Neural Network Exchange* (ONNX*).
The nGraph Compiler already supports running PyTorch models through the ONNX model description format, and we are pleased to make this support even easier for users by becoming one of the first hardware vendors to support the ONNX Interface for Framework Integration* (ONNXIFI*). ONNXIFI is a cross-platform API for loading and executing ONNX-formatted models on optimized backends. ONNXIFI support for PyTorch lays the foundation for data scientists to benefit from new hardware and optimizations without any explicit model conversion step. Everything happens seamlessly behind the scenes.
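The division of labor ONNXIFI establishes can be sketched at a high level: the framework hands a serialized graph to a backend, the backend compiles it once, and subsequent calls execute the compiled plan. The real ONNXIFI interface is a C API; the Python class, method names, and two-op "model" below are purely illustrative:

```python
class OptimizedBackend:
    """Pretend hardware backend that 'compiles' a graph of named ops."""

    # Hypothetical kernel table standing in for an optimized op library.
    KERNELS = {
        "Relu": lambda xs: [max(0.0, v) for v in xs],
        "Neg":  lambda xs: [-v for v in xs],
    }

    def init_graph(self, graph):
        # Compilation step: resolve every op to a kernel up front,
        # so per-inference dispatch cost is paid only once.
        self.plan = [self.KERNELS[op] for op in graph]

    def run_graph(self, values):
        # Execution step: run the precompiled plan on fresh inputs.
        for kernel in self.plan:
            values = kernel(values)
        return values

# The framework side only serializes and hands off -- no manual conversion:
backend = OptimizedBackend()
backend.init_graph(["Neg", "Relu"])   # e.g. an exported two-op model
out = backend.run_graph([-2.0, 3.0])
```

The point of the split is that the backend is free to optimize `init_graph` aggressively (fusion, layout changes, device placement) while the framework's user-facing API stays unchanged.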
This means that PyTorch developers can soon leverage Intel’s nGraph Compiler for further deep learning optimizations, augmenting those offered by Intel MKL-DNN. nGraph Compiler is the first compiler to support both inference and training workloads across multiple frameworks and hardware architectures. Once ONNX supports training, developers will be able to benefit further from nGraph Compiler’s training optimizations.
We currently have two PyTorch models (ResNet50 and Super Resolution) running through ONNXIFI and nGraph Compiler, and as we add support for more topologies, nGraph Compiler will provide not only a performance boost but also the option to deploy to an array of devices, from edge to cloud.
Compressing Deep Learning Models with Neural Network Distiller
Deep learning has recently gained favor in many industries, including entertainment, healthcare, energy, and finance. This means that deep learning applications need to execute across the full span of the compute spectrum, from edge devices to the cloud. Increasingly, it is important for researchers and businesses to explore advancements in neural network compression techniques to best fit the unique needs of each environment.
With these needs in mind, Intel made the Neural Network Distiller, a PyTorch environment for neural network compression research, available to the open source community earlier this year. Distiller brings together state-of-the-art DNN compression algorithms from the research community and PyTorch sample applications to demonstrate the applicability and effectiveness of the algorithms in different task settings. The algorithms span a wide range of disciplines: pruning, quantization, regularization, knowledge distillation, conditional computation, low-rank approximation, and automated design-space exploration.
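Of the compression disciplines listed above, pruning is the simplest to illustrate: remove (zero out) weights whose magnitude suggests they contribute little, trading a small accuracy cost for a sparser, smaller model. The snippet below is a generic sketch of magnitude pruning, not Distiller's actual API:

```python
def magnitude_prune(weights, threshold):
    """Zero out weights with |w| below threshold; report resulting sparsity."""
    pruned = [w if abs(w) >= threshold else 0.0 for w in weights]
    sparsity = pruned.count(0.0) / len(pruned)
    return pruned, sparsity

# A hypothetical layer's weights; small-magnitude values get pruned away.
layer = [0.8, -0.05, 0.02, -1.1, 0.009, 0.4]
pruned, sparsity = magnitude_prune(layer, threshold=0.1)
```

In practice, frameworks like Distiller apply such criteria per layer under a scheduling policy and typically fine-tune afterward to recover accuracy; the sparsity figure is what downstream hardware or sparse kernels can exploit.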
Distiller aims to be a community project where researchers compare their algorithms to existing methods, and deep learning practitioners try compression methods for their specific problems. We selected PyTorch for this project because of its ease of use and familiarity to the deep learning research and data science community and because it supports ONNX, which means that research projects can readily be migrated to a production environment.
Get Started with PyTorch on Intel® Architecture
Please visit GitHub* to get started with PyTorch on Intel architecture. We encourage you to give us feedback or ask questions through GitHub and Stack Overflow*, and we invite you to contribute to the project yourself as well. We are here to help.
At Intel, we are proud to support the thriving community around PyTorch. We look forward to meeting with friends and collaborators at the first and future PyTorch Developer Conferences as well as continuing to support this ecosystem on the road to PyTorch 1.0.
Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system configuration. No computer system can be absolutely secure. Check with your system manufacturer or retailer or learn more at intel.com.
Intel and the Intel logo are trademarks of Intel Corporation or its subsidiaries in the U.S. and/or other countries.
*Other names and brands may be claimed as the property of others.
© Intel Corporation