Scalable & Efficient Distributed Training for Deep Neural Networks
Implement Multi-Node Communication Patterns
The Intel® oneAPI Collective Communications Library (oneCCL) enables developers and researchers to train newer and deeper models more quickly. It does this by using optimized communication patterns to distribute model training across multiple nodes.
The library is designed for easy integration into deep learning frameworks, whether you are implementing them from scratch or customizing existing ones.
Built on top of lower-level communication middleware: the Message Passing Interface (MPI) and libfabric, which transparently support many interconnects, such as Cornelis Networks*, InfiniBand*, and Ethernet.
Optimized for high performance on Intel® CPUs and GPUs.
Lets you trade compute for communication performance to improve the scalability of communication patterns.
Enables efficient implementations of collectives that are heavily used for neural network training, including all-gather, all-reduce, and reduce-scatter.
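To make the three collectives named above concrete, here is a minimal single-process sketch of their semantics. It does not use the oneCCL API. The function names and the list-of-lists representation (one inner list per rank) are assumptions made for illustration, and summation is assumed as the reduction operation.

```python
from typing import List

Vector = List[float]


def reduce_scatter(rank_buffers: List[Vector]) -> List[Vector]:
    """Each rank ends up with one chunk of the element-wise sum."""
    world = len(rank_buffers)
    length = len(rank_buffers[0])
    assert length % world == 0, "sketch assumes the buffer divides evenly"
    chunk = length // world
    # Element-wise sum across all ranks.
    total = [sum(buf[i] for buf in rank_buffers) for i in range(length)]
    # Rank r keeps only chunk r of the summed vector.
    return [total[r * chunk:(r + 1) * chunk] for r in range(world)]


def all_gather(rank_chunks: List[Vector]) -> List[Vector]:
    """Every rank ends up with the concatenation of all ranks' chunks."""
    gathered = [x for chunk in rank_chunks for x in chunk]
    return [list(gathered) for _ in rank_chunks]


def all_reduce(rank_buffers: List[Vector]) -> List[Vector]:
    """Every rank ends up with the full element-wise sum."""
    # One common decomposition: reduce-scatter followed by all-gather.
    return all_gather(reduce_scatter(rank_buffers))


if __name__ == "__main__":
    # Two simulated ranks, each holding a local gradient vector.
    grads = [[1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0, 40.0]]
    print(reduce_scatter(grads))  # [[11.0, 22.0], [33.0, 44.0]]
    print(all_reduce(grads))      # both ranks hold [11.0, 22.0, 33.0, 44.0]
```

In data-parallel training, all-reduce is typically what sums gradients across workers so that every worker applies the same update. A real library performs these steps over the network rather than in a single process.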
Download as Part of the Toolkit
oneCCL is included as part of the Intel® oneAPI Base Toolkit, which is a core set of tools and libraries for developing high-performance, data-centric applications across diverse architectures.
Get what you need to build and optimize your oneAPI projects for free. With an Intel® Developer Cloud account, you get 120 days of access to the latest Intel® hardware—CPUs, GPUs, FPGAs—and Intel® oneAPI tools and frameworks. No software downloads. No configuration steps. No installations.