Intel® oneAPI Collective Communications Library Developer Guide and Reference
Host Communication
The communication operations between processes are provided by Communicator.
The example below demonstrates the main concepts of communication on host memory buffers.
Example
Consider a simple oneCCL allreduce example for CPU.
Create a communicator object with user-supplied size, rank, and key-value store:
auto ccl_context = ccl::create_context();
auto ccl_device = ccl::create_device();
auto comms = ccl::create_communicators(
    size,
    vector_class<pair_class<size_t, device>>{ { rank, ccl_device } },
    ccl_context,
    kvs);
Or, for convenience, use the non-vector form, which takes no device or context parameters:
auto comm = ccl::create_communicator(size, rank, kvs);
Initialize send_buf (in a real scenario, it is supplied by the user):
const size_t elem_count = <N>;

/* initialize send_buf */
for (size_t idx = 0; idx < elem_count; idx++) {
    send_buf[idx] = rank + 1;
}
The allreduce invocation reduces the values from all processes and then distributes the result to every process. In this case, the result is an array of elem_count elements, each equal to the sum of the arithmetic progression 1 + 2 + ... + size, because each rank contributes the value rank + 1:
ccl::allreduce(send_buf, recv_buf, elem_count, reduction::sum, comm).wait();
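To make the semantics of this call concrete, the following is a minimal single-process sketch that emulates a sum allreduce over several per-rank buffers. It does not use oneCCL at all: simulate_allreduce_sum and make_example_inputs are hypothetical helper names introduced here for illustration only, and the buffers are assumed to hold int values of equal length.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of oneCCL): emulates the result of an
// allreduce with a sum reduction inside a single process.
// send_bufs[r] plays the role of send_buf on rank r.
std::vector<std::vector<int>> simulate_allreduce_sum(
    const std::vector<std::vector<int>>& send_bufs) {
    if (send_bufs.empty()) {
        return {};
    }
    const std::size_t elem_count = send_bufs.front().size();

    // Reduction step: element-wise sum across all ranks.
    std::vector<int> reduced(elem_count, 0);
    for (const auto& buf : send_bufs) {
        assert(buf.size() == elem_count); // all ranks pass the same count
        for (std::size_t idx = 0; idx < elem_count; idx++) {
            reduced[idx] += buf[idx];
        }
    }

    // Distribution step: every rank receives a copy of the reduced buffer.
    return std::vector<std::vector<int>>(send_bufs.size(), reduced);
}

// Hypothetical helper: builds the inputs used in the example above,
// where rank r fills its buffer with the value r + 1.
std::vector<std::vector<int>> make_example_inputs(std::size_t comm_size,
                                                  std::size_t elem_count) {
    std::vector<std::vector<int>> send_bufs;
    for (std::size_t rank = 0; rank < comm_size; rank++) {
        send_bufs.emplace_back(elem_count, static_cast<int>(rank + 1));
    }
    return send_bufs;
}
```

For instance, simulate_allreduce_sum(make_example_inputs(4, 3)) yields four receive buffers of three elements each, and every element equals 1 + 2 + 3 + 4 = 10. A real allreduce produces the same values, but each process holds only its own send and receive buffers and the exchange goes through the communicator.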
Check the correctness of the allreduce operation:
auto comm_size = comm.size();
auto expected = comm_size * (comm_size + 1) / 2;

for (size_t idx = 0; idx < elem_count; idx++) {
    if (recv_buf[idx] != expected) {
        std::cout << "unexpected value at index " << idx << std::endl;
        break;
    }
}
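The expected value used in this check follows directly from how send_buf was initialized. As a short worked derivation, write P for comm_size and r for the rank (both symbols are notation introduced here, assuming ranks are numbered from 0 to P - 1):

```latex
\text{expected} \;=\; \sum_{r=0}^{P-1} (r + 1) \;=\; \sum_{k=1}^{P} k \;=\; \frac{P\,(P + 1)}{2}
```

With P = 4, for example, the expected value is 10. Because P (P + 1) is a product of two consecutive integers, it is always even, so the integer division by 2 in the code is exact.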