Developer Guide

  • 2021.7.1
  • 09/08/2022
  • Public Content

Pass Data to Algorithms

You can pass data to an algorithm executed with a device policy in one of the following ways:
  • oneapi::dpl::begin and oneapi::dpl::end functions
  • Unified shared memory (USM) pointers and std::vector with USM allocators
  • Iterators of host-side std::vector

Use oneapi::dpl::begin and oneapi::dpl::end Functions

oneapi::dpl::begin and oneapi::dpl::end are special helper functions that allow you to pass SYCL buffers to parallel algorithms. These functions accept a SYCL buffer and return an object of an unspecified type that provides the following API:
  • It satisfies the CopyConstructible and CopyAssignable C++ named requirements and is comparable with operator== and operator!=.
  • It supports the following valid expressions: a + n, a - n, and a - b, where a and b are objects of the type, and n is an integer value. The effect of those operations is the same as for a type that satisfies the LegacyRandomAccessIterator C++ named requirement.
  • It provides the get_buffer method, which returns the buffer passed to the begin and end functions.
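The requirements listed above can be illustrated with a small stand-in type written in standard C++ only. This is a minimal sketch, not the oneDPL implementation: the names buffer_sketch, buffer_position_sketch, begin_sketch, end_sketch, and subrange_length_sketch are hypothetical, and a shared std::vector stands in for a SYCL buffer so that the example compiles without a SYCL toolchain. Dereferencing is not among the listed operations, so the sketch deliberately omits it.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical stand-in for sycl::buffer<int>: a shared handle to storage.
// It only models "copies of the handle refer to the same underlying data".
using buffer_sketch = std::shared_ptr<std::vector<int>>;

// Hypothetical model of the object returned by begin/end: a buffer handle
// plus an offset, offering only the operations listed in the text.
class buffer_position_sketch {
public:
    buffer_position_sketch(buffer_sketch buf, std::ptrdiff_t offset)
        : buf_(std::move(buf)), offset_(offset) {}

    // a + n and a - n: shift the position by n elements.
    friend buffer_position_sketch operator+(const buffer_position_sketch& a, std::ptrdiff_t n) {
        return buffer_position_sketch(a.buf_, a.offset_ + n);
    }
    friend buffer_position_sketch operator-(const buffer_position_sketch& a, std::ptrdiff_t n) {
        return buffer_position_sketch(a.buf_, a.offset_ - n);
    }
    // a - b: distance between two positions.
    friend std::ptrdiff_t operator-(const buffer_position_sketch& a, const buffer_position_sketch& b) {
        return a.offset_ - b.offset_;
    }
    friend bool operator==(const buffer_position_sketch& a, const buffer_position_sketch& b) {
        return a.buf_ == b.buf_ && a.offset_ == b.offset_;
    }
    friend bool operator!=(const buffer_position_sketch& a, const buffer_position_sketch& b) {
        return !(a == b);
    }

    // Returns the buffer that was passed to begin_sketch / end_sketch.
    buffer_sketch get_buffer() const { return buf_; }

private:
    buffer_sketch buf_;
    std::ptrdiff_t offset_;
};

static_assert(std::is_copy_constructible<buffer_position_sketch>::value, "must be copy constructible");
static_assert(std::is_copy_assignable<buffer_position_sketch>::value, "must be copy assignable");

buffer_position_sketch begin_sketch(const buffer_sketch& buf) {
    return buffer_position_sketch(buf, 0);
}

buffer_position_sketch end_sketch(const buffer_sketch& buf) {
    return buffer_position_sketch(buf, static_cast<std::ptrdiff_t>(buf->size()));
}

// Length of the subrange that skips skip_front leading and skip_back trailing elements.
std::ptrdiff_t subrange_length_sketch(const buffer_sketch& buf,
                                      std::ptrdiff_t skip_front,
                                      std::ptrdiff_t skip_back) {
    buffer_position_sketch first = begin_sketch(buf) + skip_front;
    buffer_position_sketch last = end_sketch(buf) - skip_back;
    assert(first.get_buffer() == last.get_buffer());  // both refer to the same buffer
    return last - first;
}
```

In this model, the arithmetic expressions are what make it possible to hand an algorithm a subrange of a buffer: for a 1000-element buffer, subrange_length_sketch(buf, 10, 20) describes a range of 970 elements.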
The begin and end functions can take SYCL 2020 deduction tags and sycl::no_init as arguments to explicitly specify which access mode should be applied to the buffer accessor when submitting a SYCL kernel to a device. For example:
    auto first1 = begin(buf, sycl::read_only);
    auto first2 = begin(buf, sycl::write_only, sycl::no_init);
    auto first3 = begin(buf, sycl::no_init);
The example above shows how to control the access mode for a particular buffer passed to a parallel algorithm.
To use the functions, add #include <oneapi/dpl/iterator> to your code. For example:
    #include <oneapi/dpl/execution>
    #include <oneapi/dpl/algorithm>
    #include <oneapi/dpl/iterator>
    #include <CL/sycl.hpp>

    int main() {
        sycl::buffer<int> buf { 1000 };
        auto buf_begin = oneapi::dpl::begin(buf);
        auto buf_end = oneapi::dpl::end(buf);
        std::fill(oneapi::dpl::execution::dpcpp_default, buf_begin, buf_end, 42);
        return 0;
    }

Use Unified Shared Memory

The following examples demonstrate two ways to use the parallel algorithms with USM:
  • USM pointers
  • USM allocators
If you have a USM-allocated buffer, pass pointers to its beginning and to one past its end to a parallel algorithm. Make sure that the execution policy and the buffer were created for the same queue. For example:
    #include <oneapi/dpl/execution>
    #include <oneapi/dpl/algorithm>
    #include <CL/sycl.hpp>

    int main() {
        sycl::queue q;
        const int n = 1000;
        int* d_head = sycl::malloc_shared<int>(n, q);
        std::fill(oneapi::dpl::execution::make_device_policy(q), d_head, d_head + n, 42);
        sycl::free(d_head, q);
        return 0;
    }
Alternatively, use std::vector with a USM allocator. For example:
    #include <oneapi/dpl/execution>
    #include <oneapi/dpl/algorithm>
    #include <CL/sycl.hpp>
    #include <vector>

    int main() {
        const int n = 1000;
        auto policy = oneapi::dpl::execution::dpcpp_default;
        sycl::usm_allocator<int, sycl::usm::alloc::shared> alloc(policy.queue());
        std::vector<int, decltype(alloc)> vec(n, alloc);
        std::fill(policy, vec.begin(), vec.end(), 42);
        return 0;
    }
When using device USM, such as memory allocated by malloc_device, manually copy data to this memory before calling oneDPL algorithms, and copy it back once the algorithms have finished execution.

Use Host-Side std::vector

Intel® oneAPI DPC++ Library parallel algorithms can be called with ordinary (host-side) iterators, as seen in the example below. In this case, a temporary SYCL buffer is created, and the data is copied to this buffer. After processing on a device is complete, the modified data is copied from the temporary buffer back to the host container. For example:
    #include <oneapi/dpl/execution>
    #include <oneapi/dpl/algorithm>
    #include <vector>

    int main() {
        std::vector<int> vec( 1000 );
        std::fill(oneapi::dpl::execution::dpcpp_default, vec.begin(), vec.end(), 42);
        // each element of vec is now equal to 42
        return 0;
    }
Working with SYCL buffers is recommended to reduce data copying between the host and device.
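The copy-in and copy-out behavior described for host-side iterators can be modeled in standard C++ to show where the extra data movement comes from. This is a minimal sketch under stated assumptions, not what oneDPL does internally: fill_via_temporary_sketch and copy_stats_sketch are hypothetical names, a std::vector stands in for the temporary SYCL buffer, and no device is involved.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical bookkeeping: how many elements were moved in each direction.
struct copy_stats_sketch {
    std::size_t copied_in = 0;
    std::size_t copied_out = 0;
};

// Models the sequence described for host-side iterators:
//   1) copy the host range into a temporary buffer,
//   2) process the temporary buffer,
//   3) copy the modified data back to the host container.
template <typename Iterator>
copy_stats_sketch fill_via_temporary_sketch(
    Iterator first, Iterator last,
    const typename std::iterator_traits<Iterator>::value_type& value) {
    using value_type = typename std::iterator_traits<Iterator>::value_type;
    copy_stats_sketch stats;

    std::vector<value_type> temporary(first, last);          // step 1: host -> temporary
    stats.copied_in = temporary.size();

    std::fill(temporary.begin(), temporary.end(), value);    // step 2: stand-in for device work

    std::copy(temporary.begin(), temporary.end(), first);    // step 3: temporary -> host
    stats.copied_out = temporary.size();

    assert(stats.copied_in == stats.copied_out);  // every element travels both ways
    return stats;
}
```

In this model, every call on a 1000-element host vector moves 1000 elements in and 1000 elements out, so a chain of k algorithm calls pays for 2k full copies. Keeping the data in one SYCL buffer across calls is intended to avoid repeating those copies for each algorithm.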
