Developer Guide

Contents

The pipe Class and its Use

The pipe Class and its Use

The pipe API exposed by the DPC++ FPGA implementation is equivalent to the following class declaration:

```cpp
template <class name, class dataT, size_t min_capacity = 0>
class pipe {
 public:
  // Blocking
  static dataT read();
  static void write(dataT data);

  // Non-blocking
  static dataT read(bool &success_code);
  static void write(dataT data, bool &success_code);
};
```
The following table describes the template parameters:

Template Parameters

| Parameter | Description |
| --- | --- |
| name | The type that is the basis of a pipe's identification. It is typically a user-defined class in a user namespace. A forward declaration of the type is enough; the type need not be defined. |
| dataT | The type of data packet contained within a pipe. This is the data type that is read during a successful pipe read() operation, or written during a successful pipe write() operation. The type must have a standard layout and be trivially copyable. |
| min_capacity | The user-defined minimum number of words (in units of dataT) that the pipe must be able to store without any being read out. The compiler may create a pipe with a larger capacity due to performance considerations. |
The pipe class exposes static methods for writing a data word to a pipe and reading a data word from a pipe. The reads and writes can be blocking or non-blocking, with the form chosen based on overload resolution. A data word in this context is the data type that the pipe contains (the dataT pipe template argument).
Example Code Using Blocking Inter-Kernel Pipes
When writing code with DPC++ pipes, use of the C++ type alias mechanism (using) is highly encouraged to avoid errors where slightly different pipe types inadvertently lead to unique pipes. The following code sample shows how to use pipes with blocking accessors to transfer data between two kernels:
```cpp
#include <CL/sycl.hpp>

using namespace cl::sycl;

constexpr int N = 3;

// Specialize a pipe type
using my_pipe = ext::intel::pipe<class some_pipe, int, 8>;

void producer(const std::array<int, N> &src) {
  queue q;

  // Launch the producer kernel
  buffer<int> src_buf = {std::begin(src), std::end(src)};
  q.submit([&](handler &cgh) {
    // Get read access to src array
    accessor rd_src_buf(src_buf, cgh, read_only);
    cgh.single_task<class producer>([=]() {
      for (int i = 0; i < N; i++) {
        // Blocking write an int to the pipe
        my_pipe::write(rd_src_buf[i]);
      }
    });
  });
}

void consumer(std::array<int, N> &dst) {
  queue q;

  // Launch the consumer kernel
  buffer<int> dst_buf = {std::begin(dst), std::end(dst)};
  q.submit([&](handler &cgh) {
    // Get write access to dst array
    accessor wr_dst_buf(dst_buf, cgh, write_only);
    cgh.single_task<class consumer>([=]() {
      for (int i = 0; i < N; i++) {
        // Blocking read an int from the pipe
        wr_dst_buf[i] = my_pipe::read();
      }
    });
  });
}
```
The pipe data packet is of type int, and the pipe has a depth of 8, as specified by the template parameters of the my_pipe type. The pipe read() call blocks only when the pipe is empty, and the pipe write() call blocks only when the pipe is full.
The DPC++ specification does not guarantee concurrent kernel execution. However, the Intel® oneAPI DPC++/C++ Compiler supports concurrent execution of kernels. You can execute multiple DPC++ kernels concurrently by launching them from separate command queues (as shown in the Example Code Using Blocking Inter-Kernel Pipes above). You can therefore modify your host application and kernel program to take advantage of this capability and increase the throughput of your application.
Example Code Using Non-Blocking Inter-Kernel Pipes
The code samples (Sample 1 and Sample 2) in this section illustrate how to use pipes with non-blocking writes and reads to transfer data between two concurrently running kernels:
```cpp
// Sample 1
#include <CL/sycl.hpp>

using namespace cl::sycl;

constexpr size_t N = 16;

// Specialize the two pipe types, differentiated based on their
// first template parameter
using pipe1 = ext::intel::pipe<class some_pipe, int>;
using pipe2 = ext::intel::pipe<class other_pipe, int>;

void producer(const std::array<int, N> &src) {
  queue q;

  // Launch the producer kernel
  buffer<int> src_buf = {std::begin(src), std::end(src)};
  q.submit([&](handler &cgh) {
    // Get read access to src array
    accessor rd_src_buf(src_buf, cgh, read_only);
    cgh.single_task<class producer>([=]() {
      for (int i = 0; i < N; i++) {
        bool success = false;
        do {
          pipe1::write(rd_src_buf[i], success);
          if (!success) {
            pipe2::write(rd_src_buf[i], success);
          }
        } while (!success);
      }
    });
  });
}

// The consumer kernels are not shown here
```
For both pipes, the data packet is of type int. The pipes are distinct because their first template parameter is different. The non-blocking pipe write() and read() calls do not block. Instead, each sets a boolean success code that indicates whether the data was written successfully to the pipe (that is, the pipe was not full) or read successfully from the pipe (that is, the pipe was not empty).
Perform non-blocking pipe writes to facilitate applications where writes to a full FIFO buffer should not cause the kernel to stall until a slot in the FIFO buffer becomes free. Consider a scenario where your application has one data producer with two identical workers that consume the data. Assume the time each worker takes to process a message varies depending on the contents of the data. In this case, there might be a situation where one worker is busy while the other is free. A non-blocking write can facilitate work distribution such that both workers are busy. Like a non-blocking write, perform non-blocking reads to facilitate applications where data is not always available, and other operations need not wait for the data to become available.
You can mix blocking and non-blocking accessors for writing or reading data to or from pipes. For example, you can write data to a pipe using a blocking pipe write() call and read it from the other end using a non-blocking pipe read() call, and vice versa.
```cpp
// Sample 2
#include <CL/sycl.hpp>

using namespace cl::sycl;

constexpr size_t N = 16;

// Specialize the two pipe types, differentiated based on their first template
// parameter
using pipe1 = ext::intel::pipe<class some_pipe, int>;
using pipe2 = ext::intel::pipe<class other_pipe, int>;

// The producer kernels are not shown

void consumer(std::array<int, N> &dst) {
  queue q;

  // Launch the consumer kernel
  buffer<int> dst_buf = {std::begin(dst), std::end(dst)};
  q.submit([&](handler &cgh) {
    // Get write access to dst array
    accessor wr_dst_buf(dst_buf, cgh, write_only);
    cgh.single_task<class consumer>([=]() {
      int i = 0;
      while (i < N) {
        bool valid0 = false, valid1 = false;
        auto data0 = pipe1::read(valid0);
        auto data1 = pipe2::read(valid1);
        if (valid0) {
          wr_dst_buf[i++] = process(data0);
        }
        if (valid1) {
          wr_dst_buf[i++] = process(data1);
        }
      }
    });
  });
}
```
For additional information, refer to FPGA tutorial samples "pipe_array" and "pipes" listed in the Intel® oneAPI Samples Browser on Linux* or Intel® oneAPI Samples Browser on Windows*, or access the code samples in GitHub.
