Developer Guide

Contents

Latency Controls (Beta)

The
Intel® oneAPI
DPC++/C++
Compiler
allows you to set latency constraints between operations with side effects, such as pipes and LSUs, which are visible outside the kernel. Specifically, you can apply latency controls to pipe read/write and LSU load/store.
For stallable operations, the scheduler considers only the inherent latency of the operation without making any assumption about the actual stall time. The compiler strives to achieve the latency constraints. If it cannot achieve the latency controls, the compiler errors out.
Latency controls is a Beta feature currently. In a future release, its API will change.

Syntax

While latency controls is a Beta feature, you must declare the side effects (pipes and LSUs) that latency controls apply to in the
sycl::ext::intel::experimental
namespace. For example:
#include <sycl/ext/intel/fpga_extensions.hpp> using Pipe = sycl::ext::intel::experimental::pipe<class PipeClass, int, 8>;
You must set a latency constraint between an anchor and a non-anchor side-effect operation. You can specify an anchor and a constraint on a side-effect operation with the following two template arguments, which are also in the
sycl::ext::intel::experimental
namespace:
Template argument
Description
Example
sycl::ext::intel::experimental::latency_anchor_id<
N
>
Specifies the ID of the current side-effect operation where it behaves as an anchor.
N
is an integer and its default value is -1.
// This pipe read() performs as anchor 0 in latency control. Pipe::read<ext::intel::experimental::latency_anchor_id<0>>();
sycl::ext::intel::experimental::latency_constraint<
A, B, C
>
Specifies the latency constraint when the current side-effect operation behaves as a non-anchor, where:
  • A
    is an integer that specifies the ID of the target anchor defined on a different operation through the
    latency_anchor_id
    argument.
  • B
    is an
    enum
    value that specifies one of the control type from the set
    {type::exact, type::max, type::min}
    .
  • C
    is an integer that specifies the relative clock-cycle difference between the target anchor and the current side-effect operation that the constraint must infer subject to the control type (
    exact
    ,
    max
    , or
    min
    ). This relative cycle can be both positive and negative. A positive relative cycle means the anchor occurs before the current side effect.
// Set a latency constraint between anchor 0 and this pipe write(). // This pipe write() starts exactly 2 cycles after anchor 0 is done. Pipe::write<ext::intel::experimental::latency_constraint<0, ext::intel::experimental::type::exact, 2>>(...);
  • You can specify two template arguments in an arbitrary order. You need not always specify both template arguments together.
  • If you do not specify either of the template arguments, the compiler does not apply latency control.
Example
The following is an example of applying latency controls between side-effect operations:
#include <sycl/ext/intel/fpga_extensions.hpp> ... using namespace sycl; using Pipe1 = ext::intel::experimental::pipe<class PipeClass1, int, 8>; using Pipe2 = ext::intel::experimental::pipe<class PipeClass2, int, 8>; using BurstCoalescedLSU = ext::intel::experimental::lsu< ext::intel::experimental::burst_coalesce<false>, ext::intel::experimental::statically_coalesce<false>>; ... // Set read() as anchor 0. Pipe1::read<ext::intel::experimental::latency_anchor_id<0>>(); // write() starts exactly 2 cycles after anchor 0 read() is done. // Set write() as anchor 1. Pipe2::write<ext::intel::experimental::latency_constraint< 0, ext::intel::experimental::type::exact, 2>, ext::intel::experimental::latency_anchor_id<1>>(...); // store() starts at most 5 cycles after anchor 1 write() is done. BurstCoalescedLSU::store<ext::intel::experimental::latency_constraint< 1, ext::intel::experimental::type::max, 5>>(...);

Rules and Limitations

  • Anchor ID must be a non-negative number.
  • Anchor ID must be a unique number within the whole design.
  • Two endpoints of a constraint must meet one of the following conditions:
    • Both endpoints are not in any cluster.
    • Both endpoints are in the same cluster.

Product and Performance Information

1

Performance varies by use, configuration and other factors. Learn more at www.Intel.com/PerformanceIndex.