Developer Guide

  • 2021.3
  • 11/18/2021
  • Public

When to Use the Data Streams Optimizer

Intel provides a tuning strategy to meet real-time application requirements. Tuning is divided into the following categories:
  • System software tuning
  • Power management tuning
  • Intel® TCC features tuning
  • Fabric tuning
Different types of tuning have different effects on latency, where latency is the duration of time between two events. It can mean any type of latency: I/O latency, buffer access latency, or the sum of these and others, such as network packet in to network packet out (cycle time). Use cases with more relaxed requirements need less tuning to meet their latency requirements, while use cases with stricter requirements require substantial tuning effort. The following diagram illustrates how different tuning categories help decrease the worst-case latency.
When tuning the platform, address higher-impact tuning first (system software tuning), then work down to lower-impact tuning until the requirements are satisfied. Types of tuning that are further to the right on the diagram are more complex and incur higher penalties to concurrent workloads than those on the left. However, they may unlock the best possible real-time performance when enabled and used in combination with the other types of tuning.
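The ordering described above can be sketched as a simple loop. This is an illustrative sketch only, not part of any Intel tool: the function name `plan_tuning` and the latency figures are hypothetical, and in practice the worst-case latency after each step would come from your own measurements.

```python
# Tuning categories in the order the guide recommends addressing them.
TUNING_ORDER = [
    "system software tuning",
    "power management tuning",
    "Intel TCC features tuning",
    "fabric tuning",
]


def plan_tuning(worst_case_after_us, requirement_us):
    """Return (categories applied, requirement met).

    worst_case_after_us maps each category to the worst-case latency in
    microseconds measured after that category has been applied on top of
    the previous ones. The caller supplies these values (hypothetical here).
    """
    applied = []
    for category in TUNING_ORDER:
        applied.append(category)
        # Stop as soon as the measured worst case satisfies the requirement.
        if worst_case_after_us[category] <= requirement_us:
            return applied, True
    # Every category was applied and the requirement is still not met.
    return applied, False


# Made-up measurements, for illustration only.
measured = {
    "system software tuning": 400.0,
    "power management tuning": 180.0,
    "Intel TCC features tuning": 90.0,
    "fabric tuning": 60.0,
}
plan, met = plan_tuning(measured, requirement_us=100.0)
print(plan, met)
```

With these made-up numbers the loop stops after the third category, so fabric tuning is never applied. Stopping early matters because the later categories are more complex and cost more to concurrent workloads.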
System software, such as the OS and drivers, has the highest impact on real-time performance. System software tuning is described in the Real-Time Tuning Guide, and is not configured by the data streams optimizer.
It is advisable to first try the Intel-provided system software, which consists of the board support package (BSP) with real-time optimizations and the reference BIOS with Intel® TCC Mode. This “out of the box” tuning configuration can meet a certain level of real-time performance and provides baseline tuning of the top three categories: system software tuning, power management tuning, and Intel® TCC features tuning. If the Intel-provided system software doesn’t meet your needs, move to the more advanced level of tuning possible with the data streams optimizer. It provides a finer granularity of tuning, affects power management tuning and Intel® TCC features tuning, and is the only way to access fabric tuning.
Some of the power management tuning configured by the data streams optimizer includes BIOS options already configured by Intel® TCC Mode. The data streams optimizer may change settings that improve real-time performance to settings that improve power savings. This adjustment is made to achieve the balance of performance and power dictated by system requirements. Intel® TCC Mode is recommended for maximum performance, but if power and thermal constraints must be taken into account, the data streams optimizer may reconfigure Intel® TCC Mode settings (one option, a few options, or all options) to minimize power consumption at the cost of latency performance.
To summarize, by tuning category:
  • System software tuning: BSP with real-time optimizations
  • Power management tuning: out-of-the-box tuning with the BSP with real-time optimizations and Intel® TCC Mode in BIOS; advanced tuning with the data streams optimizer
  • Intel® TCC features tuning: out-of-the-box tuning with the BSP with real-time optimizations and Intel® TCC Mode in BIOS; advanced tuning with the data streams optimizer
  • Fabric tuning: data streams optimizer
The data streams optimizer will attempt to find the best balance of latency and power management by addressing power management tuning, Intel® TCC features tuning, and fabric tuning.
The data streams optimizer offers extremely fine-grained tuning. Improvements in data movement latency can be seen in the range of 2 to 20 microseconds for workloads with cycle times of less than 250 microseconds and single-stream data movement between a PCIe endpoint and a processor core or memory. A similar level of improvement may also be seen for workloads with higher cycle times and multiple data movements between these endpoints. Improvements of this size are visible only for applications with highly tuned software, where the software jitter is less than the 20 microseconds of tuning delivered by the data streams optimizer. If you can’t measure the data stream latency between these endpoints, the effect of data streams optimizer tuning might not be easily observed. In this case, you might need to perform a deep inspection of your workloads and intra-workload latencies to determine if the data streams optimizer meets your needs.
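One way to judge whether an improvement of this size would be observable is to compare your measured software jitter with the expected gain. The sketch below makes several assumptions: jitter is taken as the spread between the largest and smallest latency sample, the function names are hypothetical, and the sample values are made up. It is not part of the data streams optimizer.

```python
def worst_case_and_jitter(latencies_us):
    """Return (worst-case latency, jitter) for latency samples in microseconds.

    Jitter is defined here as max - min of the samples. This definition is
    an assumption of the sketch, not one taken from the guide.
    """
    worst = max(latencies_us)
    jitter = worst - min(latencies_us)
    return worst, jitter


def improvement_is_observable(latencies_us, expected_improvement_us):
    """True if the measured jitter is smaller than the expected improvement."""
    _, jitter = worst_case_and_jitter(latencies_us)
    return jitter < expected_improvement_us


# Made-up samples for a workload with a cycle time under 250 microseconds.
tuned_software = [212.0, 215.5, 209.0, 221.0]
untuned_software = [180.0, 240.0, 205.0]

print(worst_case_and_jitter(tuned_software))
print(improvement_is_observable(tuned_software, 20.0))
print(improvement_is_observable(untuned_software, 20.0))
```

In the first sample set the jitter is smaller than the 20 microsecond gain, so the gain would stand out. In the second set the jitter is larger than the gain and would mask it.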
Other Intel® TCC Tools capabilities, such as the cache configurator and the cache allocation library, contribute to Intel® TCC features tuning. They are described elsewhere in this guide.
