Developer Guide

  • 2021.2
  • 06/11/2021
  • Public

Control Data Collection

A collector library performs data collection and processing. The Instrumentation and Tracing Technology API (ITT API), defined in ittnotify.h, loads a collector library at runtime based on the value of the INTEL_LIBITTNOTIFY64 environment variable. The measurement library includes a collector (libtcc_collector.so) that performs data collection and processing for real-time applications.
You can set the INTEL_LIBITTNOTIFY64 environment variable as follows:
export INTEL_LIBITTNOTIFY64=libtcc_collector.so <additional environment variables to control data collection>
At the end of the command, you can add the following environment variables to control collection and processing:
Environment Variable
Description / Examples
TCC_USE_SHARED_MEMORY
Use a shared memory ring buffer instead of a local buffer to enable streaming of all measurement results to a separate application. Can be true or false. Default: false.
When the shared memory mode is enabled, the second (monitoring) application can attach to the shared buffer and process the measurement results.
TCC_MEASUREMENTS_BUFFERS
Define the buffers that will hold the measurements generated from your instrumented real-time application. If your application has multiple measurement instances, you can define a buffer for each instance. One buffer will be created per instance.
Full format description:
[<measurement_name>:<buffer_size>[:deadline]]
[,<measurement_name>:<buffer_size>[:deadline]]…
For each measurement instance in your application, specify:
  • Measurement name: The measurement name must be the same as the one specified in the __itt_string_handle_create call in your application. See Instrument the Code.
  • Buffer size in number of measurements: The maximum number of measurements that can be stored for this measurement instance. Buffer overflow is handled differently depending on the TCC_USE_SHARED_MEMORY option. In non-shared mode, when the buffer is full, newly arriving data always overwrites the existing values, starting from the oldest (ring buffer). In shared mode, new data overwrites the oldest data only after the second (monitoring) application has read it; otherwise, the new data is skipped.
    When choosing a buffer size for your use case, consider a balance between the amount of data that you need to analyze and the memory consumption. Choosing a large buffer will provide more data for analysis, but also consume more application memory. For long measurements with billions of iterations, shared memory is the recommended approach, as it usually consumes significantly less memory. In this case, you need to balance the writing speed of the real-time application and the reading speed of the monitoring application to determine the optimal buffer size to avoid loss of measurements.
    The content of each buffer is stored in the file specified by TCC_MEASUREMENTS_DUMP_FILE when the program finishes. If a buffer overflow occurred, only the last buffer_size measurements are stored in the dump file.
  • (Optional) Deadline: The deadline is the maximum tolerable latency. The deadline value does not influence collector behavior directly. To compare the maximum latency of the measurement instance against various deadlines without recompiling your application, specify the deadline via the TCC_MEASUREMENTS_BUFFERS environment variable and then use the value in your code as shown in Deadline Handling. To specify the time unit of the deadline, use the TCC_MEASUREMENTS_TIME_UNIT environment variable. Default: CPU clock cycles.
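The two overflow policies described above can be illustrated with a minimal sketch. This is not the collector's implementation: ring_t, ring_push, ring_pop, and CAPACITY are hypothetical names, and the shared flag only models the effect of TCC_USE_SHARED_MEMORY on a full buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CAPACITY 4 /* hypothetical buffer_size, kept small for illustration */

typedef struct {
    uint64_t data[CAPACITY];
    size_t head;  /* index of the oldest stored value */
    size_t count; /* number of stored values not yet read */
    bool shared;  /* models TCC_USE_SHARED_MEMORY=true (assumption) */
} ring_t;

/* Stores a value. Returns true if stored, false if skipped. */
bool ring_push(ring_t *r, uint64_t value)
{
    assert(r != NULL);
    if (r->count == CAPACITY) {
        if (r->shared) {
            /* Shared mode: the oldest value has not been read yet,
             * so the new value is skipped. */
            return false;
        }
        /* Local mode: overwrite the oldest value and advance head. */
        r->data[r->head] = value;
        r->head = (r->head + 1) % CAPACITY;
        return true;
    }
    r->data[(r->head + r->count) % CAPACITY] = value;
    r->count++;
    return true;
}

/* Models the monitoring application reading the oldest value.
 * Returns false if the buffer is empty. */
bool ring_pop(ring_t *r, uint64_t *out)
{
    assert(r != NULL && out != NULL);
    if (r->count == 0) {
        return false;
    }
    *out = r->data[r->head];
    r->head = (r->head + 1) % CAPACITY;
    r->count--;
    return true;
}
```

With CAPACITY set to 4, pushing the values 1 through 6 in local mode leaves 3, 4, 5, 6 in the buffer. In shared mode the fifth push is rejected until ring_pop frees a slot, which is why the reader's speed matters when choosing the buffer size.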
Examples:
  • One measurement instance without deadline:
    TCC_MEASUREMENTS_BUFFERS=Workload1:1000
  • One measurement instance with deadline:
    TCC_MEASUREMENTS_BUFFERS=Workload1:1000:10000
  • Two measurement instances with deadline:
    TCC_MEASUREMENTS_BUFFERS=Workload1:1000:65535,Workload2:2000:100000
  • Three measurement instances with or without deadline:
    TCC_MEASUREMENTS_BUFFERS=Workload1:1000:1000,Workload2:100,Workload3:100:5000
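To make the format concrete, the following sketch parses a TCC_MEASUREMENTS_BUFFERS value into a list of entries. It is an illustration only, not the measurement library's parser: buffer_spec_t and parse_buffer_specs are hypothetical names, and validation is deliberately minimal (for example, trailing characters after the last number in an entry are ignored).

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MAX_NAME_LEN 64   /* hypothetical limit on the measurement name */
#define MAX_ENTRY_LEN 128 /* hypothetical limit on one name:size[:deadline] entry */

typedef struct {
    char name[MAX_NAME_LEN];
    uint64_t buffer_size;
    uint64_t deadline; /* 0 when no deadline was given */
    int has_deadline;
} buffer_spec_t;

/* Parses "name:size[:deadline][,name:size[:deadline]]..." into specs.
 * Returns the number of entries parsed, or -1 on malformed input. */
int parse_buffer_specs(const char *value, buffer_spec_t *specs, int max_specs)
{
    assert(value != NULL && specs != NULL);
    int count = 0;
    const char *p = value;

    while (*p != '\0') {
        size_t len = strcspn(p, ","); /* length of the current entry */
        char entry[MAX_ENTRY_LEN];
        if (len == 0 || len >= sizeof(entry) || count == max_specs) {
            return -1;
        }
        memcpy(entry, p, len);
        entry[len] = '\0';

        unsigned long long size = 0, deadline = 0;
        buffer_spec_t *s = &specs[count];
        /* Name up to the first ':', then size, then an optional deadline. */
        int fields = sscanf(entry, "%63[^:]:%llu:%llu", s->name, &size, &deadline);
        if (fields < 2) {
            return -1; /* name and buffer size are both required */
        }
        s->buffer_size = size;
        s->has_deadline = (fields == 3);
        s->deadline = s->has_deadline ? deadline : 0;
        count++;

        p += len;
        if (*p == ',') {
            p++; /* skip the separator */
        }
    }
    return count;
}
```

Applied to the last example above, the sketch yields three entries: Workload1 with size 1000 and deadline 1000, Workload2 with size 100 and no deadline, and Workload3 with size 100 and deadline 5000.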
TCC_MEASUREMENTS_DUMP_FILE
File where measurements are logged after the program finishes. If the specified file does not exist, it will be created. If a file with the same name already exists, its contents will be overwritten. If the measurement library cannot open the specified file, nothing will be stored.
NOTE:
Use either implicit output (TCC_MEASUREMENTS_DUMP_FILE) or explicit output (“print” described in Post-process Analysis of Measurements). If you use both, you will get an empty dump file.
TCC_MEASUREMENTS_TIME_UNIT
Time unit that will be used to store the results. If the measurement library cannot parse TCC_MEASUREMENTS_TIME_UNIT, clk will be used. Options:
  • clk: CPU clock cycles
  • ns: nanoseconds
  • us: microseconds
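The fallback rule and the relationship between the three units can be sketched as follows. The names time_unit_t, parse_time_unit, and cycles_to_unit are hypothetical and are not the library's API, and the clock frequency is passed in as an assumed, known value rather than queried from the platform.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef enum { UNIT_CLK, UNIT_NS, UNIT_US } time_unit_t; /* hypothetical enum */

/* Maps the environment variable text to a unit.
 * Unset or unrecognized values fall back to clk, as described above. */
time_unit_t parse_time_unit(const char *value)
{
    if (value != NULL) {
        if (strcmp(value, "ns") == 0) {
            return UNIT_NS;
        }
        if (strcmp(value, "us") == 0) {
            return UNIT_US;
        }
    }
    return UNIT_CLK;
}

/* Converts a cycle count to the requested unit.
 * clk_hz is the clock frequency in Hz (assumed to be known by the caller). */
double cycles_to_unit(uint64_t cycles, time_unit_t unit, double clk_hz)
{
    assert(clk_hz > 0.0);
    switch (unit) {
    case UNIT_NS:
        return (double)cycles * 1e9 / clk_hz;
    case UNIT_US:
        return (double)cycles * 1e6 / clk_hz;
    default:
        return (double)cycles; /* clk: no conversion */
    }
}
```

For instance, with an assumed 2 GHz clock, 2000 cycles correspond to 1 microsecond or 1000 nanoseconds.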

Deadline Handling

Although the deadline value in TCC_MEASUREMENTS_BUFFERS is optional and is not used by the measurement library directly, it is useful when you want to compare the maximum latency of the measurement instance against various deadlines without recompiling your application.
In this case, specify the deadline via the TCC_MEASUREMENTS_BUFFERS environment variable. In your code, query the value using tcc_measurement_get_deadline_from_env() and then set this deadline value when calling tcc_measurement_set_deadline(). For details about tcc_measurement_set_deadline(), see Set a Measurement Deadline.
Example of reading the deadline value from the environment:
/* Extract the deadline settings from the environment variable */
uint64_t deadline, deadline_raw;
TCC_TIME_UNIT unit = tcc_measurement_get_time_unit_from_env();
deadline_raw = tcc_measurement_get_deadline_from_env(measurement_name);
deadline = tcc_measurement_convert_time_units_to_clock(deadline_raw, unit);
tcc_status = tcc_measurement_set_deadline(tcc_measurement_ptr, deadline, notify_deadline);

Examples of Controlling the Collector Behavior

The following examples show the behavior of the measurement library collector depending on the environment variables. The examples assume that you have instrumented your real-time application myapp with ITT APIs, and the application has two measurement instances: “Input” and “Compute.”
  • Example 1: One ring buffer in local memory
    The measurement library collector will create a local ring buffer for the measurement instance “Input.” The buffer can hold 1000 measurements. The collector will store the measurement results for the “Input” measurement instance in the buffer during the application run. Because no dump file is specified, the results are discarded when the application finishes.
    Command:
    INTEL_LIBITTNOTIFY64=libtcc_collector.so TCC_MEASUREMENTS_BUFFERS=Input:1000 ./myapp
    [Diagram: data flow of this scenario]
  • Example 2: Two ring buffers in local memory, store results to dump file on application finish
    The measurement library collector will create a local ring buffer for each measurement instance: “Input” and “Compute.” Each buffer can hold 1000 measurements. The collector will store the measurement results in the buffers during application run. After the application is finished, the last 1000 results of each measurement instance will be stored in the dump.txt file. The format is explained in Post-process Analysis of Measurements. The data will be stored in CPU clock cycles.
    Command:
    INTEL_LIBITTNOTIFY64=libtcc_collector.so TCC_MEASUREMENTS_DUMP_FILE=dump.txt TCC_MEASUREMENTS_BUFFERS=Input:1000,Compute:1000 ./myapp
    [Diagram: data flow of this scenario]
  • Example 3: Two ring buffers in shared memory, store results to dump file on application finish, measurements from shared buffers are read by monitoring application
    The measurement library collector will create a shared memory ring buffer for each measurement instance: “Input” and “Compute.” Each buffer can hold 10 measurements. The collector will store the measurement results in the buffers during application run. After the application is finished, the last 10 results of each measurement instance will be stored in the dump.txt file. The shared memory always keeps data in CPU clock cycles (raw measurements) while the output file (dump.txt) uses the specified time units. The deadline value does not influence collector behavior. You can decide how to use the value in your code.
    The monitoring application connects to the shared buffers and reads measurements from the shared buffers. You can decide how the monitoring application processes the values. For details about monitoring applications, see Monitor Measurements with an Application.
    Command:
    INTEL_LIBITTNOTIFY64=libtcc_collector.so TCC_USE_SHARED_MEMORY=true TCC_MEASUREMENTS_TIME_UNIT=us TCC_MEASUREMENTS_DUMP_FILE=dump.txt TCC_MEASUREMENTS_BUFFERS=Input:10:10,Compute:10:100 ./myapp
    [Diagram: data flow of this scenario]
