Developer Guide

  • 2021.1
  • 11/03/2021
  • Public

Analyze Measurements Offline

In addition to, or instead of, analyzing the measurement results directly in your workload (Analyze Measurements in Your Workload), it is often helpful to store the raw measurement results for offline analysis by a separate application.
In this scenario, you first run the real-time application until it finishes, and then analyze the results on the same or a different system while your real-time application is offline. This is in contrast to the online measurement of a running application described in Monitor Measurements with an Application.
The measurement library collector can be configured to store the measurement results automatically when the instrumented application closes. The environment variables that control the output are described in Control Data Collection.
The following diagram demonstrates the flow for this scenario:
The real-time application is instrumented with ITT APIs and linked against the ITT Notify static library (libittnotify64.a). At runtime, the static library reads the environment variable INTEL_LIBITTNOTIFY64 and loads the measurement library collector (libtcc_collector.so), a dynamic library. The measurement library collector initializes the structures for data collection and stores the latency measurements there.
When the application closes, the measurement library collector reads the environment variables TCC_MEASUREMENTS_DUMP_FILE and TCC_MEASUREMENTS_TIME_UNIT and saves the results according to the environment settings.
The libtcc.so shared library is linked by the measurement library collector to handle internal function calls.
The alternative approach is to access the measurement structure as described in Analyze Measurements in Your Workload and store the results manually. This can be useful if you want to store the results periodically to avoid creating huge buffers in your application or want to implement some other advanced handling of the measurement results.

Example Workflow

The following example shows how to store individual measurements as they are collected.
By analyzing the measurements, you can see the distribution of execution times. For example, if you observe a significant difference between the average and maximum latency measurements (in other words, high jitter), analyzing the distribution of measurements can help identify the source of the jitter. If there are only a few outliers, some irregular activity is interfering with the workload, possibly an interrupt handler. If the distribution is flatter, the variance is more likely coming from the application itself, such as different ratios of cache hits/misses between iterations. In this case, locking critical data with the help of the cache allocation library may help reduce the jitter.
To help you visualize the data, the measurement analysis sample can create a histogram of the stored measurements (see Measurement Analysis Sample for more information).
The example contains the following steps:
  1. Add the ITT APIs to your application as described in Instrument the Code.
  2. Get the pointer to the measurement structure and run additional operations on the measurement results. This example dumps the collected measurement results into a file for offline analysis.
    Parameters:
    • tcc_measurement_ptr: Pointer to the measurement structure.
    • out: Pointer to the output file. The dump_file is a standard C/C++ FILE* handle for a file opened for writing: dump_file = fopen(dump_file_name, "w").
    • timeunit: Time unit for printing measurement data.
    struct tcc_measurement* tcc_measurement_ptr;
    int tcc_sts;
    tcc_sts = tcc_measurement_get(domain, measurement, &tcc_measurement_ptr);
    if (tcc_sts == TCC_E_SUCCESS)
    {
        tcc_measurement_print_history(tcc_measurement_ptr, dump_file, TCC_TU_US);
    }
    Do not access the tcc_measurement structure after the main() function finishes (for example, in destructors of global variables).
The following diagram demonstrates the flow for this scenario:
Starting on the left side, the diagram shows that the real-time application is instrumented with ITT APIs and linked against the ITT Notify static library (libittnotify64.a). At runtime, the static library reads the environment variable INTEL_LIBITTNOTIFY64 and loads the measurement library collector (libtcc_collector.so), a dynamic library. The measurement library collector initializes the structures for data collection and stores the latency measurements there.
In addition, as shown on the right side of the diagram, the real-time application uses measurement library functions to access the data structures. In this case, the application is linked against the measurement library (libtcc_static.a), a static library. The measurement library reads the environment variable INTEL_LIBITTNOTIFY64 and loads the measurement library collector (libtcc_collector.so). As a result, the application can access the data structures created in the measurement library collector and can call the function to output the results (tcc_measurement_print_history()). Alternatively, the environment variable TCC_MEASUREMENTS_DUMP_FILE can be used to specify the location of the dump file, so the results are saved automatically when the application closes.
The libtcc.so shared library is linked by the measurement library collector and the real-time application (through libtcc_static.a) to handle internal function calls.

Print Options

The measurement library provides the following print options:
  • tcc_measurement_print() outputs summary statistics to the console. The output includes the measurement name, the number of iterations, and latency values in the selected time unit. Example output:
    [Cycle] Iterations 9; iteration duration [us]: avg=1504.456 min=1484.536 max=1866.007 jitter=381.470
  • tcc_measurement_print_summary() outputs the fields of the measurement structure in the selected time unit, along with the measurement history (measurement buffer content), in JSON format. The output goes to the provided FILE*, which can be the console or a file. If a buffer has not been allocated, the measurement history is an empty array; otherwise, it contains an array of latency values whose size is bounded by the measurement buffer size. When multiple measurement instances write to the same file, the file contains the corresponding number of JSON objects, but the collection of objects is not itself a valid JSON object. Example output (a JSON object) for a single measurement instance:
    {"measurement":"Cycle","time units":"clk","count":14859,"avg duration":537307.922,"min duration":513141.000,"max duration":606821.000,"delta":93680.000,"measurements_history": [538393, 520568, 518556, 579024, … , 513952, 566503]}
    Each JSON object can be parsed by any JSON parser.
  • tcc_measurement_print_history() outputs the measurement history in the selected time unit and clears the history afterward. The output goes to the provided FILE*. The output is a comma-separated list of values (the same as the measurements_history content in tcc_measurement_print_summary()). Example output:
    Approximation: 371199453, 2570, 2566, 2568, 2570
    You can print several measurement results to one file. Example code:
    tcc_measurement_print_history(tcc_measurement_ptr, dump_file, TCC_TU_US);
    tcc_measurement_print_history(tcc_measurement_ptr_2, dump_file, TCC_TU_US);
    tcc_measurement_print_history(tcc_measurement_ptr_3, dump_file, TCC_TU_US);
    Example output:
    Cycle: 4690, 2976, 3002, 2948, 3009
    Approximation: 2657, 2570, 2566, 2568, 2570
    Multiplication: 404, 51, 53, 55, 49
  • tcc_measurement_print_last_record() outputs the most recent measurement in the selected time unit in JSON format. The output goes to the provided FILE*. Example output:
    {"measurement":"Cycle","duration":537307.922}
Use either implicit output (TCC_MEASUREMENTS_DUMP_FILE, described in Control Data Collection) or explicit output (the print functions described above).

Product and Performance Information

Performance varies by use, configuration and other factors. Learn more at www.Intel.com/PerformanceIndex.