Developer Guide

  • 2021.2
  • 06/11/2021
  • Public

Run the Sample

You can run the sample to see how it behaves. To see how the sample behaves when deadlines are exceeded, use the --emulate-outliers (-o) option, which runs a heavier workload for about 20% of the iterations and therefore causes deadline violations.
To run this example:
  1. From your host system, connect to the target system:
    ssh <user>@<target>
  2. In the SSH session, run the sample with emulation of outliers:
    tcc_single_measurement_sample --approximation 10 --deadline 2000,ns --iterations 25 --emulate-outliers
    where:
    Option               Description
    --approximation 10   Calculates the 10th approximation of 2/pi.
    --deadline 2000,ns   Sets the maximum tolerable latency for each iteration to 2000 nanoseconds.
    --iterations 25      Executes 25 iterations of the main loop.
    --emulate-outliers   Enables emulation of outliers (about 20 percent of the iterations).
Output Example
Running with arguments: approximation = 10, iterations = 25, outliers = True, deadline = 2000 ns,
Latency exceeding deadline: 44820 CPU cycles (15988 nsec) (15 usec)
Latency exceeding deadline: 83950 CPU cycles (29947 nsec) (29 usec)
Latency exceeding deadline: 42502 CPU cycles (15161 nsec) (15 usec)
Latency exceeding deadline: 64778 CPU cycles (23108 nsec) (23 usec)
Latency exceeding deadline: 66416 CPU cycles (23692 nsec) (23 usec)
Approximation #10 is:0.636620
[Approximation] Iterations 25; iteration duration [ns]: avg=4433.000 min=134.000 max=29947.000 jitter=29813.000
Number of exceeding deadlines: 5 of 25
First, the sample outputs the arguments with which it was launched.
When the deadline is exceeded, a corresponding message appears. The message shows the duration value in CPU clock cycles and time units.
Next, the result of the workload is displayed – an approximation of the number 2/pi.
The measurement result is displayed next:
  • [Approximation] – name of the measurement instance specified in the __itt_string_handle_create call
  • Iterations 25 – number of measurements
  • Iteration duration [ns]:
    • avg=4433.000 – average latency of all iterations
    • min=134.000 – minimum latency (execution time of the fastest iteration)
    • max=29947.000 – maximum latency (execution time of the slowest iteration)
    • jitter=29813.000 – difference between maximum and minimum values
  • Number of exceeding deadlines: 5 of 25 – The number of iterations for which the deadline was exceeded. In this example, the deadline was exceeded 5 times out of 25 measurements.
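The relationships among the summary fields can be checked directly. The following Python sketch is an illustration only and is not part of the sample; it parses the summary line from the example output and confirms that jitter equals max minus min and that 5 violations out of 25 matches the roughly 20 percent outlier rate. All variable names are hypothetical.

```python
import re

# Summary line copied from the example output above.
summary = (
    "[Approximation] Iterations 25; iteration duration [ns]: "
    "avg=4433.000 min=134.000 max=29947.000 jitter=29813.000"
)

# Collect every key=value pair on the line into a dict of floats.
fields = {key: float(value) for key, value in re.findall(r"(\w+)=([\d.]+)", summary)}

# Jitter is described as the difference between the maximum and minimum values.
recomputed_jitter = fields["max"] - fields["min"]

# Share of iterations that exceeded the deadline ("5 of 25").
violation_share = 5 / 25

print(recomputed_jitter, violation_share)
```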
The specified deadline in nanoseconds is converted to CPU clock cycles for internal processing. The measurement result is displayed in the specified time units (nanoseconds in this example). For more information about time unit conversion, see Convert Measurement Units.
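To make the nanosecond-to-cycle relationship concrete, the sketch below estimates the cycle rate of the example system from the "Latency exceeding deadline" lines and applies it to the 2000 ns deadline. This is an inference from the printed numbers, assuming a constant cycle rate; it is not the conversion routine the tools use, and the names are hypothetical.

```python
import re

# Violation messages copied from the example output above.
violations = """\
Latency exceeding deadline: 44820 CPU cycles (15988 nsec) (15 usec)
Latency exceeding deadline: 83950 CPU cycles (29947 nsec) (29 usec)
Latency exceeding deadline: 42502 CPU cycles (15161 nsec) (15 usec)
Latency exceeding deadline: 64778 CPU cycles (23108 nsec) (23 usec)
Latency exceeding deadline: 66416 CPU cycles (23692 nsec) (23 usec)
"""

# Extract (cycles, nanoseconds) pairs from each message.
pairs = [
    (int(cycles), int(nsec))
    for cycles, nsec in re.findall(r"(\d+) CPU cycles \((\d+) nsec\)", violations)
]

# Assumption: the cycle counter runs at a constant rate, so total cycles
# divided by total nanoseconds approximates cycles per nanosecond.
cycles_per_ns = sum(c for c, _ in pairs) / sum(n for _, n in pairs)

# Approximate cycle count that corresponds to the 2000 ns deadline.
deadline_cycles = round(2000 * cycles_per_ns)

print(f"{cycles_per_ns:.3f} cycles/ns, deadline is about {deadline_cycles} cycles")
```

On the system that produced this output, the ratio works out to roughly 2.8 cycles per nanosecond; other systems will differ.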

Accessing Full Measurement Log

To get access to the full measurement results, you can enable a measurement buffer for the Approximation measurement instance and dump the data to a file.
To run this example:
  1. Run the following command to enable a measurement buffer and a dump file:
    TCC_MEASUREMENTS_BUFFERS=Approximation:25 TCC_MEASUREMENTS_DUMP_FILE=log.txt TCC_MEASUREMENTS_TIME_UNIT=ns tcc_single_measurement_sample --approximation 10 --deadline 2000,ns --iterations 25 --emulate-outliers
    where:
    Environment Variable                        Description
    TCC_MEASUREMENTS_BUFFERS=Approximation:25   Enable a measurement buffer for the Approximation measurement instance to hold 25 measurements (same as the number of tcc_single_measurement_sample iterations).
    TCC_MEASUREMENTS_DUMP_FILE=log.txt          Store the measurement buffer in log.txt after the program finishes.
    TCC_MEASUREMENTS_TIME_UNIT=ns               Use nanoseconds as the time unit for stored measurements.
  2. View the full measurement results after the sample finishes execution. Example dump file log.txt:
    # cat log.txt
    Approximation: 33918, 376, 205, 205, 206, 32140, 366, 230, 215, 205, 32011, 386, 213, 210, 207, 32041, 349, 205, 211, 214, 32047, 343, 208, 226, 213
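Once the dump file exists, the per-iteration values can be analyzed offline. The sketch below is a hypothetical post-processing script, not a tool shipped with the sample. It parses the example dump line, recomputes the summary statistics, and lists which iterations exceeded the 2000 ns deadline. It assumes a violation means a value strictly greater than the deadline.

```python
# Dump line copied from the example log.txt above (values are in nanoseconds
# because TCC_MEASUREMENTS_TIME_UNIT=ns was set).
dump_line = (
    "Approximation: 33918, 376, 205, 205, 206, 32140, 366, 230, 215, 205, "
    "32011, 386, 213, 210, 207, 32041, 349, 205, 211, 214, "
    "32047, 343, 208, 226, 213"
)

# Split "name: v1, v2, ..." into the instance name and a list of integers.
name, _, raw_values = dump_line.partition(":")
latencies_ns = [int(value) for value in raw_values.split(",")]

deadline_ns = 2000  # same deadline as passed with --deadline 2000,ns

# Assumption: an iteration violates the deadline when it is strictly slower.
violating_iterations = [i for i, v in enumerate(latencies_ns) if v > deadline_ns]

average = sum(latencies_ns) / len(latencies_ns)
jitter = max(latencies_ns) - min(latencies_ns)

print(name, len(latencies_ns), "measurements")
print("avg", average, "min", min(latencies_ns), "max", max(latencies_ns), "jitter", jitter)
print("iterations over deadline:", violating_iterations)
```

In this particular dump, the slow iterations fall on every fifth measurement, which is consistent with outliers being emulated for about 20 percent of the iterations.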

Buffer Size Influence

When TCC_MEASUREMENTS_BUFFERS and TCC_MEASUREMENTS_DUMP_FILE are used, data is stored in a ring buffer and dumped to the specified file.
If the buffer size specified in TCC_MEASUREMENTS_BUFFERS is greater than or equal to the number of iterations, all measured values are stored and written to the dump file.
If the buffer size specified in TCC_MEASUREMENTS_BUFFERS is smaller than the number of iterations, older values in the buffer are overwritten, and only the most recent values, as many as the buffer can hold, are written to the dump file. For example, if the number of iterations is 25 and the buffer size in TCC_MEASUREMENTS_BUFFERS is 3, only the last 3 measured values are written to the dump file.
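The overwrite behavior can be modeled with a bounded queue. The following sketch uses Python's collections.deque with maxlen as a stand-in for the ring buffer; it is a model of the behavior described here, not the actual buffer implementation, and dump_contents is a hypothetical helper. The input values are taken from the example log.txt.

```python
from collections import deque

# Measurements from the example dump file, in iteration order.
latencies_ns = [
    33918, 376, 205, 205, 206, 32140, 366, 230, 215, 205,
    32011, 386, 213, 210, 207, 32041, 349, 205, 211, 214,
    32047, 343, 208, 226, 213,
]

def dump_contents(values, buffer_size):
    """Return what remains in a ring buffer of buffer_size after all values are stored."""
    ring = deque(maxlen=buffer_size)  # oldest entries are dropped once full
    for value in values:
        ring.append(value)
    return list(ring)

# Buffer at least as large as the iteration count: nothing is lost.
full = dump_contents(latencies_ns, 25)

# Buffer of 3 with 25 iterations: only the last 3 values survive.
last_three = dump_contents(latencies_ns, 3)

print(len(full), last_three)
```

If the early iterations matter (for example, the first-iteration outlier), the model suggests sizing the buffer to at least the iteration count.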

Learn More

To learn more about the environment variables used in this sample, see Control Data Collection.
