Time Stamping
Intel® Trace Collector assigns a local time stamp to each event it records. A time
stamp consists of two parts which together guarantee that each time stamp is
unique:

Clock Tick
counts how often the timing source incremented since the start of
the run.

Event Counter
is incremented for each time stamp which happens to have the
same clock tick as the previous time stamp. In the unlikely situation that the event
counter overflows, Intel® Trace Collector artificially increments the clock tick.
When running Intel® Trace Collector with VERBOSE > 2,
it will print the maximum number of events on the same clock tick during the whole
application run. A non-zero number implies that the clock resolution was too low to
distinguish different events.

Both counters are stored in a 64-bit unsigned integer with the event counter in the
low-order bits. Legacy applications can still convert time stamps as found in a
trace file to seconds by multiplying the time stamp by the nominal clock period
defined in the trace file header. If the event counter is zero, this incurs no
error at all; otherwise the error is most likely still very small. The accurate
solution, however, is to shift the time stamp right by the number of event bits
specified in the trace header (thereby removing the event counter), and then to
multiply the result by the nominal clock period and by 2 to the power of event bits.
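
The two conversions described above can be sketched as follows. This is a minimal illustration, not Intel® Trace Collector code: the function names are hypothetical, and the default of 13 event bits is only an assumption (64 bits minus 51 clock-tick bits); a real reader should take the event bits and the nominal clock period from the trace file header.

```python
EVENT_BITS = 13  # assumption: 64-bit time stamp minus 51 clock-tick bits


def pack_timestamp(clock_tick, event_counter, event_bits=EVENT_BITS):
    """Combine both counters, with the event counter in the low-order bits."""
    if not 0 <= event_counter < (1 << event_bits):
        raise ValueError("event counter does not fit into the event bits")
    if not 0 <= clock_tick < (1 << (64 - event_bits)):
        raise ValueError("clock tick does not fit into the remaining bits")
    return (clock_tick << event_bits) | event_counter


def split_timestamp(timestamp, event_bits=EVENT_BITS):
    """Return (clock_tick, event_counter) for a 64-bit time stamp."""
    return timestamp >> event_bits, timestamp & ((1 << event_bits) - 1)


def legacy_seconds(timestamp, nominal_period):
    """Legacy conversion: the event counter shows up as a small error."""
    return timestamp * nominal_period


def accurate_seconds(timestamp, nominal_period, event_bits=EVENT_BITS):
    """Accurate conversion: drop the event counter, then rescale."""
    clock_tick = timestamp >> event_bits
    return clock_tick * nominal_period * (1 << event_bits)
```

With an event counter of zero both functions return the same value; otherwise the legacy result is larger by the event counter times the nominal clock period.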

Intel® Trace Collector uses 51 bits for clock ticks, which is large enough to count
2^51 ns, that is, more than 26 days, before the counter overflows.
At the same time, with a clock of only millisecond resolution, the remaining 13 bits
of the event counter can distinguish 8192 different events with the same clock tick,
which corresponds to events with a duration of about 0.1 μs.

Before writing the events into the global trace file, local time stamps are
replaced with global ones by modifying their clock tick. A situation where time
stamps with different local clock ticks fall on the same global clock tick is
avoided by ensuring that global clock ticks are always larger than local ones. The
nominal clock period in the trace file is chosen so that it is sufficiently small to
capture the offsets between nodes as well as the clock correction: both lead to
fractions of the real clock period, and rounding errors would be incurred if the
trace were stored with the real clock period. The real clock period might be hard to
determine exactly anyway. Also, the clock ticks are scaled so that the whole run
takes exactly as long as determined with gettimeofday() on the master process.
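
The scaling step can be pictured with a simple linear rescaling. The sketch below is only an illustration under stated assumptions: the function name and parameters are hypothetical, node offsets and clock correction are ignored, and the text does not describe the algorithm Intel® Trace Collector actually uses. It maps local clock ticks to ticks of a finer nominal clock period so that the run spans exactly the measured wall-clock duration.

```python
def rescale_ticks(local_ticks, wall_seconds, nominal_period):
    """Map ascending local clock ticks to global ticks (illustrative only).

    local_ticks    -- local clock ticks; first and last mark start and end of the run
    wall_seconds   -- run duration as measured on the master process
    nominal_period -- nominal clock period (in seconds) chosen for the trace file
    """
    first, last = local_ticks[0], local_ticks[-1]
    if last <= first:
        raise ValueError("need at least two distinct, ascending clock ticks")
    # Number of global ticks the whole run has to cover.
    target_span = wall_seconds / nominal_period
    factor = target_span / (last - first)
    return [round((tick - first) * factor) for tick in local_ticks]
```

Because the nominal clock period is chosen smaller than the real one, the scaling factor is greater than one, so distinct local ticks do not collapse onto the same global tick.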