Clock Synchronization
By default, Intel® Trace Collector synchronizes the different clocks at the start
and at the end of a program run by exchanging messages in a fashion similar to the
Network Time Protocol (NTP): one process is treated as the master and its clock
becomes the global clock of the whole application run. During clock synchronization,
the master process receives a message from a child process and replies by sending
its current time stamp. The child process then stores that time stamp together with
its own local send and receive time stamps. One message is exchanged with each
child, then the cycle starts again with the first child until SYNC-MAX-MESSAGES messages
have been exchanged between the master and each child or the total duration of the
synchronization exceeds SYNC-MAX-DURATION.
Intel® Trace Collector can handle timers which are already synchronized among all
processes on a node (SYNCED-HOST) and then only does the message exchange between
nodes. If the clock is even synchronized across the whole cluster (SYNCED-CLUSTER),
then no synchronization is done by Intel® Trace Collector at all.
The gathered data of one message exchange session is used by each child process to
calculate the offset between its clock and the master clock: it is assumed that the
transfer of messages with equal size takes equally long in both directions, so that the
average of the local send and receive times coincides with the master time stamp in the
middle of the message exchange. To reduce the noise, the 10% of message pairs with the
highest local round-trip time are ignored, because those are the ones that most
likely suffered from external delays, such as one of the two processes not being
scheduled in time to respond promptly.
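
The offset calculation described above can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the Intel® Trace Collector implementation: the `sync_sample` layout, the function names, the sign convention (master time minus local time), and rounding the discarded 10% down are all hypothetical choices made for the example.

```c
#include <assert.h>
#include <stdlib.h>

/* One message exchange as seen by a child process (hypothetical layout). */
typedef struct {
    double local_send;   /* child clock when the request was sent */
    double master_time;  /* master time stamp carried in the reply */
    double local_recv;   /* child clock when the reply arrived */
} sync_sample;

/* qsort comparator: ascending local round-trip time. */
static int by_round_trip(const void *a, const void *b)
{
    const sync_sample *x = a, *y = b;
    double rx = x->local_recv - x->local_send;
    double ry = y->local_recv - y->local_send;
    return (rx > ry) - (rx < ry);
}

/*
 * Estimate the offset (master clock minus local clock) from one session.
 * Sorts the samples in place, drops the 10% with the highest round-trip
 * time (rounded down, an assumption), and averages the rest.
 */
double estimate_offset(sync_sample *samples, size_t n)
{
    assert(n > 0);
    qsort(samples, n, sizeof *samples, by_round_trip);

    size_t keep = n - n / 10;  /* always at least 1 when n > 0 */
    double sum = 0.0;
    for (size_t i = 0; i < keep; i++) {
        /* Symmetric transfer assumption: the master stamped its reply at
           the midpoint of the local send and receive times. */
        double local_mid = 0.5 * (samples[i].local_send + samples[i].local_recv);
        sum += samples[i].master_time - local_mid;
    }
    return sum / (double)keep;
}
```

With ten samples, the single slowest exchange is discarded, so one delayed reply does not skew the estimate.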
With clock synchronization at the start and the end, Intel® Trace Collector clock
correction uses a linear transformation, that is, a scaling and shifting of the local
clock ticks, which is calculated by linear regression of all available sample
data. If the application also calls VT_timesync() during the run, then clock correction
is done with a piecewise interpolation: the data of each message exchange session is
condensed into one pair of local and master time by averaging all data points, then a
constrained spline is constructed which goes through all of the condensed points and
has a continuous first derivative at each of these joints.
VT_timesync
int VT_timesync(void)
Description
Gathers data needed for clock synchronization.
This is a collective call, so all processes which were started together must call
this function or it will block.
This function does not work if processes were spawned dynamically.
Fortran
VTTIMESYNC(ierr)
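
To illustrate how gathered samples can feed into clock correction, the linear transformation used when synchronization happens only at the start and the end of a run can be sketched as an ordinary least-squares fit. This is a hedged sketch, not the tool's actual code: the `clock_fit` type and both function names are hypothetical, and the samples are assumed to be pairs of local and master times with at least two distinct local times.

```c
#include <assert.h>
#include <stddef.h>

/* Linear clock model: master = scale * local + shift (hypothetical type). */
typedef struct {
    double scale;  /* master ticks per local tick */
    double shift;  /* master time corresponding to local time 0 */
} clock_fit;

/* Least-squares fit of master times against local times. */
clock_fit fit_linear_clock(const double *local, const double *master, size_t n)
{
    assert(n >= 2);

    double mean_l = 0.0, mean_m = 0.0;
    for (size_t i = 0; i < n; i++) {
        mean_l += local[i];
        mean_m += master[i];
    }
    mean_l /= (double)n;
    mean_m /= (double)n;

    double sxx = 0.0, sxy = 0.0;
    for (size_t i = 0; i < n; i++) {
        double dl = local[i] - mean_l;
        sxx += dl * dl;
        sxy += dl * (master[i] - mean_m);
    }
    assert(sxx > 0.0);  /* needs at least two distinct local times */

    clock_fit f;
    f.scale = sxy / sxx;
    f.shift = mean_m - f.scale * mean_l;
    return f;
}

/* Map a local time stamp onto the master (global) clock. */
double to_master_time(clock_fit f, double local_time)
{
    return f.scale * local_time + f.shift;
}
```

Samples from the start and the end of the run pin down both the drift (`scale`) and the initial offset (`shift`); any local time stamp in between can then be converted with `to_master_time`.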