Counting Clock Cycles
The count of cycles, also known as clock ticks, forms a fundamental basis for measuring how long a program takes to execute, and is also used in efficiency ratios such as cycles per instruction (CPI). On Intel(R) Pentium(R) 4 and Intel(R) Xeon(R) processors, some processor clocks may stop “ticking” under certain circumstances:
The processor is halted. For example, during I/O there may be nothing for the CPU to do while a disk read request is serviced, and the processor may halt to save power.
The processor is asleep, either as a result of being halted for a while, or as part of a power management scheme.
Note: There are different levels of sleep, and in the deeper sleep levels, the processor's time-stamp counter stops counting.
These are the three mechanisms for counting clock cycles for monitoring performance:
Non-halted Clockticks – cycles during which the specified logical processor is not halted
Non-sleep Clockticks – cycles during which the physical processor is not in any of the sleep modes
Time-stamp Counter – cycles during which the physical processor is not in deep sleep
The first two metrics use performance counters. The time-stamp counter is accessed via an instruction, RDTSC.
For applications with a significant amount of I/O, these ratios may be of interest:
Non-halted CPI: Non-halted clockticks/instructions retired measures the CPI for the phases where that CPU was being used.
Non-sleep CPI: Non-sleep clockticks/instructions retired measures the CPI for the phases where the physical package is not in any sleep mode.
Nominal CPI: Time-stamp counter ticks/instructions retired measures the CPI over the entire duration of the program, including those periods when the machine is halted while waiting for I/O.
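As a rough illustration of the three ratios above, the following sketch divides each cycle count by the number of instructions retired. The function name `cpi_ratios`, its parameter names, and the sample counts are all hypothetical; real values would come from the performance counters and the time-stamp counter.

```python
def cpi_ratios(non_halted_ticks, non_sleep_ticks, tsc_ticks, instructions_retired):
    """Return the three CPI ratios as a dict (hypothetical helper)."""
    if instructions_retired <= 0:
        raise ValueError("instructions_retired must be positive")
    return {
        # CPI for the phases where the logical processor was not halted
        "non_halted_cpi": non_halted_ticks / instructions_retired,
        # CPI for the phases where the physical package was not asleep
        "non_sleep_cpi": non_sleep_ticks / instructions_retired,
        # CPI over the whole run, including time halted waiting for I/O
        "nominal_cpi": tsc_ticks / instructions_retired,
    }


# Made-up counts for an I/O-heavy run (not measured data).
ratios = cpi_ratios(
    non_halted_ticks=2_000_000,
    non_sleep_ticks=2_500_000,
    tsc_ticks=10_000_000,
    instructions_retired=1_000_000,
)
for name, value in ratios.items():
    print(f"{name}: {value:.2f}")
```

With counts like these, a nominal CPI much larger than the non-halted CPI would suggest that most of the elapsed time was spent halted rather than executing instructions.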
The counts produced by the non-halted and non-sleep clockticks are equivalent in most cases if each physical package supports only one logical processor. On processors that support Hyper-Threading Technology, each physical package can support two or more logical processors. Current implementations of Hyper-Threading Technology provide two logical processors for each physical processor. Although the two logical processors can execute two threads simultaneously, one logical processor may be halted so that the other can execute without sharing execution resources between the two.
Non-halted clockticks can be qualified to count the number of processor clock cycles for each logical processor whenever that logical processor is not halted (the count may include some portion of the clock cycles that the logical processor needs to complete a transition into a halted state).
Non-sleep clockticks is based on the filtering mechanism in the CCCR; the count continues to increment as long as at least one logical processor is not halted.
The time-stamp counter increments whenever the sleep pin is not asserted or when the clock signal on the system bus is active. It can be read with the RDTSC instruction. The difference in values between two reads (modulo 2**64) gives the number of clock cycles between those reads.
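The modulo step matters only when the 64-bit counter wraps between the two reads. A minimal sketch of the subtraction, assuming the two RDTSC values have already been obtained as non-negative integers (the helper name `tsc_delta` and the sample values are hypothetical):

```python
TSC_MODULUS = 2**64  # the time-stamp counter is 64 bits wide


def tsc_delta(start, end):
    """Cycles elapsed between two time-stamp counter reads, modulo 2**64."""
    return (end - start) % TSC_MODULUS


# Ordinary case: the second read is larger than the first.
print(tsc_delta(100, 350))

# Wraparound case: the counter passed 2**64 - 1 between the reads.
print(tsc_delta(2**64 - 10, 5))
```

In a language with fixed-width unsigned 64-bit integers, plain subtraction gives the same result because the arithmetic already wraps modulo 2**64.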
The time-stamp counter and non-sleep clockticks counts should agree in practically all cases. However, it is possible to have both logical processors in a physical package halted, which results in most of the chip (including the performance monitoring hardware) being powered down. In this situation, it is possible for the time-stamp counter to continue incrementing because the clock signal on the system bus is still active, but non-sleep clockticks will no longer increment because the performance monitoring hardware is powered down.
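The relationship between the three counts on a package with two logical processors can be illustrated with a toy model. This is a deliberate simplification and not how the hardware is implemented: it assumes the system bus clock stays active for every modeled cycle, ignores deep sleep and halt-transition cycles, and treats "both logical processors halted" as the only condition that stops non-sleep clockticks. The function name `count_clockticks` and the sample traces are hypothetical.

```python
def count_clockticks(lp0_halted, lp1_halted):
    """Toy model: each argument holds one boolean per cycle (True = halted).

    Returns (non_halted_per_lp, non_sleep, tsc).
    """
    non_halted = [0, 0]  # one non-halted count per logical processor
    non_sleep = 0        # counts while at least one logical processor runs
    tsc = 0              # assumed to count every modeled cycle
    for h0, h1 in zip(lp0_halted, lp1_halted):
        tsc += 1
        if not h0:
            non_halted[0] += 1
        if not h1:
            non_halted[1] += 1
        if not (h0 and h1):
            non_sleep += 1
    return non_halted, non_sleep, tsc


# Six modeled cycles; both logical processors are halted in cycles 2 and 5.
lp0 = [False, False, True, True, False, True]
lp1 = [False, True, True, False, True, True]
non_halted, non_sleep, tsc = count_clockticks(lp0, lp1)
print(non_halted, non_sleep, tsc)  # prints: [3, 2] 4 6
```

In this model the time-stamp count exceeds the non-sleep count by exactly the number of cycles in which both logical processors were halted, which mirrors the divergence described above.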