Highly Accurate CPU Time Data Collection
Configure the Intel® VTune™ Profiler on Windows* OS to get highly accurate CPU time data in the user-mode sampling and tracing results.
By default, the VTune Profiler detects CPU time based on the OS scheduler tick granularity. As a result, the CPU time values may be inaccurate for targets that execute in short quanta less than the OS scheduler tick interval (for example, frame-by-frame computation in video decoders).
Accurate collection of CPU time information is available for the user-mode sampling and tracing analysis types (Hotspots and Threading) and is enabled by default in the predefined analysis configurations when you run both the VTune Profiler and the application you analyze with administrator privileges.
To collect more accurate CPU time information, the VTune Profiler uses the Event Tracing for Windows* (ETW) capability. For example, without ETW, a sample is taken every 10ms. For each sample, the OS is queried for the amount of time the thread executed, and the difference between consecutive samples is calculated, resulting in a delta. The information returned by the OS via this mechanism has a coarse granularity. The VTune Profiler totals the deltas and displays the result in the user interface. However, with ETW enabled, the VTune Profiler can filter out any time spent executing other threads and accurately calculate time for monitored threads within each 10ms sample based on the context switch information acquired from ETW. Based on this additional information, the CPU time metric calculated for the function/thread is more accurate.
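The per-sample calculation described above can be illustrated with a minimal sketch. This is not the VTune Profiler implementation: the function name `per_sample_cpu_time` and the representation of context switches as (switch-in, switch-out) pairs in milliseconds are assumptions made for illustration only.

```python
SAMPLE_MS = 10  # sampling interval quoted in the text


def per_sample_cpu_time(run_intervals, n_samples, sample_ms=SAMPLE_MS):
    """Return the exact run time (ms) of one thread inside each sampling window.

    run_intervals is a list of (switch_in_ms, switch_out_ms) pairs, a
    hypothetical stand-in for the context switch information from ETW.
    """
    totals = []
    for i in range(n_samples):
        win_start, win_end = i * sample_ms, (i + 1) * sample_ms
        busy = 0.0
        for switch_in, switch_out in run_intervals:
            # Count only the part of the run interval inside this window.
            overlap = min(switch_out, win_end) - max(switch_in, win_start)
            if overlap > 0:
                busy += overlap
        totals.append(busy)
    return totals


intervals = [(1.0, 3.5), (8.0, 12.0), (25.0, 26.0)]
windows = per_sample_cpu_time(intervals, n_samples=3)
print(windows)       # [4.5, 2.0, 1.0]
print(sum(windows))  # 7.5
```

In this model, time spent by other threads never enters the total, because only the intervals in which the monitored thread was switched in are counted.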
The VTune Profiler needs exclusive access to the Microsoft* NT Kernel Logger. Therefore, only one VTune Profiler collection can run in this mode on the system and no other tools can use the service. If the VTune Profiler cannot get access to the NT Kernel Logger, the collection will continue with this mode disabled.
This type of collection takes more processing time and disk space. The VTune Profiler may generate up to 5 MB of temporary data per minute per logical CPU depending on the system configuration and the profiled target.
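As a rough planning aid, the upper bound quoted above can be turned into a disk space estimate. The helper name `max_temp_data_mb` is hypothetical, and the actual amount depends on the system configuration and the profiled target, so treat the result as a ceiling rather than a prediction.

```python
MAX_MB_PER_MIN_PER_CPU = 5  # upper bound quoted in the text


def max_temp_data_mb(duration_minutes, logical_cpus, rate=MAX_MB_PER_MIN_PER_CPU):
    """Upper-bound estimate of temporary data (MB) for one collection."""
    return duration_minutes * logical_cpus * rate


# A 10-minute collection on a system with 16 logical CPUs.
print(max_temp_data_mb(10, 16))  # 800
```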
Whether to enable or disable accurate CPU time collection depends on what is executing on the system during data collection and on the structure of your application. In some cases, there may be only about a 3% variation between "normal" and "highly accurate" CPU time. However, there are corner cases where the difference could be as high as 30% or 40%. If a thread is executing most of the time but happens to be inactive at every 10ms point where a sample is taken without ETW, the results grossly misrepresent its execution time. Conversely, if a thread is mostly inactive but runs exactly at the frequency of the 10ms samples, it may appear to consume large amounts of time when in reality it does not.
The best approach is to test it yourself, if possible. That is, collect the Basic Hotspots data with and without this option and compare the results. This can tell you whether running without the highly accurate CPU time option produces results accurate enough to direct your optimization efforts, or whether you need administrative privileges so that you can enable this option. However, if your corporation's policies restrict you from using highly accurate CPU time, you can, in general, be confident that analysis of your application's performance is valid using "normal" Hotspots data collection.
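The two corner cases can be reproduced with a simplified model. The sketch below is an assumption-laden illustration, not the VTune Profiler algorithm: it models a thread that runs for `burst_ms` out of every `period_ms`, and a sampler that charges a full 10ms interval whenever the thread is running at the sample point. All function names are hypothetical.

```python
SAMPLE_MS = 10  # sampling interval quoted in the text


def is_running(t, period_ms, offset_ms, burst_ms):
    """True if the periodic thread is on the CPU at time t (ms)."""
    return (t - offset_ms) % period_ms < burst_ms


def sampled_estimate(duration_ms, period_ms, offset_ms, burst_ms, sample_ms=SAMPLE_MS):
    """Charge one full sampling interval per sample point that hits the thread."""
    hits = sum(
        1
        for t in range(0, duration_ms, sample_ms)
        if is_running(t, period_ms, offset_ms, burst_ms)
    )
    return hits * sample_ms


def true_time(duration_ms, period_ms, burst_ms):
    """Exact run time, assuming duration_ms is a multiple of period_ms."""
    return (duration_ms // period_ms) * burst_ms


# Busy 9 ms out of every 10 ms, but idle at each sample point.
print(sampled_estimate(1000, 10, 1, 9), true_time(1000, 10, 9))  # 0 900

# Busy 1 ms out of every 10 ms, exactly at each sample point.
print(sampled_estimate(1000, 10, 0, 1), true_time(1000, 10, 1))  # 1000 100
```

Real workloads rarely align this precisely with the sampling interval, which is consistent with the small variation seen in typical cases.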
To disable highly accurate CPU time collection for custom analysis:
- Create a new custom analysis (based on an existing configuration such as Hotspots or Threading).
- Deselect the Collect highly accurate CPU time option.