Anomaly Detection View
- Code regions of interest
- Information about regions where simulations executed faster or slower than normal
Load Details for Slow Region
- Switch to the Bottom-up window.
- Group results by Code Region of Interest / Duration Type.
- To further examine the outliers in the Slow region, right-click the Slow field and select Load Intel Processor Data by Selection.
Compare Processor Trace Details
Instructions Retired, Call Count, Total Iteration Count
Control flow metrics.
Instructions Retired refers to the number of entries into a kernel.
CPU Time (Kernel and User)
Active time on the CPU.
Wait Time, Inactive Time
Duration for which a thread was idle because of synchronization or preemption.
Elapsed Time
Latency (wall-clock time of the code region execution).
- Context Switch Anomaly
- Kernel-Induced Anomaly
- Frequency Drops
- Control Flow Deviation Anomaly
Context Switch Anomaly
- In the Intel Processor Trace Details window, check the Inactive Time and Wait Time metrics. Wait Time indicates the duration for which a thread was idle due to synchronization issues.
- If the metrics are zero, the application had no context switches. Proceed to check for a different type of anomaly.
- If the metrics are non-zero, continue with this procedure to check for context switches.
- Sort the data by Wait Time.
- For instances with significant Wait Time, compare the Wait Time with the Elapsed Time. If the thread was idle for a considerable portion of the elapsed time, the slowdown was caused by a context switch related to synchronization. In this example, thread 25883 was idle for 1.269 out of 1.318 milliseconds, which is about 96% of the time.
- Expand the instance to drill down to a function or a stack. Identify the stack(s) that brought the thread to an idle state.
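The Wait Time comparison in this procedure is a simple ratio. The following is a minimal sketch, assuming the per-thread values have been copied out of the grid by hand. The `ThreadSample` type, its field names, and the 50% threshold are illustrative assumptions and are not part of the profiler.

```python
from dataclasses import dataclass


@dataclass
class ThreadSample:
    """Hypothetical record for one grid row; times are in milliseconds."""
    thread_id: int
    wait_time_ms: float
    elapsed_time_ms: float


def wait_share(sample: ThreadSample) -> float:
    """Fraction of the elapsed time that the thread spent waiting."""
    if sample.elapsed_time_ms <= 0:
        return 0.0
    return sample.wait_time_ms / sample.elapsed_time_ms


def mostly_idle(samples, threshold=0.5):
    """Return samples whose wait share meets the assumed threshold, largest first."""
    flagged = [s for s in samples if wait_share(s) >= threshold]
    return sorted(flagged, key=wait_share, reverse=True)


# Values for thread 25883 taken from the example in the text.
sample = ThreadSample(thread_id=25883, wait_time_ms=1.269, elapsed_time_ms=1.318)
print(f"thread {sample.thread_id}: {wait_share(sample):.1%} of elapsed time idle")
```

The threshold is a judgment call; pick a value that matches what counts as "a considerable portion" for your workload.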
Kernel-Induced Anomaly
- In the Intel Processor Trace Details window, sort the data by Kernel Time. The topmost element of the stack points to the entry point into the kernel. Where the ratio of Kernel Time to Elapsed Time is high, a significant amount of time was spent in the kernel. In this example, 566 out of 997 microseconds (about 57%) were spent in the kernel for the highlighted thread.
- Expand the thread to see contributing stacks that could be responsible for long kernel times.
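The kernel-time check can be sketched the same way: divide Kernel Time by Elapsed Time and rank rows by the result. This is an illustrative sketch only. The function names and the tuple layout `(label, kernel_us, elapsed_us)` are assumptions, and the values must be copied from the grid manually.

```python
def kernel_share(kernel_time_us: float, elapsed_time_us: float) -> float:
    """Fraction of the elapsed time that was spent in the kernel."""
    if elapsed_time_us <= 0:
        raise ValueError("elapsed time must be positive")
    return kernel_time_us / elapsed_time_us


def rank_by_kernel_share(rows):
    """Sort (label, kernel_us, elapsed_us) tuples by kernel share, highest first."""
    return sorted(rows, key=lambda row: kernel_share(row[1], row[2]), reverse=True)


# Values for the highlighted thread from the example in the text.
share = kernel_share(566, 997)
print(f"{share:.1%} of elapsed time spent in the kernel")
```

Rows near the top of the ranking are the first candidates to expand when looking for the stacks behind long kernel times.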
Frequency Drops
- Bottom-up window: Shows frequency information for the entire application.
- Intel Processor Trace Details window: Shows frequency information only for the loaded region.
Possible causes of frequency drops:
- Intel® Advanced Vector Extensions (Intel® AVX) instructions are used inside or outside a loaded code region.
- There are underlying hardware issues, such as cooling.
- Apart from your application, low activity on the core and in the OS can also cause frequency drops. Look for high values of Inactive Time or Wait Time.
Control Flow Deviation Anomaly
- Select a node in the grid where you see a high value for Instructions Retired.
- Right-click and select Filter In by Selection from the context menu.
- Switch to the Caller/Callee window. In the flat profile view, you can see functions annotated with Self and Total CPU Times. The caller view shows the callers of the selected function in a bottom-up representation. The callee view shows a call tree from the selected function in a top-down representation.
- Check the Total Iteration Count to compare the number of loop iterations between a fast and a slow iteration.
- If the slower iteration has a higher iteration count, switch to the Source Assembly view and examine the source code of the function.
- Check to see if the slower iteration passed the validation of the cached element.
- Increase the cache size.
- Update cache data and repeat the analysis.
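The Total Iteration Count comparison in this procedure can be sketched as a diff between two sets of per-function counts. This is a minimal sketch under stated assumptions: the counts are copied manually from the Caller/Callee window into dictionaries, and the function name `iteration_deviations`, the `min_ratio` parameter, and the placeholder names and numbers below are hypothetical.

```python
def iteration_deviations(fast, slow, min_ratio=1.0):
    """Return (function, fast_count, slow_count) tuples for functions whose
    loop iteration count in the slow iteration exceeds min_ratio times the
    count in the fast iteration, ordered by the number of extra iterations."""
    deviations = []
    for name, slow_count in slow.items():
        fast_count = fast.get(name, 0)
        if slow_count > fast_count * min_ratio:
            deviations.append((name, fast_count, slow_count))
    deviations.sort(key=lambda d: d[2] - d[1], reverse=True)
    return deviations


# Placeholder counts for illustration only; these are not measurements.
fast_counts = {"lookup": 100, "render": 40}
slow_counts = {"lookup": 900, "render": 40}

for name, fast_count, slow_count in iteration_deviations(fast_counts, slow_counts):
    print(f"{name}: {fast_count} iterations (fast) vs {slow_count} (slow)")
```

Functions reported here are the ones worth opening in the source view first.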