Memory Usage View
Analyze Topology, Memory, and Cross-Socket Bandwidth
- All client platforms
- Server platforms based on Intel® microarchitecture code name Skylake, with up to four sockets.
View Performance Metrics by Memory Objects (Linux* targets only)
- Dynamic memory objects are allocated on the heap using malloc, new, and similar functions. Such objects are identified by the line where the allocation happened; for example, the source line where the malloc function was called.
- Global objects are global or static variables. Such objects are identified by the module and variable name, for example: libiomp5.so!_kmp_avail_proc (4B), where 4B is the allocation size.
- Stack objects are local variables. VTune Profiler does not recognize individual variables, so all references to stack memory are associated with one memory object named [Stack].
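The identification rules above can be sketched in a few lines of Python. This is only an illustrative model, not how VTune Profiler is implemented: the function names, the sample tuples, the file:line label for heap objects, and the names solver.c, app, and lookup_table are all assumptions made for the example. It shows that two allocations made at the same source line end up as one dynamic object, and that every stack reference lands in the single [Stack] object.

```python
from collections import defaultdict

# Every reference to stack memory is attributed to this one object.
STACK_LABEL = "[Stack]"


def object_label(kind, **info):
    """Return an illustrative display label for a memory object (hypothetical format)."""
    if kind == "dynamic":
        # Heap objects are identified by the allocation site (source line).
        return f"{info['file']}:{info['line']}"
    if kind == "global":
        # Globals and statics are identified as module!variable (size).
        return f"{info['module']}!{info['name']} ({info['size']}B)"
    if kind == "stack":
        return STACK_LABEL
    raise ValueError(f"unknown memory object kind: {kind}")


def loads_by_object(samples):
    """Sum load counts per memory object label."""
    totals = defaultdict(int)
    for kind, info, loads in samples:
        totals[object_label(kind, **info)] += loads
    return dict(totals)


# Hypothetical samples: (object kind, identifying attributes, load count).
samples = [
    ("dynamic", {"file": "solver.c", "line": 42}, 1200),
    ("dynamic", {"file": "solver.c", "line": 42}, 300),
    ("global", {"module": "app", "name": "lookup_table", "size": 4096}, 50),
    ("stack", {}, 10),
    ("stack", {}, 5),
]

print(loads_by_object(samples))
```

Because stack accesses collapse into one object, a hot local array cannot be told apart from other locals in this view; moving it to a heap allocation or a static variable would give it its own row.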
Identify Code Sections and Memory Objects Inducing Bandwidth
Analyze Bandwidth Issues Over Time
Identify Code and Memory Objects with NUMA Issues
- The Memory Bound > DRAM Bound > Local DRAM metric shows the fraction of cycles the CPU was stalled waiting for memory loads from local memory.
- The Memory Bound > DRAM Bound > Remote DRAM metric shows the fraction of cycles the CPU was stalled waiting for memory loads from remote memory.
- The Memory Bound > DRAM Bound > Remote Cache metric shows the fraction of cycles the CPU was stalled waiting for memory loads from the remote socket cache.
- The LLC Miss Count > Local DRAM Access Count, LLC Miss Count > Remote DRAM Access Count, and LLC Miss Count > Remote Cache Access Count metrics show the number of accesses to local memory, remote memory, and remote cache, respectively.
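One way to compare memory objects for NUMA issues is to turn the three access counts into a remote share. The sketch below is an assumption-laden illustration: the remote share is a figure derived here for convenience, not a metric reported by the tool, and the function names, object labels, and counts are hypothetical values standing in for numbers read from the grid.

```python
def remote_share(local_dram, remote_dram, remote_cache):
    """Fraction of LLC-miss accesses served by the remote socket (DRAM or cache)."""
    total = local_dram + remote_dram + remote_cache
    if total == 0:
        return 0.0  # no LLC misses recorded for this object
    return (remote_dram + remote_cache) / total


def rank_by_remote_share(objects):
    """Sort (label, local, remote_dram, remote_cache) rows, most remote-heavy first."""
    return sorted(objects, key=lambda row: remote_share(*row[1:]), reverse=True)


# Hypothetical rows: label, Local DRAM, Remote DRAM, Remote Cache access counts.
rows = [
    ("solver.c:42", 600, 300, 100),
    ("app!lookup_table (4096B)", 900, 50, 50),
    ("[Stack]", 1000, 0, 0),
]

for label, local, rdram, rcache in rank_by_remote_share(rows):
    print(f"{label}: {remote_share(local, rdram, rcache):.0%} remote")
```

A high remote share on an object with many accesses suggests that the object is allocated on one socket and used mostly from another. The share alone does not say how much time is lost, so it is best read alongside the Remote DRAM and Remote Cache stall metrics.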
- Select the ../Function / Memory Object /.. grouping level (the Function granularity should precede the Memory Object granularity) in the Bottom-up window.
- Expand a function and double-click a memory object under this function. The Source/Assembly window opens, displaying metrics for the source lines of the function where accesses to the selected memory object happened.