Use the Time Line Viewer pane to view and configure the time line tracks of the traced processes. The pane contains the following tracks:
- CPU tracks
Show aggregated CPU activity, reflecting thread execution on CPU cores. Use these tracks to analyze thread execution order, execution duration, and distribution between CPU cores. Blocks of the same color represent the same thread. If you zoom into the trace, you can see the name of the process that the thread belongs to and the thread execution duration.
- GPU queues
Show GPU queues for all active video adapters generating graphics content. GPU queues can be of different types depending on the application. The Time Line Viewer pane visualizes GPU utilization over time, that is, the execution of DMA packages on the GPU. The color of a DMA package corresponds to the color of the thread from which the package was submitted. All DMA packages have names that are visible on mouse hover. Additionally, DMA packages that are essential for analysis are marked with different icons depending on their type:
- Render package. Render package with a present call is crosshatched.
- Signal package
- Wait package
- Paging package
Selecting any of these packages shows an arrow that reveals calls in CPU threads related to that package. For example, you can trace the origin of a Render package from the CPU thread, to the User-Mode Driver, and up to the hardware queue.
- Flip queues
Show flip queues for all active video adapters. A flip queue reflects the relationship between the application present calls, the present packages of GPU/CPU queues, and the Vertical Synchronization (VSync) events of the monitor. A flip queue package consists of two blocks: solid-color and crosshatched. The solid-color block shows the time during which Desktop Window Manager (DWM) is generating the content that needs to be displayed. The crosshatched block shows the time during which the content waits for the moment to be displayed (VSync).
For applications utilizing layered flip queues, multiple layers can be displayed as sub-tracks of the general Flip Queue track.
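The split between the solid-color and crosshatched blocks of a flip queue package can be illustrated with a small timing model. This is only a sketch and not part of the tool: it assumes a fixed refresh rate, VSync events at integer multiples of the refresh period, and display at the first VSync at or after the content is ready. The function name `split_flip_package` and its parameters are hypothetical.

```python
import math

def split_flip_package(start_ms, ready_ms, refresh_hz=60.0):
    """Return (solid_ms, crosshatched_ms, vsync_ms) for one flip queue package.

    Assumptions (illustrative only): VSync events occur at integer multiples
    of the refresh period, and content is displayed at the first VSync at or
    after the moment it is ready.
    """
    if ready_ms < start_ms:
        raise ValueError("content cannot be ready before generation starts")
    period_ms = 1000.0 / refresh_hz
    # First VSync at or after the moment the content is ready.
    vsync_ms = math.ceil(ready_ms / period_ms) * period_ms
    solid_ms = ready_ms - start_ms          # time spent generating content
    crosshatched_ms = vsync_ms - ready_ms   # time spent waiting for VSync
    return solid_ms, crosshatched_ms, vsync_ms

solid, waiting, vsync = split_flip_package(start_ms=2.0, ready_ms=6.0, refresh_hz=50.0)
print(f"generating: {solid:.1f} ms, waiting for VSync: {waiting:.1f} ms")
```

In this model, a long crosshatched block means the content was ready well before the next VSync, while a block of zero length means the content became ready exactly at a VSync.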
- CPU queues
Show CPU queues for all processes generating graphics content. A CPU queue represents ordered command packages that are waiting to be executed but have not yet been submitted to the GPU. CPU queues can be of different types depending on the application. The color of a CPU queue package corresponds to the color of the thread from which the package was submitted. All CPU packages have names that are visible on mouse hover. Additionally, CPU packages that are essential for analysis are marked with different icons depending on their type:
- Render package. Render package with a present call is hatched for packages named Present Token and crosshatched for packages named Present Render.
- Signal package
- Wait package
- Paging package
CPU synchronization is represented as WaitForSingleObject, WaitForMultipleObjects, SetEvent, and ReleaseSemaphore function calls on the time line track of the threads. You can use these captured events to profile different synchronization issues.
Click on any of these events to visualize dependencies between synchronization events in the form of arrows. For example, an arrow pointing from a SetEvent call in one thread to a WaitForSingleObject call in a different thread indicates that a thread was unblocked by a SetEvent call in another thread.
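The dependency that such an arrow expresses can be reproduced in a few lines. The sketch below uses Python's `threading.Event` as a stand-in for a Win32 event object: `event.set()` plays the role of `SetEvent` and `event.wait()` the role of `WaitForSingleObject`. This is an analogy for illustration only; the thread names and the `record` helper are hypothetical, and the sketch makes no claim about how a Python program appears in a trace.

```python
import threading
import time

event = threading.Event()   # stand-in for a Win32 event object
timeline = []               # (thread name, action) in the order actions happen
lock = threading.Lock()

def record(action):
    """Append an action of the current thread to the shared timeline."""
    with lock:
        timeline.append((threading.current_thread().name, action))

def consumer():
    record("wait begin")
    event.wait()            # blocks until another thread signals the event
    record("wait end")

def producer():
    time.sleep(0.05)        # simulate work before the result is ready
    record("set")
    event.set()             # unblocks the waiting consumer thread

threads = [
    threading.Thread(target=consumer, name="consumer"),
    threading.Thread(target=producer, name="producer"),
]
for t in threads:
    t.start()
for t in threads:
    t.join()

for name, action in timeline:
    print(f"{name}: {action}")
```

The `set` entry of the producer always precedes the `wait end` entry of the consumer, which is the relationship an arrow between the two calls would indicate.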
On CPU queues tracks, areas where a thread was active are highlighted as green bars. Areas where a thread was idle are shown as gray bars.
A thread is considered active, and is highlighted accordingly, while it is executing on a CPU core.
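One way to picture how active and idle bars relate to thread execution is to derive them from a list of execution slices. The sketch below is an assumption about the general idea, not a description of how the tool computes its bars; the function `active_idle_bars` and the input format are hypothetical.

```python
def active_idle_bars(slices, trace_start, trace_end):
    """Split [trace_start, trace_end] into ("active" | "idle", start, end) bars.

    `slices` is a list of (start, end) intervals during which the thread was
    executing on a CPU core. Overlapping or touching slices are merged.
    """
    bars = []
    cursor = trace_start  # everything before the cursor is already classified
    for start, end in sorted(slices):
        start = max(start, cursor)   # clip to the part not yet classified
        end = min(end, trace_end)
        if start >= end:
            continue                 # nothing new in this slice
        if start > cursor:
            bars.append(("idle", cursor, start))
        if bars and bars[-1][0] == "active" and bars[-1][2] == start:
            # Extend the previous active bar instead of starting a new one.
            bars[-1] = ("active", bars[-1][1], end)
        else:
            bars.append(("active", start, end))
        cursor = end
    if cursor < trace_end:
        bars.append(("idle", cursor, trace_end))
    return bars

for kind, start, end in active_idle_bars([(2, 5), (4, 7), (9, 10)], 0, 12):
    print(f"{kind:6s} {start:>2} .. {end}")
```

Here the overlapping slices (2, 5) and (4, 7) are reported as a single active bar from 2 to 7, and the gaps around the slices are reported as idle bars.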
To see OpenCL™ API calls on CPU tracks, enable the OpenCL domain in Options.
Shows events as markers, which have a timestamp but no duration. The scope of a marker can be global or process-defined. Markers are visualized as colored triangles. Global markers are placed on the Time Line ruler; process-defined markers are shown on executed threads.
Shows any GPU/CPU metrics that you enabled in System Analyzer or System Analyzer HUD.
Shows regions, which are logical blocks of application execution defined with the Instrumentation and Tracing Technology API (ITT API) or a graphics debug API. By default, block color corresponds to the color of the time line track. Each block has a name and duration.
- Threads track
Shows executed threads of profiled processes. Use this data to analyze your application performance and behavior based on ITT API and system events. Each colored block represents a logical block of application execution marked up by the user or generated by the system. Each block has a name and duration, and can have nested blocks. By default, block color corresponds to the color of the time line track.
- Parallel Execution track
Parallel Execution track visualizes how the driver parallelizes execution of submitted render events.
- OpenCL Execution tracks
OpenCL Execution tracks visualize execution of OpenCL kernels on a GPU or a CPU. To see the dependency between the tasks of submitting and executing a particular kernel, click any OpenCL packet or OpenCL API call on the CPU Thread track.
The data is useful to spot synchronization issues or understand whether there is a problem in OpenCL code if you use different APIs. For example, if OpenCL Execution track is fully loaded, you can detect a problematic kernel for detailed profiling with Intel® VTune™ Profiler.
To see OpenCL Execution tracks, enable OpenCL domain in Options.
- Shader Breakdown tracks (Windows only)
Shader Breakdown tracks visualize workload distribution among shaders and asynchronous compute work for floating-point unit and extended math pipes.
Shader Breakdown track contains two types of bars:
- Bars on the top show the percentage of time when the Execution Unit (EU) was performing asynchronous compute work from the compute queue.
- Stacked bars show the percentage of time per shader when the EU was executing shader instructions from the main graphics queue.
The track contains the data for the following types of shaders:
- Compute shader (CS)
- Vertex shader (VS)
- Pixel shader (PS)
- Other shaders, for example, mesh or ray tracing shaders
Hover over the bar to see the percentage above the track.
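The percentages shown in the stacked bars can be thought of as each shader type's share of a sampling interval. The sketch below illustrates that arithmetic with made-up numbers. The function `shader_breakdown`, the input format, and the sample values are assumptions for illustration and do not describe how the tool collects or computes its data.

```python
def shader_breakdown(busy_us, interval_us):
    """Return the percentage of a sampling interval spent in each shader type.

    `busy_us` maps a shader type to the time, in microseconds, during which
    the EU was executing instructions of that type within the interval.
    """
    if interval_us <= 0:
        raise ValueError("interval_us must be positive")
    return {shader: 100.0 * t / interval_us for shader, t in busy_us.items()}

# Made-up EU busy times, in microseconds, within a 1000-microsecond interval.
sample = {"CS": 100, "VS": 150, "PS": 500, "Other": 50}
stacked = shader_breakdown(sample, interval_us=1000)

for shader, percent in stacked.items():
    print(f"{shader:5s} {percent:5.1f}%")
print(f"total {sum(stacked.values()):5.1f}%")
```

In this example the stacked bar reaches 80% in total, with the pixel shader contributing the largest segment.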
To see Shader Breakdown tracks, enable the AsyncCompute metric set and select any GPU sampling interval except Frame in Options.
- For the 9th through 11th generation Intel Core™ processor families, only shader workload distribution data is available in Shader Breakdown tracks. Asynchronous compute data is available starting with the 12th generation Intel Core™ processor family.
- Families of Intel® Xe graphics products starting with Intel® Arc™ Alchemist (formerly DG2) and newer generations feature GPU architecture terminology that shifts from legacy terms. For more information on the terminology changes and to understand their mapping with legacy content, see GPU Architecture Terminology for Intel® Xe Graphics.