Creating a game that performs well across multiple platforms with different hardware specifications can be a daunting task. Multiple variables can cause performance issues that may lead to unsatisfactory gameplay.
This document is a high-level overview aimed at helping game developers quickly get proficient with Intel® Graphics Performance Analyzers (Intel® GPA). We explain the basics, such as capturing a trace, stream, or frame, and determining if your game is GPU- or CPU-bound. We also provide additional links to help you go deeper into the tools.
Terminology and Tools
- Graphics Monitor: The hub tool for the Intel GPA analysis tools. Allows you to configure capture options for capturing a trace, stream, or frame.
- Graphics Trace Analyzer: Used to identify problems with how GPU and CPU resources and application data are distributed between functions. Visualizes all types of activity in your game, from CPU thread activity to GPU calls and execution.
- Graphics Frame Analyzer: Used to explore a captured frame and understand the performance impact of specific API calls at various stages of the rendering pipeline. You can use it to decrease frame-rendering time, investigate draw-call issues, and understand how those issues affect frames per second (FPS).
- CPU-Bound: When the CPU is busy all the time and the GPU has idle spots, you are bound by the CPU. The GPU cannot do more work because the CPU is too busy to assign it more work in that frame or frames.
- GPU-Bound: Conversely, if the GPU is busy all the time and the CPU has idle spots, then the GPU is holding you back. Optimize the work on the GPU so the CPU can assign more work to the GPU.
- Trace: When you capture a trace, you have a record of activity on both the CPU and the GPU during application execution, which you can open in Graphics Trace Analyzer. The default is five seconds of data capture.
- Frame: A single frame with its associated resources, textures, shaders, buffers, and other data.
- Stream: A collection of captured frames.
Begin Performance Analysis by Following These Steps
Step 1: Capture a trace with Graphics Monitor. The trace shows dependencies between CPU calls and GPU packets, among other things.
Step 2: To determine if your game is GPU-bound or CPU-bound, analyze the trace with Graphics Trace Analyzer.
Step 3: If you are GPU-bound, use Graphics Monitor to capture a stream. This gives you the in-depth frame details.
Step 4: Analyze your captured stream or frame with Graphics Frame Analyzer.
Step 5: If your game is CPU-bound, use Intel® VTune™ Profiler to begin analysis of CPU bottlenecks.
Typically, you know where significant performance hits reside in your game. If you are not sure, you can use game engine tools that display FPS counters, or use the FPS counter built into the capture window heads-up display (HUD) in Graphics Monitor.
First, find the areas in your game where your frame rate drops or where you want to improve visual fidelity, and capture those areas using Graphics Monitor. After that, look at the visualizations of GPU and CPU activity in Graphics Trace Analyzer. Once you identify where the biggest issue lies, use Graphics Frame Analyzer to dig into the details of GPU bottlenecks, or use Intel VTune Profiler to dig into CPU bottlenecks.
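If your engine can log per-frame times, one way to locate the frame-rate drops mentioned above is to flag runs of consecutive frames that exceed the frame-time budget. The following is a minimal sketch, not part of Intel GPA: the function name `find_fps_drops`, the 60 FPS target, the `min_run` noise filter, and the sample frame times are all assumptions chosen for illustration.

```python
from typing import List, Tuple


def find_fps_drops(frame_times_ms: List[float], target_fps: float = 60.0,
                   min_run: int = 3) -> List[Tuple[int, int]]:
    """Return (start, end) frame-index ranges where frames run over budget.

    `end` is exclusive. Runs shorter than `min_run` frames are ignored as noise.
    """
    budget_ms = 1000.0 / target_fps  # time available per frame at the target FPS
    drops = []
    run_start = None  # index of the first slow frame in the current run
    for i, t in enumerate(frame_times_ms):
        if t > budget_ms:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= min_run:
                drops.append((run_start, i))
            run_start = None
    # Close a run that lasts until the end of the log.
    if run_start is not None and len(frame_times_ms) - run_start >= min_run:
        drops.append((run_start, len(frame_times_ms)))
    return drops


# Hypothetical frame times in milliseconds, e.g. logged by an engine's own timer.
frame_times = [16.0, 16.2, 33.0, 34.1, 35.0, 33.5, 16.1, 40.0, 16.0]
print(find_fps_drops(frame_times))  # [(2, 6)]
```

Frames 2 through 5 form a sustained drop, while the single slow frame at index 7 is filtered out. The flagged ranges are the places in gameplay worth reproducing when you capture with Graphics Monitor.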
Step 1: Capture a Trace with Graphics Monitor
Note: Enable developer mode in Windows* to successfully capture metrics data with Graphics Monitor.
Intel GPA cannot collect every metric from third-party GPUs, but it can collect the majority of metrics from other vendors' GPUs.
Graphics Monitor is the hub tool for Intel GPA analysis tools. In Graphics Monitor, you select options and settings for your trace and stream captures.
Graphics Monitor has three capture options: trace, stream, and frame. After you complete a capture, an icon representing the capture and its type appears on the right-hand side of Graphics Monitor. The following three images show the three capture types.
Figure 1. Trace capture
Figure 2. Stream capture
Figure 3. Frame capture
Capturing a trace in Graphics Monitor allows you to get a visual perspective of what's happening over time in GPUs and CPU cores once you open the trace in Graphics Trace Analyzer.
In Graphics Monitor, you can modify the time captured in seconds.
Figure 4. Select the Graphics Monitor Options button
Inside Options, select the Trace tab. From here, you see the Trace Duration (sec) option, which you can set in seconds. Trace captures can get large, so you should try to keep them short.
Figure 5. Modify the Trace Duration in seconds
To capture a trace:
- Choose your game executable.
- Set any command-line arguments that your game may require.
- Choose your capture type (in this case, choose Trace).
- Select Start.
Figure 6. Begin a Trace capture
This launches your game executable with a HUD overlay. The HUD overlay displays basic metrics and key indicators. If you have chosen delayed capture, it tells you which key to press to begin your trace.
Figure 7. Game running with HUD overlay, and key-press to start trace
The following video and corresponding article guide you through capturing a trace to view in Graphics Trace Analyzer.
Step 2: Analyze the Trace with Graphics Trace Analyzer to Determine If Your Game is GPU-bound or CPU-bound
After you have identified where you want to analyze performance and have captured a trace with Graphics Monitor using the Trace type, you are ready for the next step.
Figure 8. Select the Trace icon to launch Graphics Trace Analyzer
To launch Graphics Trace Analyzer, select the Trace icon.
Note: It might take a minute or two to launch the app. A lot of data is being loaded.
Graphics Trace Analyzer displays data from several seconds of gameplay. It shows the CPU execution tasks, the GPU rendering packets, and a visualization of the CPU and GPU activity. In a trace capture, you typically capture only three to five seconds of gameplay.
For more information, see the following video and corresponding article.
While in Graphics Trace Analyzer, you can zoom in on the data that you captured. Looking at Figures 9 and 10, you can see that the game at this time slice is GPU-bound. Figure 9 shows gaps in CPU execution, whereas Figure 10 shows that the GPU is constantly busy during this same time.
Figure 9. CPU Execution
Figure 10. GPU Execution
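The reasoning applied to Figures 9 and 10 (gaps on the CPU, a constantly busy GPU) can be sketched in code. This is an illustrative sketch only and does not read Intel GPA trace files: the busy intervals are hypothetical hand-written values, and the 0.95 "constantly busy" threshold is an assumption rather than a rule used by Graphics Trace Analyzer.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start_ms, end_ms) of one busy span


def busy_fraction(intervals: List[Interval], window: Interval) -> float:
    """Fraction of the window covered by the (possibly overlapping) intervals."""
    w_start, w_end = window
    covered = 0.0
    cursor = w_start  # everything before the cursor is already counted
    for start, end in sorted(intervals):
        start = max(start, cursor)  # skip time that was already counted
        end = min(end, w_end)       # clip to the window
        if end > start:
            covered += end - start
            cursor = end
    return covered / (w_end - w_start)


def classify(cpu_busy: List[Interval], gpu_busy: List[Interval],
             window: Interval, busy_threshold: float = 0.95) -> str:
    """Label a time slice by which processor is saturated while the other idles."""
    cpu = busy_fraction(cpu_busy, window)
    gpu = busy_fraction(gpu_busy, window)
    if gpu >= busy_threshold and cpu < busy_threshold:
        return "GPU-bound"
    if cpu >= busy_threshold and gpu < busy_threshold:
        return "CPU-bound"
    return "inconclusive"


# Hypothetical 100 ms slice: the CPU has gaps, the GPU is almost always busy.
window = (0.0, 100.0)
cpu_busy = [(0.0, 20.0), (35.0, 55.0), (70.0, 90.0)]
gpu_busy = [(0.0, 50.0), (50.0, 99.0)]
print(classify(cpu_busy, gpu_busy, window))  # GPU-bound
```

When both processors are saturated, or neither is, the sketch returns "inconclusive", which corresponds to a slice where visual inspection of the trace is still needed.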
For more information on Graphics Trace Analyzer, see the following:
- Graphics Trace Analyzer Overview
- User Guide: Graphics Trace Analyzer
- Graphics Trace Analyzer In-Depth Videos
Step 3: If You Are GPU-bound, Use Graphics Monitor to Capture a Stream
A stream captures data from one or more frames—textures, buffers, shader calls, and hardware counters—that can then be analyzed to help you find the bottlenecks in your rendering pipeline so you can optimize your game.
To capture a stream (or a single frame-capture), open Graphics Monitor. Choose a capture type of stream (or frame) to analyze details of the GPU activity, such as textures, pixel history, or other resources.
To capture multiple streams at any point in your gameplay, enable deferred stream capture: in the Options section of Graphics Monitor, select the Stream tab and enable Defer stream capture. If you leave this option disabled, the capture runs from the time that the game starts until you close the capture window.
Figure 11. Select Options to enable deferred stream capture
Figure 12. Select the Stream tab and enable Defer stream capture
The following video and corresponding article guide you through capturing a stream so that you can view the frame detail in Graphics Frame Analyzer.
Capture a Single Frame
Capture a Stream (Recommended)
Once you use Graphics Monitor to capture a stream, you can open it in Graphics Frame Analyzer to begin analyzing your stream. To launch Graphics Frame Analyzer, in Graphics Monitor, double-click the stream.
Figure 13. Captured stream in multiframe view inside Graphics Frame Analyzer
Step 4: Analyze Your Captured Stream or Frame with Graphics Frame Analyzer
Now that you have captured a stream with Graphics Monitor, you can begin analysis with Graphics Frame Analyzer.
In Graphics Frame Analyzer, the stream opens in multiframe view where you can visualize multiframe streams. Here, you can identify single frames of interest. Select a frame to open it, and profile the frame down to the draw-call level.
The following video and article help you to become familiar with the Graphics Frame Analyzer UI.
To learn more, review the Hotspot section of the following video.
Hotspot analysis aggregates calls by type so that you can focus on the most time-consuming or problematic call type first. Because a fix applies to every call of that type, this fix-by-type approach may improve multiple sections of your game.
To see Advanced Profiling (Hotspot) Mode analysis in action, watch the video The Lost Legends of Redwall* & Intel Graphics Performance Analyzers. In this video, see how Advanced Profiling (Hotspot) Mode locates issues in this game, and learn how those issues were addressed, ultimately more than tripling the game’s performance.
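The idea behind hotspot analysis, grouping calls by type and ranking the groups by total cost, can be sketched as follows. This is a conceptual illustration only, not how Graphics Frame Analyzer is implemented: the call-type labels, the durations, and the `hotspots` function are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def hotspots(calls: List[Tuple[str, float]]) -> List[Tuple[str, float, int]]:
    """Group (call_type, gpu_time_us) records and rank groups by total time."""
    totals: Dict[str, float] = defaultdict(float)
    counts: Dict[str, int] = defaultdict(int)
    for call_type, duration_us in calls:
        totals[call_type] += duration_us
        counts[call_type] += 1
    # Most expensive group first.
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    return [(name, total, counts[name]) for name, total in ranked]


# Hypothetical per-call GPU durations in microseconds.
calls = [
    ("shadow_pass", 120.0),
    ("opaque_geometry", 300.0),
    ("shadow_pass", 150.0),
    ("post_process", 90.0),
    ("opaque_geometry", 310.0),
    ("shadow_pass", 400.0),
]
for name, total, count in hotspots(calls):
    print(f"{name}: {total:.0f} us across {count} calls")
```

In this made-up data, no single shadow call is the slowest on average, yet the shadow group has the highest total, which is the kind of pattern that per-call inspection can miss and grouping by type makes visible.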
For more information on Graphics Frame Analyzer, see the following:
- Graphics Frame Analyzer Overview
- User Guide: Graphics Frame Analyzer
- Graphics Frame Analyzer In-Depth Videos
Step 5: If Your Game is CPU-bound, Use Intel® VTune™ Profiler to Analyze CPU Bottlenecks
To learn more about CPU-bound analysis, see Intel VTune Profiler.
Product and Performance Information
Performance varies by use, configuration and other factors. Learn more at www.Intel.com/PerformanceIndex.