Game Engine Optimization Tools: Unleash the power with Intel® Graphics Performance Analyzers

This demo features the Intel® Graphics Performance Analyzers (Intel® GPA) and the new Graphics Trace Analyzer that is part of Intel® GPA. These powerful tools can help you boost game performance and FPS.

Hi. I'm Seth Schnieder here at GDC. We're here on the show floor and I'm going to talk to you guys about Trace Analyzer. So Trace Analyzer is one of our brand new products we put in the GPA product suite. Here we're going to be looking at an application that is either CPU or GPU bound. So let's dive right in. 

So if we look here, we have our application. It's rolling here. Basically it's just a car sitting. What we're going to do is we're going to actually take a look at this application and see whether we are CPU or GPU bound. 

So first of all, if we just kind of roll the car around here, we'll see that there are some particle effects that come out of the wheels here. There they are. And we notice that our cloth optimization is on. So if we turn off the cloth optimization, we'll notice that our performance slows kind of to a halt, right? We go down to about 10 to 12 frames per second. 

So at this point, as a developer, we'd want to know: are we CPU bound or are we GPU bound? So let's take a look at the Graphics Trace Analyzer and see which one it actually is. For the sake of the demo, we've already captured our traces, so let's move up and open Intel GPA Graphics Trace Analyzer to view these two traces. And what these traces are going to show us is exactly how CPU or GPU bound we are. 

So first let's look at our first capture. This is with the cloth optimizations on. We're going to look at three rows within this. Essentially, each capture took five seconds' worth of data and then compiled the trace. And now we're viewing that trace over those five seconds in our Trace Analyzer. 

So let's look at the top row here. The top row shows how busy our CPU was, right? So this is the kernel looking at context switches. So what we notice here is that we've stripped out our idle processes and we only see the packets that were executing on the CPU for our application. 

What we can then show here is our GPU queue. We can notice that this is how often work was being executed on the GPU. And then after that we can look and see our CPU submission queue. This is how often the CPU was submitting work to the GPU. 

All of that was a fairly complicated way to show that this application appears to be GPU bound, because if we look here, there are no gaps in our GPU queue. If we'd noticed gaps in the GPU queue, that would mean the GPU was stalled. And we don't want that in a graphics application; we always want our GPU to be executing. 
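The "gaps in the GPU queue" check can be turned into a number. The following is a minimal sketch in Python, not part of Intel GPA: it assumes you have written down the busy intervals of the GPU queue by hand as (start, end) pairs in seconds, and the function name `gpu_idle_fraction` is hypothetical.

```python
def gpu_idle_fraction(busy_intervals, window_start, window_end):
    """Return the fraction of the capture window in which the GPU queue had no work.

    busy_intervals: list of (start, end) times in seconds when the GPU was executing.
    These values are assumed to be transcribed by hand; this is not a GPA API.
    """
    busy = 0.0
    cursor = window_start  # end of the time range already counted
    for start, end in sorted(busy_intervals):
        start = max(start, cursor)    # skip any part that overlaps earlier intervals
        end = min(end, window_end)    # clip to the capture window
        if end > start:
            busy += end - start
            cursor = end
    return 1.0 - busy / (window_end - window_start)


# Hypothetical five-second capture with no gaps: the GPU never stalls.
no_gaps = gpu_idle_fraction([(0.0, 2.5), (2.5, 5.0)], 0.0, 5.0)

# Hypothetical five-second capture with gaps between bursts of GPU work.
with_gaps = gpu_idle_fraction([(0.0, 1.0), (0.5, 2.0), (4.0, 5.0)], 0.0, 5.0)
```

An idle fraction near zero matches the "no gaps" picture described above; a large idle fraction means the GPU spent much of the window waiting for work.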

So if we then take a look, we can also see in our CPU context switches that there are gaps in the CPU context switches. So that means that the CPU has worked on other applications other than the application that we're currently executing. 

So let's go ahead and take a look at the second trace. Now, the second trace looks a lot different, right? So we just talked about the CPU context switches, and now we can see that our application is being processed on three of the cores almost 100% of the time. This means that our application is most likely CPU bound, because when we look at our GPU queue right here, we can see that there are large gaps. So once we notice that our GPU queue is stalled about 80% of the time, we know that we are in fact CPU bound. 
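The reasoning across the two traces can be summarized as a simple rule of thumb. This is only a sketch in Python under stated assumptions: the function name `classify_bottleneck` and all threshold values are hypothetical, the inputs are numbers you would read off the trace yourself, and none of this is an Intel GPA API.

```python
def classify_bottleneck(gpu_idle, core_busy,
                        gpu_idle_threshold=0.5,    # assumed: "large gaps" in the GPU queue
                        gpu_full_threshold=0.05,   # assumed: "no gaps" in the GPU queue
                        core_busy_threshold=0.9):  # assumed: a core busy "almost 100%"
    """Guess whether a capture looks CPU bound or GPU bound.

    gpu_idle:  fraction of the capture in which the GPU queue was empty (0.0 to 1.0).
    core_busy: per-core fraction of time spent running the application (0.0 to 1.0).
    """
    saturated_cores = [c for c in core_busy if c >= core_busy_threshold]
    if gpu_idle >= gpu_idle_threshold and saturated_cores:
        # GPU is starved while at least one core is pegged by the application.
        return "likely CPU bound"
    if gpu_idle <= gpu_full_threshold:
        # GPU queue is continuously full; the CPU keeps up with submissions.
        return "likely GPU bound"
    return "inconclusive"


# Values loosely modeled on the two traces described in the demo (illustrative only).
first_trace = classify_bottleneck(gpu_idle=0.0, core_busy=[0.6, 0.4, 0.3, 0.1])
second_trace = classify_bottleneck(gpu_idle=0.8, core_busy=[0.98, 0.97, 0.99, 0.2])
```

The thresholds are a judgment call; a real capture that falls between them deserves a closer look at the individual rows rather than a one-word verdict.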

If you're interested in Intel GPA you can download it for free and find out more information in the description below.