Check for Shader Bottlenecks
- Shader Execution bottleneck is caused by shaders that perform complex computations and are executed many times. In most cases, the bottleneck is caused by pixel and compute shaders. In this case, the EU Thread Occupancy and EU Active metrics are close to 100%. To optimize the shader execution bottleneck, you can simplify a shader: reduce unnecessary computations in the shader code and avoid using complicated arithmetic functions.
- Sampler bottleneck appears when the sampler is not able to generate output data at the requested speed. In this case, there is a large difference between the Input Available and Sampler Output Ready metrics. To optimize the sampler bottleneck, you can: identify which sampling instructions take more time, optimize inefficient functions by changing resource parameters (format, size, filter parameters), or reduce the amount of sampling.
- L3 bottleneck appears when the graphics cache (L3) is not able to read or write data at the requested speed. To optimize the L3 bottleneck, you can reduce memory traffic or improve memory access patterns to Shared Local Memory (SLM), Unordered Access Views (UAV), textures, and constants.
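The metric checks described above can be sketched as a small triage helper. This is only an illustration: the `DrawCallMetrics` container, the `classify_bottleneck` function, and both thresholds are assumptions made for the example, not values or APIs provided by Graphics Frame Analyzer. The sampler metrics are assumed to be percentages. No L3 metric is named above, so the sketch does not try to detect an L3 bottleneck.

```python
from dataclasses import dataclass

# Both thresholds are illustrative assumptions, not values from the tool.
EU_BUSY_THRESHOLD = 90.0      # percent; stands in for "close to 100%"
SAMPLER_GAP_THRESHOLD = 30.0  # percentage points; stands in for "large difference"


@dataclass
class DrawCallMetrics:
    """Hypothetical container for values read off the Metrics pane."""
    eu_active: float             # EU Active, percent
    eu_thread_occupancy: float   # EU Thread Occupancy, percent
    input_available: float       # Input Available, assumed percent
    sampler_output_ready: float  # Sampler Output Ready, assumed percent


def classify_bottleneck(m: DrawCallMetrics) -> str:
    """Return a rough label for the likely bottleneck of the selected draw calls."""
    # Shader execution: both EU metrics are close to 100%.
    if m.eu_active >= EU_BUSY_THRESHOLD and m.eu_thread_occupancy >= EU_BUSY_THRESHOLD:
        return "shader execution"
    # Sampler: large gap between the two sampler metrics.
    if abs(m.input_available - m.sampler_output_ready) >= SAMPLER_GAP_THRESHOLD:
        return "sampler"
    # Anything else (including a possible L3 bottleneck) needs manual analysis.
    return "undetermined"
```

For example, `classify_bottleneck(DrawCallMetrics(97.0, 95.0, 50.0, 48.0))` yields `"shader execution"` under these assumed thresholds. Adjust the thresholds to match what you observe on your own hardware.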
Profile Shader Source Code
- Once you have selected draw calls for analysis, choose the shader used in these draw calls from the Resource List. Graphics Frame Analyzer displays the shader source code in the Shader Editor.
- From the Shader Type drop-down list on the top right, choose the pixel or vertex shader for profiling. In most cases, the pixel shader is causing the bottleneck, but you can compare the Shader Invocations metrics to understand which shader was executed more times and needs optimization. The shader code opens in the Shader Editor. For easier reading, you can use the respective buttons to indent the code and to preprocess the selected shader, which hides the code paths that do not get executed.
- Click the Shader Profiler button to view performance data per shader code line. The shader viewer lists all generated versions of Gen ISA code in the drop-down menu on the top left and shows a profiling data column on the left of the Shader Editor. You can choose either duration or execution count to analyze the efficiency of your shaders.
At this step, you need to understand which parts of code take more time. Stalls may occur, for example, due to inefficient use of resources or redundant calculations.
- Duration: shows the estimated portion of time a line of code took, in percent, relative to the execution time of all shader stages.
- Execution Count: shows the total number of times the exact line of code was executed.
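One way to read the two columns together is sketched below. The `LineProfile` structure, the helper names, and all numbers are made up for illustration. Graphics Frame Analyzer does not export data in this shape as far as the text above states, so treat this as a way of reasoning about values you copy out by hand.

```python
from typing import NamedTuple


class LineProfile(NamedTuple):
    """One row of per-line profiling data (hypothetical structure)."""
    line: int             # shader source line number
    duration_pct: float   # Duration: share of shader time, in percent
    execution_count: int  # Execution Count: times the line was executed


def rank_hotspots(rows, top=3):
    """Return the `top` rows with the largest duration share, highest first."""
    return sorted(rows, key=lambda r: r.duration_pct, reverse=True)[:top]


def cost_per_execution(row):
    """Duration share per single execution; 0.0 for lines that never ran."""
    if row.execution_count == 0:
        return 0.0
    return row.duration_pct / row.execution_count


# Made-up numbers, for illustration only.
profile = [
    LineProfile(line=12, duration_pct=4.0, execution_count=2_000_000),
    LineProfile(line=18, duration_pct=41.0, execution_count=2_000_000),
    LineProfile(line=23, duration_pct=22.0, execution_count=500_000),
    LineProfile(line=31, duration_pct=0.0, execution_count=0),
]

hottest = rank_hotspots(profile, top=2)  # lines 18 and 23
```

In this made-up data, line 18 has the largest duration share, but line 23 costs more per execution. A line like 18 may benefit from being executed less often, while a line like 23 may point to an expensive operation worth simplifying.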
Analyze Shader Resources
- Render Target View (RTV)
- DirectX resources:
- Shader Resource View (SRV)
- Constant Buffer View (CBV)
- Unordered Access View (UAV)
- Vulkan resources:
- Access View (UAV)
- Storage Buffer Object (SBO)
- Storage Texture
- Uniform Buffer Object (UBO)
- Vertex Buffer View (VBV)
Profile Shader Assembly Code
- Click the Shader Profiler button to view performance data per shader code line.
- Click the Show source-assembly mapping button to view the source code and the assembly code side-by-side, and to map individual source or assembly lines to their counterparts.
- Compile a shader with debugging information in your application.
- For DirectX frames, apply any modification to a shader in Graphics Frame Analyzer and click the button. Your shader recompiles with debugging information directly in Graphics Frame Analyzer.
Experiment with Shader Code
- Select HLSL or GLSL from the respective drop-down menu and edit the code in the Shader Editor. The shader recompiles on the fly. If you introduced any errors, you can see the corresponding message in the Notification pane below the Shader Editor.
- Click the button to save the changes. Graphics Frame Analyzer recalculates all metrics and displays new data in the Metrics pane and in the Main bar chart. When you click the button, Graphics Frame Analyzer saves all the shaders. This enables you to write your own code and replace the whole shader to experiment.
- If you want to undo your edits, click the button. The original shaders are restored.
Evaluate Final Picture
Shows the render target with modifications.
Shows the render target without modifications.
Shows the difference between the current and the original mode.
Shows the render target with an overdraw visualization.
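To make the idea of a difference view concrete, here is a minimal sketch of a per-pixel comparison between a modified and an original render target. It is not how Graphics Frame Analyzer computes its difference mode. The functions `diff_image` and `changed_fraction` are hypothetical, and images are assumed to be nested lists of RGB tuples.

```python
def diff_image(current, original):
    """Per-pixel, per-channel absolute difference of two equal-sized RGB images."""
    if len(current) != len(original):
        raise ValueError("images must have the same height")
    result = []
    for row_cur, row_org in zip(current, original):
        if len(row_cur) != len(row_org):
            raise ValueError("images must have the same width")
        result.append([
            tuple(abs(c - o) for c, o in zip(px_cur, px_org))
            for px_cur, px_org in zip(row_cur, row_org)
        ])
    return result


def changed_fraction(diff):
    """Fraction of pixels where at least one channel differs."""
    pixels = [px for row in diff for px in row]
    if not pixels:
        return 0.0
    return sum(1 for px in pixels if any(px)) / len(pixels)
```

A changed fraction of 0.0 after a shader edit would suggest that the optimization left the rendered output untouched, which is usually the goal when simplifying a shader.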
- For DirectX* 11 shaders without debug information, DXBC-ISA mapping is available instead of HLSL-ISA mapping.
- Source-assembly mapping is not supported for Shader Model 5 shaders in DirectX* 12 applications. To enable it, recompile the shaders for Shader Model 6.
- Shader Profiler requires Intel® Graphics Driver version 126.96.36.19955 or higher.
- This feature is supported on 9th Generation (code names Skylake, Coffee Lake, Kaby Lake) and 11th Generation (code name Ice Lake) Intel® Graphics hardware.