GPU / Sampler : Slice <N> Subslice<M> Sampler Input Available
Percentage of time there is input from the EUs on slice ‘N’ and subslice ‘M’ to the sampler.
GPU / Sampler : Slice <N> Subslice<M> Sampler Output Ready
Percentage of time there is output from the sampler to EUs on slice ‘N’ and subslice ‘M’.
- Application: Unreal Engine 4* Sun Temple sample, DirectX SDK* CascadedShadowMaps11 sample
- Tool: Intel® GPA Graphics Frame Analyzer
- Operating System: Windows* 10
- GPU: Intel® Processor Graphics Gen9 and higher
- API: DirectX* 11
Optimize Sampler Bottleneck with Graphics Frame Analyzer
- Reduce the texture size.
- Change the filtering mode.
- Choose a texture format with less data per pixel, or an uncompressed texture format, if possible. In some cases, an uncompressed format may cause a new bottleneck for larger textures.
- Reduce the number of surfaces on the screen where the texture is applied.
- Adjust the sampling access pattern so that texture accesses are more linear.
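To give a rough sense of how much the format choice changes the amount of data behind each texel, the following minimal sketch compares the storage size of a 1024x1024 texture in a few common DirectX 11 (DXGI) formats. The helper name `texture_bytes` and the selection of formats are illustrative assumptions, not part of Graphics Frame Analyzer. The figures cover only the top mip level and say nothing about cache behavior or filtering cost, so treat them as an estimate of relative data volume rather than a prediction of Sampler metrics.

```python
# Bits per texel for a few common DXGI formats (illustrative selection).
BITS_PER_TEXEL = {
    "R32G32B32A32_FLOAT": 128,
    "R16G16B16A16_FLOAT": 64,
    "R8G8B8A8_UNORM": 32,
    "BC3_UNORM": 8,  # block-compressed: 16 bytes per 4x4 texel block
    "BC1_UNORM": 4,  # block-compressed: 8 bytes per 4x4 texel block
}


def texture_bytes(width, height, fmt):
    """Storage size in bytes of the top mip level only (hypothetical helper).

    Ignores the mip chain, row padding, and alignment.
    """
    return width * height * BITS_PER_TEXEL[fmt] // 8


if __name__ == "__main__":
    for fmt in BITS_PER_TEXEL:
        size_mib = texture_bytes(1024, 1024, fmt) / 2**20
        print(f"{fmt:20s} {size_mib:6.2f} MiB")
```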
Reduce Texture Size
- Open the event with the discovered Sampler bottleneck in the Graphics Frame Analyzer Resource Viewer by selecting this event on the Main bar chart.
- Click the Show All Resources button, and then click the Textures tab to open the list of sampled textures.
- Reduce the size of one or more large textures. For example, the marble texture size is 1024x1024 pixels. Select a smaller size, for example 256x256, and then click the button.
- Compare the original and the resulting textures: Original, Result, Difference.
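As a quick sanity check on what the 1024x1024 to 256x256 experiment changes, the sketch below computes the reduction in texel count and the length of a full mip chain for each size. The function names are hypothetical, and the mip-level formula assumes a full chain down to 1x1, which is the usual Direct3D convention but may not match how a given texture was created.

```python
def texel_count(width, height):
    """Number of texels in the top mip level."""
    return width * height


def reduction_factor(original, reduced):
    """How many times fewer texels the reduced texture has."""
    return texel_count(*original) / texel_count(*reduced)


def full_mip_levels(width, height):
    """Levels in a full mip chain: 1 + floor(log2(largest dimension))."""
    return max(width, height).bit_length()


if __name__ == "__main__":
    original, reduced = (1024, 1024), (256, 256)
    print("texel reduction:", reduction_factor(original, reduced))
    print("mip levels:", full_mip_levels(*original), "->", full_mip_levels(*reduced))
```

Each halving of both dimensions removes three quarters of the texels, so two halvings leave one sixteenth of the original data.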
Change Filter Parameters in Pixel Shader
- Open the event with the discovered Sampler bottleneck in the Graphics Frame Analyzer Resource Viewer by selecting this event on the Main bar chart. The pink segment contains the texture and shadow rendering. Shadow properties are set in the pixel shader.
- Select the Shader resource in the Resource List, and then choose the Pixel shader type. The pixel shader contains the CalculatePCFPercentLit method with m1 and m2 values, which represent the iteration range in the filter loop. m1 and m2 formulas:
  m1 = m_iPCFBlurSize / -2
  m2 = m_iPCFBlurSize / 2 + 1,
  where m_iPCFBlurSize is the kernel size. The divisions are integer divisions, so for the initial kernel size of 9, m1 = -4 and m2 = 5.
- Reduce the kernel size to 3, and set m1 to -1 and m2 to 2. The metric values improve, but the Sampler is still a bottleneck.
- Check the extreme condition by setting the kernel size to 1, m1 to 0, and m2 to 1.
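The effect of the kernel size on sampler load can be estimated with a short sketch. It assumes that the filter loop in CalculatePCFPercentLit is a nested loop over x and y, each running from m1 up to but not including m2, with one shadow-map sample per iteration. The function names are hypothetical. Python's `//` operator rounds toward negative infinity, so the sketch uses `int(a / b)` to mimic the truncating integer division that produces the m1 and m2 values quoted above.

```python
def pcf_loop_bounds(blur_size):
    """Return (m1, m2) for a given PCF kernel size.

    int(a / b) truncates toward zero, matching the values in the text
    (kernel size 9 gives m1 = -4, m2 = 5).
    """
    m1 = int(blur_size / -2)
    m2 = int(blur_size / 2) + 1
    return m1, m2


def pcf_samples_per_pixel(blur_size):
    """Shadow-map samples per pixel, assuming a nested x/y loop over [m1, m2)."""
    m1, m2 = pcf_loop_bounds(blur_size)
    taps = m2 - m1
    return taps * taps


if __name__ == "__main__":
    for size in (9, 3, 1):
        m1, m2 = pcf_loop_bounds(size)
        print(f"kernel {size}: m1={m1}, m2={m2}, "
              f"samples per pixel={pcf_samples_per_pixel(size)}")
```

Under these assumptions, going from a kernel size of 9 to 3 cuts the samples per pixel from 81 to 9, and a kernel size of 1 leaves a single sample, which is why it serves as the extreme case.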