Limit root signature slots or descriptor sets to only those a shader will use.
Balance root signature or descriptor set reuse across shaders against keeping each signature or set minimal.
For multiple constant buffers that do not change between draws, consider packing all constant buffer views into one descriptor table.
For multiple Unordered Access Views (UAVs) and Shader Resource Views (SRVs) that do not span a consecutive range of registers and do not change between draws, it is best to pack them into a descriptor table.
Minimize descriptor heap changes. Changing descriptor heaps severely stalls the graphics pipeline. Ideally, all resources will have their views allocated from one descriptor heap.
Avoid generic root signature definitions where unnecessary descriptors are defined and not leveraged. Instead, optimize root signature definitions to the minimal set of descriptor tables needed.
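One way to keep every view in a single descriptor heap is to hand out consecutive index ranges from it, so each descriptor table points into the same heap. The sketch below models only the index bookkeeping and makes no D3D12 calls. `DescriptorRangeAllocator`, `kInvalidIndex`, and the method names are hypothetical and introduced purely for illustration. In a real application the increment size would come from the device.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for one shader-visible CBV/SRV/UAV heap.
// Only slot indices are modeled; no graphics API is called.
class DescriptorRangeAllocator {
public:
    // Sentinel returned when a range does not fit.
    static constexpr std::uint32_t kInvalidIndex = 0xFFFFFFFFu;

    explicit DescriptorRangeAllocator(std::uint32_t capacity)
        : capacity_(capacity), next_(0) {}

    // Reserves `count` consecutive slots and returns the index of the first
    // one, or kInvalidIndex when the heap cannot hold the range.
    std::uint32_t allocate(std::uint32_t count) {
        if (count == 0 || count > capacity_ - next_) {
            return kInvalidIndex;
        }
        const std::uint32_t first = next_;
        next_ += count;
        return first;
    }

    // Byte offset of a slot from the heap start, given the per-descriptor
    // increment size reported by the device.
    static std::uint64_t byteOffset(std::uint32_t index,
                                    std::uint32_t incrementSize) {
        return static_cast<std::uint64_t>(index) * incrementSize;
    }

    // Number of slots handed out so far.
    std::uint32_t used() const { return next_; }

private:
    std::uint32_t capacity_;
    std::uint32_t next_;
};
```

A linear allocator like this never frees individual ranges. It suits views that live for the whole frame or the whole level, which is the case the recommendation above targets.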
Vulkan: When creating a descriptor set with the BindAfterFlag bit, be aware that Xᵉ-LP supports only 1M descriptors. Create only the descriptors you need. When porting from DX12, remember that D3D12_DESCRIPTOR_HEAP_TYPE_CBV_SRV_UAV maps to 7 Vulkan types, including VK_DESCRIPTOR_TYPE_SAMPLED_IMAGE, *_STORAGE_IMAGE, *_UNIFORM_TEXEL_BUFFER, *_STORAGE_TEXEL_BUFFER, *_UNIFORM_BUFFER, and *_STORAGE_BUFFER.
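The arithmetic behind the porting warning can be sketched as follows. If each Vulkan descriptor type named above is given the full size of the original DX12 heap, the total grows by that factor and can pass the limit quickly. The helper names and the `DescriptorRequest` struct are hypothetical. Reading "1M" as 1,000,000 is an assumption, so a real application should query the device for the actual limit.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical per-type descriptor request; `type` is only a label here.
struct DescriptorRequest {
    const char* type;
    std::uint64_t count;
};

// Assumption: "1M" is read as 1,000,000. Query the device for the real value.
constexpr std::uint64_t kAssumedDescriptorLimit = 1000000;

// Sums descriptor counts across all requested types.
inline std::uint64_t totalDescriptors(
        const std::vector<DescriptorRequest>& requests) {
    std::uint64_t total = 0;
    for (const DescriptorRequest& r : requests) {
        total += r.count;
    }
    return total;
}

// True when the combined request stays within the given limit.
inline bool fitsBudget(const std::vector<DescriptorRequest>& requests,
                       std::uint64_t limit) {
    return totalDescriptors(requests) <= limit;
}

// Naive port: every listed Vulkan type receives the full DX12 heap size.
inline std::vector<DescriptorRequest> naivePort(std::uint64_t heapSize) {
    return {
        {"SAMPLED_IMAGE", heapSize},
        {"STORAGE_IMAGE", heapSize},
        {"UNIFORM_TEXEL_BUFFER", heapSize},
        {"STORAGE_TEXEL_BUFFER", heapSize},
        {"UNIFORM_BUFFER", heapSize},
        {"STORAGE_BUFFER", heapSize},
    };
}
```

With these definitions, a 200,000-entry DX12 heap ported naively to six types asks for 1,200,000 descriptors and fails the assumed budget. Sizing each type to its real usage keeps the total well under it.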
Favor root constants over root descriptors, and favor root descriptors over descriptor tables when working with constants.
Make use of root/push constants to enable fast access to constant buffer data (they are pre-loaded into registers).
Root/push constants are best suited to constant buffer data that changes at a high frequency.
If certain root signature slots are less frequently used (not referenced by a PSO), put them at the end of the root signature to reduce general register file (GRF) usage.
Be sure to use hints that allow the driver to perform constant-based optimizations, such as D3D12_DESCRIPTOR_RANGE_FLAG_DATA_STATIC.
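The root signature guidance above can be checked with a small amount of bookkeeping. In D3D12 a root signature is limited to 64 DWORDs: each 32-bit root constant costs 1 DWORD, each root descriptor costs 2, and each descriptor table costs 1. The sketch below totals that cost and moves rarely referenced slots to the end. `RootParam`, its `psoReferences` field, and the function names are hypothetical, and the reference counts are assumed to be known by the application.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

enum class RootParamKind { Constants, Descriptor, Table };

// Hypothetical description of one root parameter.
struct RootParam {
    RootParamKind kind;
    std::uint32_t num32BitValues;  // only meaningful for Constants
    std::uint32_t psoReferences;   // PSOs referencing this slot (assumed known)
};

// D3D12 root signature size limit in DWORDs.
constexpr std::uint32_t kMaxRootSignatureDwords = 64;

// Cost of one parameter: 1 DWORD per 32-bit root constant,
// 2 DWORDs per root descriptor, 1 DWORD per descriptor table.
inline std::uint32_t costInDwords(const RootParam& p) {
    switch (p.kind) {
        case RootParamKind::Constants:  return p.num32BitValues;
        case RootParamKind::Descriptor: return 2;
        case RootParamKind::Table:      return 1;
    }
    return 0;
}

// Total cost of a root signature layout.
inline std::uint32_t totalCost(const std::vector<RootParam>& params) {
    std::uint32_t total = 0;
    for (const RootParam& p : params) {
        total += costInDwords(p);
    }
    return total;
}

// Moves rarely referenced slots to the end; ties keep their original order.
inline void orderByUsage(std::vector<RootParam>& params) {
    std::stable_sort(params.begin(), params.end(),
                     [](const RootParam& a, const RootParam& b) {
                         return a.psoReferences > b.psoReferences;
                     });
}
```

Sorting by reference count is only one possible heuristic. The point is that slots no PSO references end up last, as the list above recommends.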
For placed resources, initialize with a clear, copy, or discard before rendering to the resource. This helps enable proper compression by putting the placed resource into a valid state.
When creating resource heaps, resources that need to be accessed by the GPU should be placed in heaps that are declared as resident in GPU memory, preferably exclusively. This has a significant impact on discrete GPU performance.
Use queries to identify scenarios when GPU local memory gets oversubscribed and adjust resource location to accommodate this.
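A minimal sketch of reacting to oversubscription follows. It assumes the application has already obtained a budget and a current usage figure for GPU local memory, for example from `IDXGIAdapter3::QueryVideoMemoryInfo` on D3D12. The structs, the priority scheme, and the demotion policy are hypothetical and only illustrate one way to adjust resource location. No graphics API is called.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// The two values of interest from a local video memory query.
struct LocalMemoryInfo {
    std::uint64_t budget;
    std::uint64_t currentUsage;
};

// Hypothetical resource record; priority is application-defined.
struct TrackedResource {
    std::uint64_t sizeBytes;
    int priority;        // lower value leaves local memory first
    bool inLocalMemory;
};

// True when usage exceeds the budget reported for GPU local memory.
inline bool isOversubscribed(const LocalMemoryInfo& info) {
    return info.currentUsage > info.budget;
}

// Marks the lowest-priority local resources for relocation until the
// excess over budget is covered. Returns the number of bytes marked.
inline std::uint64_t demoteUntilWithinBudget(
        const LocalMemoryInfo& info, std::vector<TrackedResource>& resources) {
    if (!isOversubscribed(info)) {
        return 0;
    }
    const std::uint64_t excess = info.currentUsage - info.budget;

    // Collect candidates currently in local memory, lowest priority first.
    std::vector<TrackedResource*> order;
    for (TrackedResource& r : resources) {
        if (r.inLocalMemory) {
            order.push_back(&r);
        }
    }
    std::stable_sort(order.begin(), order.end(),
                     [](const TrackedResource* a, const TrackedResource* b) {
                         return a->priority < b->priority;
                     });

    std::uint64_t freed = 0;
    for (TrackedResource* r : order) {
        if (freed >= excess) {
            break;
        }
        r->inLocalMemory = false;
        freed += r->sizeBytes;
    }
    return freed;
}
```

The function only flags resources. Actually moving them, for instance by recreating them in a system memory heap, is left to the application.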