Figure 1. Example BVH-4 traversal with a short stack, showing the sorted children Q and the restart trail count at each level. (i) The count at level 1 is set to 4, as there are no entries pushed onto the stack at this level. (ii) Processing node I yields no hit children. Therefore, node J is popped from the stack, incrementing the count at level 2. (iii) Processing node J yields no hit children. Therefore, node K (the last node corresponding to level 2) is popped from the stack and the counter is set to 3. (iv) Processing node K yields no hit children. The following pop operation skips levels 1 and 2 (gray), as they indicate that the last child was already traversed, and node B (level 0) is popped from the stack.
Compressed wide bounding volume hierarchies can significantly improve the performance of incoherent ray traversal through a smaller working set of inner nodes and therefore a higher cache hit rate. While inner nodes in the hierarchy can be compressed, the size of the working set for a full traversal stack remains a significant overhead. In this paper we introduce an algorithm for wide bounding volume hierarchy (BVH) traversal that uses a short stack of just a few entries. This stack can be fully stored in scarce on-chip memory, which is especially important for GPUs and dedicated ray tracing hardware implementations. In particular, our approach generalizes the restart trail algorithm for binary BVHs to BVHs of arbitrary width. Applying our algorithm to wide BVHs, we demonstrate that the number of traversal steps with just five stack entries is close to that of a full traversal stack. We also propose an extension to efficiently cull leaf nodes when a closer intersection has been found, which reduces ray-primitive intersections by up to 14%.
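To make the notion of a short stack concrete, the following Python sketch shows a fixed-capacity stack of the kind the abstract refers to. It is an illustration under stated assumptions, not the paper's implementation: the class name `ShortStack`, the `dropped` counter, and the policy of discarding the oldest entry on overflow are all hypothetical choices made here, and the restart trail logic itself is not shown.

```python
from collections import deque


class ShortStack:
    """Fixed-capacity stack; a push onto a full stack discards the oldest entry.

    Assumption for illustration: overflow drops the bottom-most entry, so the
    most recently deferred nodes are always the ones retained.
    """

    def __init__(self, capacity=5):
        # A deque with maxlen drops items from the opposite end on append.
        self._entries = deque(maxlen=capacity)
        self.dropped = 0  # number of entries lost to overflow (hypothetical counter)

    def push(self, node):
        if len(self._entries) == self._entries.maxlen:
            self.dropped += 1
        self._entries.append(node)

    def pop(self):
        # Returns None when the stack is empty.
        return self._entries.pop() if self._entries else None

    def __len__(self):
        return len(self._entries)
```

In a restart-trail scheme, a `pop` that returns `None` while entries have been dropped is the point at which traversal would restart from the root, with the trail indicating which children have already been visited at each level.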
Research Area: Rendering, ray tracing, bounding volume hierarchy (BVH), GPU systems.
Published in High-Performance Graphics 2019