The upcoming release of Mitsuba will incorporate a few rather big changes under the hood. The most significant one is that a fairly large piece of code, the kd-tree construction and traversal logic, has been rewritten from scratch.
There were a few reasons for doing so: the old code produced reasonable kd-trees, but it was quite slow, used excessive amounts of memory, and it only ran on a single processor. Of those three, the memory requirement was perhaps the most problematic: if each triangle causes a temporary overhead of about 20x of its storage cost, big scenes become a problem.. For example, my laptop would just crash when I tried to load the 28 million triangle Lucy mesh from Stanford.
Part of the problem lies with the commonly used algorithm for constructing kd-trees: it represents shapes using abstract “bounding box events”. Even using various optimizations, these events alone take up so much memory that one might run out of it without having generated a single kd-tree node.
So what has changed?
- The new implementation uses an approximate kd-tree building strategy (“Min-Max binning”) to generate the top of the kd-tree, which avoids creating massive amount of bounding box events. Once geometry has been partitioned into groups of less than <64K elements, the more accurate traditional O(N log N) builder takes over. An extra benefit of this approach is that it leads to much more coherent memory access patterns.
- The build process now uses a custom memory allocator, which is specifically optimized for the allocation/deallocation sequences seen during kd-tree construction.
- The construction runs in parallel: independent subtrees are assigned to different cores as they become available. This scales quite well, since most of the cost in constructing trees is at the leaves.
- A rather wasteful data structure for triangle classification has been packed so that it only needs two bits per entry.
The best part: the resulting trees are also of much higher quality than those made by the previous implementation, which gives a 20%-30% speedup essentially for free.
When running the builder on standard scenes, the final SAH costs (a kd-tree quality metric) now come within 1% of those listed in the seminal paper “On building fast kd-Trees for Ray Tracing, and on doing that in O(N log N)” by ray-tracing gods Ingo Wald and Vlastimil Havran. So as far as optimization goes, I think I’ve reached the limit of what is possible using this method.
These changes make it possible to use Mitsuba for interactive visualizations of very large scenes, such as the UNC Powerplant (a popular benchmark scene with 13 million triangles).