The tree is a frequently-used data structure in computer science because it makes it possible to arrange data hierarchically. Everybody uses tree structures every day without even realizing it, such as when you're working with the files on your computer.
With a file-system tree structure, there’s the root of the hard disk, which contains several “children” (the folders), which in turn contain children of their own (subfolders), and so on until you reach the “leaves” (the files themselves). That example should give you an intuitive idea of one of the advantages of using trees: accessing the parts of a balanced tree is a lot faster than if all the parts were scattered around.
The nodes of a tree can have a different number of branches. For example, when there is a maximum of two branches, we speak of a binary tree, while zero or four branches represent a quaternary tree (quadtree), and finally, zero or eight branches represent an octal tree (octree).
But where do tree structures come into play here? Well, up until now, we’ve used voxels by placing them on a regular grid, which in fact wastes an enormous amount of data by encoding empty space. Octrees enable more efficient use of memory space by using the finest-grained resolution only where it’s necessary. Thinking in three dimensions isn’t the easiest thing in the world, and it’s not easy to represent in an article, so we’ll start by presenting the idea two-dimensionally.
Here’s an approximation of a circle, on a grid with a resolution of 12 x 12. The approximation is very rough because the resolution is too low, but as you can see, a large number of cells are blank and thus useless. If we use a quadtree, here’s what we get:
Building the quadtree is simple. You start with the initial image and subdivide it in two in both directions, which produces four quadrants. When a quadrant is either empty or completely filled, the algorithm stops there. If the quadrant is only partially filled, then the quadrant is subdivided in two again, and so on. The algorithm stops when all the quadrants are homogeneous (that is, uniformly empty or filled) or more traditionally when a given depth is reached (in the example above, we stopped at a tree depth of four, which is a division by 16 in each dimension). As you can see, even with our basic example, the end result is a little more faithful to the initial circle and yet we’ve used less data (97 nodes, or “cells” if you prefer, in this case as opposed to 122 with the regular grid). An octree is the simple extension of this technique into three dimensions.
In practice, the gain in memory space for a given definition is a little less than you might imagine at first. This is because in the case of a regular grid, the position of the voxels is implicit. Conversely, in the case of an octree, each node must maintain a “link” to each of its children. In practice, each node has to have eight pointers in addition to the color and normal of the voxel.
But that’s only a slight disadvantage compared to the many other advantages of octrees. To get a good understanding of the more important contributions of octrees, we first have to describe the way the data structure is displayed. There are several ways to display voxels, but the technique chosen by id Software is ray casting. Here’s a description.
Like ray tracing, ray casting is based on emitting or “casting” rays for each pixel of the image. But where it differs is that as soon as an intersection is found, the algorithm stops there and casts no secondary rays.
Consequently, ray casting is faster than ray tracing because, as we showed in our earlier article, these secondary rays are what introduce problems related to their memory accesses. Another advantage is that calculating the intersection of rays with voxels is much faster than with triangles. And what’s more, there’s no need to build an additional data structure to speed up these intersection calculations. The octree is both the data (geometry and textures) and the acceleration structure.