High-Tech And Vertex Juggling - NVIDIA's New GeForce3 GPU

Anti-Aliasing - Removing The 'Jaggies'

Anti-aliasing was a big topic in the 3D scene last year, when 3dfx put all its marketing force behind this single remarkable feature of Voodoo5. NVIDIA made the mistake of joining the anti-aliasing world with a rather sad solution that ate up 3D performance like crazy. While 3dfx's T-buffer solution was certainly more efficient than NVIDIA's super-sampling method, both performed too poorly to make anti-aliasing a really noteworthy feature. People who want high frame rates would never enable anti-aliasing, simply because the performance impact is not acceptable.

If you don't know what the term 'anti-aliasing' actually means, I must ask you to please follow this link. I explained full scene anti-aliasing in my GeForce2 article from April 2000.

The annoying stair step effect of aliasing is typically seen in areas where two triangles that don't have the same surface angle intersect. Today you will never see the 'jaggies' within textures or on homogeneous surfaces, because those areas are at least bilinear filtered, if not trilinear filtered.

The only trick currently in use to remove the stair step effect and smooth those hard transitions is to apply some kind of filtering method to complete frames. This could of course be seen as a major waste of effort, because most of the frame doesn't show any stair step effects and so doesn't really require any filtering. However, since it is impossible to know where those hard transitions will actually appear, as several disappointing implementations of edge anti-aliasing have shown in the past, there is no choice but to filter the whole frame.

What does the term 'filtering' actually entail in 3D? To ensure that I don't blow this article totally out of proportion I will not go into the details of bilinear, LOD, trilinear and anisotropic filtering, but simply say this: filtering means that more than one spot is involved in the color calculation of a pixel. In other words, neighboring 'structures' contribute to the color value of that pixel. This is what full scene anti-aliasing does.

Super-Sampling Anti-Aliasing

Now the difficult thing about filtering is that it requires sub-pixel accuracy to look any good. So far there are two known ways for anti-aliasing to achieve this sub-pixel accuracy. The most commonly used technique is super-sampling. The idea behind super-sampling is very simple, but also very costly. The actual frame gets rendered at a resolution that is higher than the required screen resolution. In my example case of 4x super-sampling the resolution is four times as high as the screen resolution, meaning it has twice the number of pixels in x as well as in y direction. Rendering at 4x resolution has the effect that each pixel on the screen is initially represented by four pixels in the back buffer, each a quarter of the size of the screen pixel. This way sub-pixel level is reached, and filtering those four pixels generates the anti-aliased screen pixel. The result looks good, but the poor 3D chip has to render four times the screen resolution, which costs a huge amount of fill rate and leads to rather bad 3D performance. Due to the additional filtering stage, super-sampling anti-aliasing provides even lower frame rates than you would get by running the game at 4x the resolution without anti-aliasing. Super-sampling AA is used today by all the GeForce chips as well as ATi's Radeon. You can only use it at resolutions up to 800x600 (in case of 4x AA the frame is rendered at 1600x1200!), because otherwise games are pretty much unplayable.
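To make the 4x case concrete, here is a minimal Python sketch of the idea. It is not how any particular chip does it: the function names are made up for illustration, the frame is a plain grayscale list of rows, and the filter is assumed to be a simple average of the four sub-pixels, which the text above does not specify.

```python
def render_resolution(width, height, factor_per_axis=2):
    """Size of the oversized back buffer for 4x super-sampling (2x per axis)."""
    return width * factor_per_axis, height * factor_per_axis


def downsample_2x2(hi_res):
    """Filter a high-resolution frame down to screen resolution.

    Assumption: each screen pixel is the plain average of its 2x2 block
    of sub-pixels. hi_res is a list of rows with even width and height.
    """
    screen = []
    for y in range(0, len(hi_res), 2):
        row = []
        for x in range(0, len(hi_res[0]), 2):
            total = (hi_res[y][x] + hi_res[y][x + 1]
                     + hi_res[y + 1][x] + hi_res[y + 1][x + 1])
            row.append(total / 4)
        screen.append(row)
    return screen


# An 800x600 screen needs a 1600x1200 back buffer, four times the pixels.
hi_w, hi_h = render_resolution(800, 600)
pixel_ratio = (hi_w * hi_h) / (800 * 600)

# One screen pixel on an edge: one dark sub-pixel, three bright ones.
edge_pixel = downsample_2x2([[0, 255], [255, 255]])
```

The edge pixel ends up as an in-between shade instead of fully dark or fully bright, which is exactly the smoothing of the stair step. The price is visible in `pixel_ratio`: every sub-pixel has to be fully rendered first.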

Multi-Sampling Anti-Aliasing

The alternative to super-sampling is multi-sampling. This is the technique used by GeForce3, but a special version of it has previously been used by 3dfx. The idea of multi-sampling AA is to render multiple samples of a frame, combine them at sub-pixel level and then filter those sub-pixels to achieve anti-aliasing. Theoretically there isn't any fill rate benefit to multi-sampling AA, because 4x AA would also require that four sub-pixels get rendered for each final pixel, which puts the same demands on fill rate as super-sampling. However, multi-sampling is a bit less expensive in terms of performance than super-sampling, because it doesn't waste any effort on the useless creation of detail for higher resolution rendering.

The reason why 3dfx's implementation was superior to NVIDIA's super-sampling method in terms of performance is that Voodoo5 used a slightly strange and proprietary way to filter the sub-pixel samples. The filtering wasn't done with any kind of computation. Instead, the samples were overlaid at sub-pixel level within the RAMDAC unit of Voodoo5. This also explains why it was impossible to take screen shots of the result: the filtered frame was never represented anywhere in the frame buffer memory. Think what you will about this weird technology, Voodoo5 benefited from not having to pay the performance cost of a filtering stage and was therefore able to produce higher frame rates while giving respectable anti-aliasing results on the screen.

GeForce3's New High-Resolution Anti-Aliasing (HRAA)

NVIDIA's new GeForce3 also uses multi-sampling anti-aliasing, and unlike the previous GeForce chips it has the whole technology for this anti-aliasing method hardwired inside. GeForce3 also creates multiple samples of a frame and stores them in a certain area of the frame buffer. Before the frame gets flipped, the HRAA engine filters the samples and stores the result in the back buffer. This makes the anti-aliased frame fully software accessible and allows screen shots.

Quincunx!

Besides the normal AA modes of 2x and 4x, GeForce3 also offers a special AA mode with the hilarious name 'Quincunx'. What a name! 'Quincunx' doesn't stand for some strange birth defect, but for a very nifty multi-sampling trick. Quincunx generates the final anti-aliased pixel by filtering five pixels of - and that's the trick - only TWO samples. The result is an anti-aliasing effect that comes close to the quality of 4x AA, but only requires the generation of two samples. 'Quincunx' is the reason why GeForce3 is indeed able to provide excellent AA performance with good AA quality. Here is how it works:

The 3D scene is rendered normally, but the Pixel Shader stores each pixel twice, in two different locations of the frame buffer. This doesn't cost more rendering power than rendering without AA, but it does need twice the memory bandwidth for the pixel write operation at the end of the pixel rendering process.

Once the last pixel of the frame has been rendered, the HRAA engine of GeForce3 virtually shifts one sample buffer half a pixel in x and y direction. This has the effect that each pixel of the 'first' sample is surrounded by four pixels of the second sample, each 1/SQRT(2) pixel (about 0.71 pixel) away from it in diagonal direction. The HRAA engine filters over those five pixels to create the anti-aliased pixel. As already said, the performance of this Quincunx AA is excellent and the quality very good.
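The five-pixel pattern can be sketched in a few lines of Python. Take this as an illustration under stated assumptions, not as NVIDIA's actual hardware logic: the function names are hypothetical, the buffers are plain grayscale lists of rows, border pixels simply reuse the nearest valid neighbor, and the filter weights (one half for the center pixel, one eighth for each of the four diagonal neighbors) are an assumption, since the text above does not give them.

```python
import math

# Assumed filter weights, not taken from the text above.
CENTER_WEIGHT = 0.5
CORNER_WEIGHT = 0.125


def diagonal_distance(shift=0.5):
    """Distance from a pixel center to a neighbor shifted in x and y."""
    return math.hypot(shift, shift)


def quincunx_filter(first, second):
    """Blend each pixel of 'first' with its four diagonal neighbors in 'second'.

    'second' is treated as shifted half a pixel in x and y, so the four
    neighbors of first[y][x] are second[y-1][x-1], second[y-1][x],
    second[y][x-1] and second[y][x]. Indices are clamped at the border.
    """
    height, width = len(first), len(first[0])
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            corners = 0.0
            for dy in (-1, 0):
                for dx in (-1, 0):
                    sy = max(0, min(y + dy, height - 1))
                    sx = max(0, min(x + dx, width - 1))
                    corners += second[sy][sx]
            row.append(CENTER_WEIGHT * first[y][x] + CORNER_WEIGHT * corners)
        result.append(row)
    return result


# A flat gray frame stays unchanged because the weights sum to one.
flat = [[100, 100], [100, 100]]
filtered_flat = quincunx_filter(flat, flat)
```

Only two samples per pixel were written to memory, yet every output pixel is built from five values. That is where the quality gain over plain 2x AA comes from in this scheme.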