Fillrate Tester Results
The Geometry Shading performance of previous Nvidia Direct3D 10 GPUs wasn’t especially impressive, due to undersized internal buffers. Remember that, according to Direct3D 10, a Geometry Shader can generate up to 1,024 single-precision floating-point values per incoming vertex. With significant geometry amplification, these buffers were quickly saturated, which stalled the units until space was freed. With the GT200, the size of these buffers has been multiplied by six, noticeably increasing performance in certain cases, as we’ll see. To make the most of the larger buffers, Nvidia also had to rework the scheduling of Geometry Shading threads.
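To get a feel for why amplification saturates these buffers, here is a rough back-of-the-envelope sketch in Python. Only the 1,024-value limit and the factor of six come from the text above. The buffer size used here is purely hypothetical, since Nvidia has not published the real figure, and the function names are ours.

```python
FLOAT_BYTES = 4               # one single-precision value
MAX_GS_OUTPUT_FLOATS = 1024   # Direct3D 10 limit quoted in the text


def max_output_bytes(floats_per_invocation=MAX_GS_OUTPUT_FLOATS):
    """Bytes a single Geometry Shader invocation may need to write."""
    return floats_per_invocation * FLOAT_BYTES


def invocations_in_flight(buffer_bytes, floats_per_invocation):
    """How many invocations' worth of output fit in the buffer at once."""
    return buffer_bytes // max_output_bytes(floats_per_invocation)


# ASSUMPTION: 16 KiB is an illustrative value, not the real buffer size.
HYPOTHETICAL_BUFFER_BYTES = 16 * 1024

for floats in (64, 256, 1024):
    before = invocations_in_flight(HYPOTHETICAL_BUFFER_BYTES, floats)
    after = invocations_in_flight(6 * HYPOTHETICAL_BUFFER_BYTES, floats)
    print(f"{floats:>4} floats out: {before:>3} in flight -> {after:>3} with a 6x buffer")
```

At the worst case of 1,024 values, a single invocation needs 4 KB of output space, so a small buffer holds only a handful of them at once. Whatever the real buffer size is, multiplying it by six multiplies the amount of work that can be kept in flight by the same factor.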
On the first shader, Galaxy, the improvement was very moderate: 4%. With Hyperlight, on the other hand, it was no less than 158%, evidence of the potential gain with this type of shader, everything depending on how the shader is implemented and how demanding it is (the number of floating-point values generated per incoming vertex). The GTX 280 has thus closed the gap and edges out the 3870 X2 on this same shader.
Now let’s look at the Rightmark 3D Point Sprites test (in Vertex Shading 2.0).
Why are we talking about this test in the section on Geometry Shaders? Simply because since Direct3D 10, point sprites are handled by Geometry Shaders, which explains the doubling of performance between the 9800 GTX and the GTX 280!
Nvidia has also optimized several other aspects of its architecture. The post-transform cache has been enlarged. The role of this cache is to store the results of the vertex shaders so that the same vertex doesn’t have to be transformed several times when using indexed primitives or triangle strips. Thanks to the increase in the number of ROPs, Early-Z rejection performance has also improved: the GT200 can reject up to 32 hidden pixels per cycle before the pixel shader is applied. Finally, Nvidia says it has optimized the communication of data and commands between the driver and the front end of the GPU.
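The benefit of the post-transform cache described above can be illustrated with a small simulation. This is a minimal sketch that assumes a simple FIFO replacement policy and made-up cache sizes. Nvidia doesn't document the real policy or capacity, and the function names are ours.

```python
from collections import deque


def count_vertex_shader_runs(indices, cache_size):
    """Count vertex shader executions for an index stream.

    ASSUMPTION: the cache is modelled as a plain FIFO of vertex indices;
    real hardware may use a different policy.
    """
    cache = deque(maxlen=cache_size)  # oldest entry drops out when full
    runs = 0
    for idx in indices:
        if idx in cache:
            continue          # hit: the transformed vertex is reused
        runs += 1             # miss: the vertex shader has to run
        cache.append(idx)
    return runs


def quad_strip_indices(quads):
    """Indexed triangle list for a row of quads, two triangles per quad."""
    indices = []
    for i in range(quads):
        a, b, c, d = 2 * i, 2 * i + 1, 2 * i + 2, 2 * i + 3
        indices += [a, b, c, c, b, d]
    return indices


mesh = quad_strip_indices(4)   # 24 indices referencing 10 unique vertices
print("no cache:      ", count_vertex_shader_runs(mesh, 0))
print("4-entry cache: ", count_vertex_shader_runs(mesh, 4))
```

On this small mesh the vertex shader runs 24 times without a cache and only 10 times with a 4-entry cache, once per unique vertex. Real meshes reference vertices in a less orderly way, which is where a larger cache helps.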