Over-clocking the GeForce256


It all started with the 3Dfx Voodoo chip, when performance-hungry users found out that a 3D-chip can be overclocked just like a processor. The idea of overclocking is evidently to improve performance, and this is what was achieved by altering the clock speed of Voodoo, Voodoo2, Rendition V2k, TNT and many more. The last generation of 3D-chips was even officially distinguished by different clock speeds, so overclocking 3D-chips has become an almost common thing. With this experience in mind, it comes as no surprise that NVIDIA's new GeForce256 would be the next overclocking victim, and we at Tom's Hardware decided not only to crank up the core and memory clock of the chip, but also to evaluate the effects of those procedures in detail.


A - Memory Bandwidth

Last week you learned from our article, Full Review NVIDIA's new GeForce256 'GPU', that GeForce comes in two flavours, equipped either with conventional 'SDR' ('single data rate') or with 'DDR' ('double data rate') memory. The latter can transfer data on the rising as well as on the falling edge of the memory clock, which results in twice the data bandwidth of SDR-RAM at the same clock speed. GeForce's data path to its local memory is 128 bits wide, the same as in most other mainstream 3D-chips today. 128 bits are 16 bytes; the SDR-board has a memory clock of 166 MHz and the DDR-board runs the memory at 150 MHz (doubled to '300 MHz' in official papers, so that everyone understands that it's DDR). Thus the SDR-board has a memory bandwidth of 16 bytes x 166 million/s = 2.66 GB/s, and the DDR-board offers no less than 16 bytes x 150 million/s x 2 = 4.8 GB/s. Those two numbers stand against a memory bandwidth of 2.9 GB/s for TNT2-Ultra and 3.2 GB/s for G400. You can see that the SDR-GeForce has less memory bandwidth than TNT2-Ultra or G400, which is rather strange, considering that it's a new-generation 3D-chip. It is also pretty logical that overclocking the memory on a GeForce w/SDR should improve its performance; otherwise NVIDIA wouldn't equip this 3D-chip with DDR-memory as well. We wondered whether only GeForce w/SDR would benefit from overclocking its memory, or whether even the board equipped with DDR-memory would still show improved performance after we increased its memory bandwidth beyond 4.8 GB/s.
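For readers who want to redo the bandwidth arithmetic for other clock speeds, here is a minimal sketch in Python. The function name and its parameters are our own invention for illustration; only the bus width and the two memory clocks come from the text above, and 1 GB/s is taken as 1000 MB/s.

```python
def memory_bandwidth_gb_s(bus_width_bits, clock_mhz, transfers_per_clock=1):
    """Peak memory bandwidth in GB/s (1 GB/s = 1000 MB/s).

    transfers_per_clock is 1 for SDR and 2 for DDR, because DDR moves
    data on both the rising and the falling edge of the clock.
    """
    bytes_per_transfer = bus_width_bits // 8      # 128 bits -> 16 bytes
    return bytes_per_transfer * clock_mhz * transfers_per_clock / 1000


# Figures from the article: 128-bit bus, 166 MHz SDR, 150 MHz DDR.
sdr = memory_bandwidth_gb_s(128, 166)
ddr = memory_bandwidth_gb_s(128, 150, transfers_per_clock=2)

print(f"GeForce SDR: {sdr:.3f} GB/s")   # 2.656 GB/s
print(f"GeForce DDR: {ddr:.3f} GB/s")   # 4.800 GB/s
```

Plugging in a higher memory clock shows directly how much peak bandwidth an overclock buys; whether that turns into frame rate is a separate question.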

B - Rendering Pipeline

GeForce's core clock is currently 'only' 120 MHz. This is a lot less than what we are used to in high-performance 3D-chips, but we should not forget that GeForce is a lot more complex than its predecessors, especially due to its integrated T&L-engine. GeForce has twice as many rendering pipelines as the last generation of high-end 3D-chips, which is another reason why 120 MHz might be good enough for the time being. Let's have a quick look at the numbers: 4 pipelines that can each render one single-textured pixel per clock, running at 120 MHz, result in a theoretical maximum fill rate of 480 Mpixels/s. Expressed in the (rather stupid) unit 'Mtexels/s', the figure is identical, also 480. Please spare me from explaining why; the unit 'texels/s' was invented by 3Dfx to market their Voodoo2-chip and the world would do a lot better without it. Just keep in mind that 'texels/s' is only for marketing guys or other people who talk a lot without much knowledge; technicians and engineers detest it.
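The fill rate arithmetic is simple enough to sketch in a few lines of Python. The function name is hypothetical, and the overclocked core speeds in the loop are purely illustrative values, not settings taken from our tests; only the 4 pipelines and the 120 MHz stock clock come from the text.

```python
def fill_rate_mpixels_s(pipelines, core_mhz, pixels_per_clock=1):
    """Theoretical peak fill rate in Mpixels/s.

    Assumes every pipeline delivers pixels_per_clock single-textured
    pixels on every clock cycle, i.e. the pipelines never stall.
    """
    return pipelines * pixels_per_clock * core_mhz


stock = fill_rate_mpixels_s(4, 120)     # 480 Mpixels/s at the stock clock

# Hypothetical overclocked core speeds, for illustration only.
for core_mhz in (120, 130, 140, 150):
    rate = fill_rate_mpixels_s(4, core_mhz)
    gain = (rate / stock - 1) * 100
    print(f"{core_mhz} MHz: {rate} Mpixels/s ({gain:+.1f}% vs. stock)")
```

The theoretical gain scales linearly with the core clock, so it is an upper bound on what an overclock can deliver.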

Anyway, 480 Mpixels/s is not quite as much as we would expect from a ground-breaking new 3D-chip, so we should expect GeForce's performance to improve once the core clock is raised above 120 MHz. Having said that, I'd like to remind you that a 3D-pipeline can only render a pixel per clock if the data supplied by the geometry engine as well as by the local memory is there and ready. GeForce's pipeline could be nastily stalled by its memory bandwidth, particularly in the case of the SDR-board, so that we might never even see the 480 Mpixels/s. In that case we can raise the core clock as much as we want and we won't get much of a performance increase at all.
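To make the stalling argument concrete, here is a deliberately crude toy model in Python: the achievable fill rate is taken as the smaller of what the core can render and what the memory can feed. The function, its parameters and above all the figure of 8 bytes of memory traffic per pixel are our own assumptions for illustration; real traffic depends on colour depth, Z-buffer accesses, texture caching and more, so don't read the results as predictions.

```python
def effective_fill_rate(core_mhz, bandwidth_mb_s, pipelines=4, bytes_per_pixel=8):
    """Toy bottleneck model, result in Mpixels/s.

    bytes_per_pixel is a made-up illustrative value, NOT a measured one.
    """
    core_limit = pipelines * core_mhz                 # what the core can render
    memory_limit = bandwidth_mb_s / bytes_per_pixel   # what the memory can feed
    return min(core_limit, memory_limit)


SDR_MB_S = 16 * 166        # 2656 MB/s, SDR board (128-bit bus at 166 MHz)
DDR_MB_S = 16 * 150 * 2    # 4800 MB/s, DDR board (128-bit bus at 150 MHz, DDR)

# 120 MHz is the stock core clock; 150 MHz is a hypothetical overclock.
for name, bandwidth in (("SDR", SDR_MB_S), ("DDR", DDR_MB_S)):
    for core_mhz in (120, 150):
        rate = effective_fill_rate(core_mhz, bandwidth)
        print(f"{name} board, core at {core_mhz} MHz: {rate:.0f} Mpixels/s")
```

Under these assumptions the SDR board stays memory-bound no matter how far the core is pushed, while the DDR board still gains from a higher core clock. That is the pattern to look for in real measurements.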

C - T&L-engine

We are talking about GeForce, the ground-breaking new 'GPU', so we should not forget that overclocking the core will also improve the speed of the transform and lighting engine. The question is whether we will see any improvement in any of the current game applications. Even if the T&L-engine is used by the game, we still don't know whether the rendering pipeline, the memory bandwidth or the T&L-engine is limiting the game's frame rate. We will try to shed some light on this complex issue with this article.