24 Pipelines of Power! NVIDIA 7800 GTX

NVIDIA Gets Shady With CineFX 4.0

From a layman's point of view, the block diagrams of the G70's data paths do not look that different from those of the NV40, other than the extra shading units. A closer look, however, makes it clear that the G70 was designed to handle more mathematical calculations per clock.

Part of the goal of the 7800 GTX was to raise minimum frame rates so that the gaming experience is consistently fun, not just some of the time. To accommodate the higher overhead of many of today's and tomorrow's games, additional Arithmetic Logic Units (ALUs) were required to boost performance in the 3D rendering pipeline.

From this block diagram, you can see where the ALUs were added to the pixel shading pipeline. Each mini-ALU supports multiply-add (MAD) instructions. NVIDIA claims that vertex shader performance has increased "up to 30%" in scalar ops because of the single-cycle MADs. With each clock, four floating-point MADs can be performed at full speed.
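To make the multiply-add arithmetic concrete, here is a minimal Python sketch of a four-component MAD. It only illustrates the math; the hardware issues this as a single instruction, and the function name `mad4` is made up for this example.

```python
def mad4(a, b, c):
    """Component-wise multiply-add on 4-component vectors: a * b + c."""
    return tuple(x * y + z for x, y, z in zip(a, b, c))

# Each MAD is one multiply plus one add, so it counts as 2 flops.
FLOPS_PER_MAD = 2
VECTOR_WIDTH = 4  # a Vector-4 unit works on four components at once

# Four MADs per clock therefore amount to 8 flops per clock.
flops_per_vector_mad = FLOPS_PER_MAD * VECTOR_WIDTH

result = mad4((1.0, 2.0, 3.0, 4.0), (2.0, 2.0, 2.0, 2.0), (1.0, 1.0, 1.0, 1.0))
print(result, flops_per_vector_mad)
```

This is where the "4 MAD / 8 flops" figure for a vector unit comes from: four components, two floating-point operations each.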

| | Pixel Shader | Vertex Shader |
|---|---|---|
| Architecture | 2x Vector-4 + Scalar + Norm | Vector-4 + Scalar |
| Vector | 4 MAD / 8 flops | |
| Scalar | 2 flops | |
| Instructions / ALU | 5 | 2 |
| Operations / ALU | 10 | 5 |
| Flops / ALU | 27 | 10 |
| Instructions / Clock | 120 | 16 |
| Operations / Clock | 240 | 40 |
| Flops / Clock | 648 | 80 |
| Clock Frequency | 430 MHz | 430 MHz |
| Instructions / Second | 51.6B | 6.88B |
| Operations / Second | 103.2B | 17.2B |
| Floating point operations / Second | 278.6B | 34.4B |
| Bilinear Filtered Textures per clock | 24 | |
| Bilinear Texel Fill Rate | 10.3B | |
| Texture Bandwidth (FB + PCI-E) | 44.4 GB/s | |
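The per-clock and per-second rows in the table follow from the per-ALU figures by simple multiplication. The Python sketch below reproduces them under two assumptions: 24 pixel pipelines (from the card's headline spec) and 8 vertex units (inferred from 16 instructions per clock divided by 2 per ALU). The names `UNITS`, `throughput`, and `texel_fill_rate` are made up for this example.

```python
CLOCK_HZ = 430e6  # 430 MHz core clock, from the table

# Per-ALU figures come from the table; the unit counts are assumptions
# (24 pixel pipelines, 8 vertex units inferred from the per-clock rows).
UNITS = {
    "pixel":  {"count": 24, "instructions": 5, "operations": 10, "flops": 27},
    "vertex": {"count": 8,  "instructions": 2, "operations": 5,  "flops": 10},
}

METRICS = ("instructions", "operations", "flops")

def throughput(unit):
    """Return (per_clock, per_second) dicts for one shader type."""
    per_clock = {m: unit["count"] * unit[m] for m in METRICS}
    per_second = {m: v * CLOCK_HZ for m, v in per_clock.items()}
    return per_clock, per_second

def texel_fill_rate(textures_per_clock=24):
    """Bilinear texels per second: textures per clock times clock rate."""
    return textures_per_clock * CLOCK_HZ

for name, unit in UNITS.items():
    per_clock, per_second = throughput(unit)
    print(name, per_clock, {m: f"{v / 1e9:.2f}B" for m, v in per_second.items()})
print(f"texel fill rate: {texel_fill_rate() / 1e9:.2f}B")
```

Running the numbers this way shows that the table's headline figures are theoretical peaks: they assume every unit issues its maximum instruction count on every clock.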

Additionally, NVIDIA says it has made many other improvements throughout the lower-level pipeline. Two of these show up directly in benchmark scores and real-world performance: lower latencies and greater computational capability per clock. If everything works together with less delay, the whole system benefits.