GeForce 8600: DirectX 10 For The Masses

TMU Tweaking

Nvidia has not been asleep at the wheel. It was the first to introduce a DX10-capable graphics processor. Still, the question "what have you done for me lately?" is not misplaced, as we should expect a lot from the companies we pay hundreds of dollars for computer hardware. While AMD/ATI has yet to release a single DX10 card to the market, Nvidia now has its second delivery in the form of the 169 square millimeter G84 and G86 processors.

The answer to my first question is yes. The new offering delivers upgrades over the previous card. Other than the die shrink to an 80 nm process and the use of less silicon, the GeForce 8600 has two primary changes: one to the 3D engine, and a complete overhaul of Nvidia's PureVideo processing engine. Together, these advances suggest that Nvidia has specific consumer targets in mind for these products. On one hand, DX10 reaches more consumers, as the price of entry comes down by $70 for the average gaming enthusiast and by as much as $150 for the budget gamer. On the other hand, the home theater consumer gets a much better card at an attractive price.

Looking past the marketing, the core difference in the 3D engine is a "tweak" to the amount of texture processing the graphics core can deliver per clock cycle. Each texture mapping unit (TMU) on the GeForce 8800 (G80) could deliver up to four texture addresses and eight filtering operations per clock. Each TMU on the GeForce 8600 can deliver twice the number of texture addresses while maintaining the same number of filtering operations (8 and 8 vs. 4 and 8). What does this mean? G80 can deliver 64 filtering operations per clock but only 32 texture addresses per clock. G84 and G86 were built so that the texture addressing operations per clock match the filtering operations per clock.
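To make the ratio concrete, here is a minimal sketch of the per-clock arithmetic. The per-TMU rates come from the paragraph above; the TMU counts are assumptions inferred from the article's own totals (64 filtering operations / 8 per TMU = 8 for G80, and 16 texels per clock / 8 per TMU = 2 for the GeForce 8600 in the table below). The function names are ours, for illustration only.

```python
# Per-TMU rates quoted in the article (operations per clock).
G80_TMU = {"addresses": 4, "filters": 8}
G84_TMU = {"addresses": 8, "filters": 8}


def per_clock_totals(tmu, tmu_count):
    """Return (texture addresses, filtering ops) per clock for the whole chip."""
    return tmu["addresses"] * tmu_count, tmu["filters"] * tmu_count


def address_to_filter_ratio(tmu):
    """Ratio of addressing to filtering capacity for one TMU."""
    return tmu["addresses"] / tmu["filters"]


# Assumption: TMU counts inferred from the article's totals, not stated directly.
G80_TMU_COUNT = 8
G84_TMU_COUNT = 2

print(per_clock_totals(G80_TMU, G80_TMU_COUNT))  # (32, 64)
print(per_clock_totals(G84_TMU, G84_TMU_COUNT))  # (16, 16)
print(address_to_filter_ratio(G80_TMU))          # 0.5
print(address_to_filter_ratio(G84_TMU))          # 1.0
```

On G80 the addressing side tops out at half the filtering rate; on G84 the two are balanced.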

Generally, textures are two-dimensional color arrays whose values are called texture elements, or texels. Each texel has its own unique address in the texture, given by a numeric column and row value. This is similar to plotting points in Quadrant I of a Cartesian coordinate system in geometry class.

Texture coordinates are in texture space. When a texture is applied to a primitive, the texel addresses are mapped to the object. These coordinates are then translated to screen coordinates, or pixel locations. In the Direct3D API, the mapping process is actually the inverse of this: texels are mapped from texture space directly to pixels in screen space. From each pixel in screen space, the corresponding texel position in texture space is calculated, and the color at or near that point is sampled via texture filtering (linear, bilinear, trilinear or anisotropic). Consequently, building a TMU that can handle addressing and filtering at a 1:1 ratio could prove to be beneficial. (We already have some ideas for how to test whether it is, but that is beyond the scope of this article.)
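The address-then-filter sequence described above can be sketched in a few lines. This is a simplified software illustration, not how the hardware or Direct3D implements it: the function name `bilinear_sample` is hypothetical, the texture holds a single channel, coordinates are clamped at the edges, and texel centers are assumed to sit on integer positions.

```python
import math


def bilinear_sample(texture, u, v):
    """Sample a single-channel 2D texture (list of rows) at normalized (u, v).

    Simplifying assumptions: u and v are clamped to [0, 1], and texel
    centers sit on integer coordinates in texel space.
    """
    height = len(texture)
    width = len(texture[0])

    # Addressing step: turn normalized coordinates into a texel position.
    u = min(max(u, 0.0), 1.0)
    v = min(max(v, 0.0), 1.0)
    x = u * (width - 1)
    y = v * (height - 1)

    # Column/row addresses of the four surrounding texels.
    x0 = int(math.floor(x))
    y0 = int(math.floor(y))
    x1 = min(x0 + 1, width - 1)
    y1 = min(y0 + 1, height - 1)

    # Filtering step: blend the four texels by their fractional distances.
    fx = x - x0
    fy = y - y0
    top = texture[y0][x0] * (1 - fx) + texture[y0][x1] * fx
    bottom = texture[y1][x0] * (1 - fx) + texture[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


texture = [
    [0.0, 1.0],
    [1.0, 2.0],
]
print(bilinear_sample(texture, 0.0, 0.0))  # 0.0
print(bilinear_sample(texture, 0.5, 0.5))  # 1.0
print(bilinear_sample(texture, 1.0, 1.0))  # 2.0
```

The addressing step and the filtering step are separate pieces of work, which is why the balance between the two matters in a TMU.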

Below is a table containing key specifications for the existing G80 graphics processor as well as the new G84 and G86 processors.

| Specification | GeForce 8800 GTX | GeForce 8800 GTS | GeForce 8600 GTS | GeForce 8600 GT | GeForce 8500 GT | GeForce 8400 GS | GeForce 8300 GS |
|---|---|---|---|---|---|---|---|
| Processor | G80 | G80 | G84 | G84 | G86 | G86 | G86 |
| Fabrication Process | 90 nm | 90 nm | 80 nm | 80 nm | 80 nm | 80 nm | 80 nm |
| Number of Transistors (millions) | 681 | 681 | 289 | 289 | 210 | 210 | 210 |
| Core Clock (including dispatch, texture units, and ROP units) | 575 MHz | 500 MHz | 675 MHz | 540 MHz | 450 MHz | 450 MHz | 450 MHz |
| Shader Clock (Stream Processors) | 1.35 GHz | 1.20 GHz | 1.45 GHz | 1.19 GHz | 900 MHz | 900 MHz | 900 MHz |
| Stream Processors (#) | 128 | 96 | 32 | 32 | 16 | 16 | 8 |
| Memory Clock (MHz / data rate) | 900/1800 | 800/1600 | 1000/2000 | 700/1400 | 400/800 | 400/800 | 400/800 |
| Memory Interface | 384 bits | 320 bits | 128 bits | 128 bits | 128 bits | 64 bits | 64 bits |
| Memory Bandwidth | 86.4 GB/s | 64.0 GB/s | 32.0 GB/s | 22.4 GB/s | 12.8 GB/s | 6.4 GB/s | 6.4 GB/s |
| Frame Buffer Size | 768 MB | 640 MB | 256 MB | 256 MB | 256 MB | 128 MB or 256 MB | 128 MB or 256 MB |
| ROPs (#) | 24 | 20 | 8 | 8 | 8 | 4 | 4 |
| Texture Filtering Rate (texels per clock) | 64 | 48 | 16 | 16 | 8 | 8 | 8 |
| Texture Fill Rate (billions of bilinear filtered texels/sec) | 36.80 GT/s | 24.00 GT/s | 10.80 GT/s | 8.64 GT/s | 3.60 GT/s | 3.60 GT/s | 3.60 GT/s |
| HDCP Support | Yes | Yes | Yes | Optional | Optional | Optional | Optional |
| RAMDACs | 400 MHz | 400 MHz | 400 MHz | 400 MHz | 400 MHz | 400 MHz | 400 MHz |
| Bus Technology | PCI Express 1.1a | PCI Express 1.1a | PCI Express 1.1a | PCI Express 1.1a | PCI Express 1.1a | PCI Express 1.1a | PCI Express 1.1a |
| Available From | Retail | Retail | Retail | Retail | Retail | OEM | OEM |
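Two rows of the table follow directly from others: memory bandwidth is the bus width in bytes times the memory data rate, and texture fill rate is the core clock times the texels filtered per clock. The short sketch below reproduces a few of the listed figures from the table's own inputs; the function names are ours, and the formulas are the standard arithmetic rather than anything Nvidia published with these cards.

```python
def memory_bandwidth_gb_s(bus_width_bits, data_rate_mt_s):
    """Bandwidth in GB/s from bus width (bits) and data rate (MT/s)."""
    return bus_width_bits / 8 * data_rate_mt_s / 1000


def texture_fill_rate_gt_s(core_clock_mhz, texels_per_clock):
    """Bilinear fill rate in billions of texels/sec from core clock (MHz)."""
    return core_clock_mhz * texels_per_clock / 1000


# (bus width bits, data rate MT/s, core clock MHz, texels per clock), from the table.
cards = {
    "GeForce 8800 GTX": (384, 1800, 575, 64),
    "GeForce 8600 GTS": (128, 2000, 675, 16),
    "GeForce 8600 GT": (128, 1400, 540, 16),
}

for name, (bus, rate, clock, texels) in cards.items():
    bandwidth = round(memory_bandwidth_gb_s(bus, rate), 2)
    fill_rate = round(texture_fill_rate_gt_s(clock, texels), 2)
    print(f"{name}: {bandwidth} GB/s, {fill_rate} GT/s")
```

For the GeForce 8800 GTX this gives 86.4 GB/s and 36.8 GT/s, matching the table.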