Toward Faster Memory
For a long time, Nvidia led the way in adopting the latest memory technologies. After using DDR memory on its GeForce in 2000, the Santa Clara firm was the first to use GDDR2 with the GeForce FX, then GDDR3 with the GeForce FX 5700 Ultra. But for a while now, ATI has been the pioneer: GDDR4 first appeared on its Radeon X1950 XTX, and now, two years later, it is offering the first card to use GDDR5, the Radeon HD 4870.
There’s no secret when it comes to increasing memory bandwidth. There are two ways to do it: the first is to widen the data bus, and the second is to make the memory run faster. The first method comes up against numerous obstacles. A wider bus makes routing on the PCB more complex and requires more pins on the package. All those pins then have to be connected to the chip, which requires a large number of pads (the connection points placed around the periphery of the die). So a wider bus requires a die of a certain minimum size, which is one reason why entry-level GPUs were limited to 128-bit buses for a long time while their high-end equivalents used a 256-bit or 384-bit bus. A wider bus also draws more power, which raises the chip’s overall consumption.
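To put numbers on those two levers, here is a minimal sketch of the usual peak-bandwidth arithmetic: bytes per transfer (bus width divided by 8) multiplied by the number of transfers per second. The function name and the bus widths and data rates below are illustrative assumptions, not figures for any particular card.

```python
def peak_bandwidth_gb_s(bus_width_bits: int, data_rate_mt_s: float) -> float:
    """Theoretical peak bandwidth in GB/s (1 GB = 10**9 bytes).

    bus_width_bits: width of the memory bus in bits.
    data_rate_mt_s: effective data rate per pin, in millions of transfers/s.
    """
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * data_rate_mt_s * 1e6 / 1e9


# Hypothetical baseline: a 256-bit bus at 2000 MT/s.
baseline = peak_bandwidth_gb_s(256, 2000)

# Lever 1: double the bus width, keep the data rate.
wider = peak_bandwidth_gb_s(512, 2000)

# Lever 2: keep the bus width, double the data rate.
faster = peak_bandwidth_gb_s(256, 4000)

print(f"baseline: {baseline:.1f} GB/s")
print(f"wider bus: {wider:.1f} GB/s")
print(f"faster memory: {faster:.1f} GB/s")
```

On paper, both levers give the same result. The difference lies in what each one costs: pins, pads and board routing for the wider bus, and signaling speed for the faster memory.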
So it’s no wonder that this option has been used sparingly. In fact, 128-bit buses remained the norm on high-end GPUs for a long time, from the Riva 128 until the Matrox Parhelia and the ATI Radeon 9700 moved to 256 bits in 2002. In the same way, the 256-bit bus didn’t get any wider until the arrival of Nvidia’s GeForce 8800 late in 2006. And yet the bandwidth demands of GPUs are constantly increasing, despite the bandwidth-saving technologies that have been refined with each generation.
So the solution lies in running the memory faster. That’s easier said than done, however, because as with any circuit, there’s a limit to the clock frequency at which memory chips can operate. To get around that limit, manufacturers have used various tricks. DDR memory transfers data on both the rising and the falling edges of the clock signal, doubling the data rate for a given memory frequency. To do that, DDR memory uses what is called a two-bit prefetch: at each memory access, the memory array delivers two bits per data pin to the prefetch buffer instead of one. Successive developments in DDR technology have consisted of moving more and more data at a given memory frequency by widening the prefetch. DDR2 used a 4-bit prefetch, as does GDDR3. GDDR4 introduced an 8-bit prefetch.
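The effect of the prefetch width can be sketched with a simplified model in which the per-pin data rate is the memory core frequency multiplied by the number of bits prefetched per access. The 200 MHz core clock is an assumed value chosen only to make the comparison easy to read, and the function and table names are hypothetical. The prefetch widths are the ones given above.

```python
# Prefetch width (bits per data pin per access) for each memory type
# mentioned in the text.
PREFETCH_BITS = {
    "DDR": 2,
    "DDR2": 4,
    "GDDR3": 4,
    "GDDR4": 8,
}


def per_pin_data_rate_mt_s(core_clock_mhz: float, prefetch_bits: int) -> float:
    """Simplified model: each core access supplies `prefetch_bits` bits per pin,
    so the external data rate is the core clock times the prefetch width."""
    return core_clock_mhz * prefetch_bits


def core_clock_needed_mhz(target_rate_mt_s: float, prefetch_bits: int) -> float:
    """Inverse view: core clock required to sustain a target per-pin data rate."""
    return target_rate_mt_s / prefetch_bits


CORE_CLOCK_MHZ = 200.0  # assumed, for illustration only

for name, bits in PREFETCH_BITS.items():
    rate = per_pin_data_rate_mt_s(CORE_CLOCK_MHZ, bits)
    print(f"{name}: {bits}-bit prefetch -> {rate:.0f} MT/s per pin")
```

Read the other way around, a wider prefetch lets the memory core run more slowly for the same external data rate. In this model, a 1600 MT/s target needs an 800 MHz core with a 2-bit prefetch but only 200 MHz with an 8-bit prefetch.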