The Radeon HD 4000 Series
That’s far from being a dumb move on AMD’s part. Everyone knows that the bulk of sales doesn’t come from high-end cards that sell for between $623.61 and $935.39, but from “affordable” ones priced between $233.85 and $467.69. Still, it’s a risky move. Though it’s true that card makers earn most of their money with entry-level and mid-level cards, the high end acts as their technological showcase. It’s easier to sell the GeForce 8600 when the 8800 is leading in all the benchmarks than to sell Radeon HD 2600s with the HD 2900’s poor reputation sticking to your shoes, regardless of the intrinsic qualities of the mid-level cards. But before we worry about the future success of this generation for AMD, let’s take a closer look at what the architecture offers.
Radeon HD 4000
Card | HD 4850 | HD 4870 |
---|---|---|
GPU clock frequency | 625 MHz | 750 MHz |
RAM clock frequency | 993 MHz | 900 MHz |
ALUs | 800 | 800 |
Texture units | 40 | 40 |
ROPs (Raster Operation units) | 16 | 16 |
Memory controller | 256 bits (8 × 32-bit channels) | 256 bits (8 × 32-bit channels) |
RAM type | GDDR3 | GDDR5 |
A new record! With its 160 five-way VLIW shader units (800 ALUs in all), the RV770 dethrones the GT200 and its 933 Gflops to become the first GPU to pass the very symbolic bar of 1 Tflops (1 Tflops for the HD 4850 and 1.2 Tflops for the 4870). What’s really impressive is that a GPU with a die measuring barely 260 mm² achieves such figures.
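The 1 Tflops and 1.2 Tflops figures can be reproduced from the table above. The following is a minimal sketch, assuming each ALU completes one multiply-add (counted as two floating-point operations) per clock; that assumption and the function name `peak_gflops` are ours, not something the spec table states.

```python
# Theoretical peak throughput from ALU count and core clock.
# Assumption: every ALU retires one multiply-add (2 flops) per clock cycle.
FLOPS_PER_ALU_PER_CLOCK = 2


def peak_gflops(alus: int, core_clock_mhz: float) -> float:
    """Return the theoretical peak throughput in Gflops."""
    return alus * FLOPS_PER_ALU_PER_CLOCK * core_clock_mhz / 1000.0


# ALU counts and GPU clocks taken from the Radeon HD 4000 table.
cards = {"HD 4850": (800, 625), "HD 4870": (800, 750)}

for name, (alus, mhz) in cards.items():
    print(f"{name}: {peak_gflops(alus, mhz):.0f} Gflops")
```

These are peak numbers only; they say nothing about how well real shaders keep all five lanes of each VLIW unit busy.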
But the pleasant surprises don’t stop there. AMD took advantage of the new architecture (finally, we’re tempted to say) to increase the number of texture units! There are now 40 units, up from the 16 that had been the norm since the R420. Even if that’s still far from Nvidia’s 80 units, the increase is appreciable. That said, AMD hasn’t abandoned its principles: the number of texture units has increased in proportion to the number of ALUs. The RV670 had 64 processing units and the RV770 has 160, a multiplication of processing power by a factor of 2.5, while the texture units have gone from 16 on the RV670 to 40, the same factor. So AMD feels that the 4:1 ratio of arithmetic instructions to texture instructions introduced with its previous architecture was a good balance, and has used it again on this new GPU.
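As a quick check of that proportionality, the sketch below recomputes the scaling factor and the ALU-to-texture ratio from the unit counts quoted in the paragraph. The dictionary layout and helper names are illustrative choices of ours.

```python
from fractions import Fraction

# Unit counts quoted in the text for the two generations.
rv670 = {"shader_units": 64, "texture_units": 16}
rv770 = {"shader_units": 160, "texture_units": 40}


def scaling(old: dict, new: dict, key: str) -> Fraction:
    """How many times larger the new GPU is for a given unit type."""
    return Fraction(new[key], old[key])


def shader_to_texture_ratio(gpu: dict) -> Fraction:
    """Shader units available per texture unit."""
    return Fraction(gpu["shader_units"], gpu["texture_units"])


print(float(scaling(rv670, rv770, "shader_units")))   # shader scaling factor
print(float(scaling(rv670, rv770, "texture_units")))  # texture scaling factor
print(shader_to_texture_ratio(rv670), shader_to_texture_ratio(rv770))
```

Both generations come out at the same shader-to-texture ratio, which is the point the paragraph makes: the RV770 is a scaled-up design, not a rebalanced one.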
If you compare it with the competing Nvidia architecture, you see that despite the rebalancing done for the new GT200, the RV770 is still much more at ease with a high number of arithmetic operations. The ratio of processing power to the number of texels filtered on this latest GPU from AMD is 40:1, compared to approximately 20:1 for its competitor. Let’s test the differences with our usual theoretical benchmarks. (Note: The Radeon HD 4870 would have been the card best suited to the synthetic tests we use to analyze the architecture, but it was unavailable because of the sloppy handling of this launch and its release dates. We had no choice but to run the tests on the HD 4850, so the performance results are a little less flattering to AMD.)
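The 40:1 and roughly 20:1 ratios can be reproduced with a rough calculation. This is only a sketch under stated assumptions: each texture unit is taken to filter one texel per clock, and the GTX 280 inputs (933 Gflops peak, 602 MHz core clock) are commonly quoted figures that we supply ourselves, since they do not appear in the table above. The function name is hypothetical.

```python
def gflops_per_gtexel(peak_gflops: float, texture_units: int,
                      core_clock_mhz: float) -> float:
    """Arithmetic throughput available per unit of texel fill rate.

    Assumption: each texture unit filters one texel per clock cycle.
    """
    gtexels_per_second = texture_units * core_clock_mhz / 1000.0
    return peak_gflops / gtexels_per_second


# HD 4850: 1 Tflops peak, 40 texture units, 625 MHz (figures from the article).
rv770_ratio = gflops_per_gtexel(1000.0, 40, 625)

# GTX 280: 80 texture units per the article; the 933 Gflops peak and the
# 602 MHz core clock are assumptions brought in from outside the article.
gt200_ratio = gflops_per_gtexel(933.0, 80, 602)

print(f"RV770: {rv770_ratio:.1f}:1")
print(f"GT200: {gt200_ratio:.1f}:1")
```

A higher ratio means more arithmetic per filtered texel, which is why the RV770 should favor math-heavy shaders over texture-heavy ones.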
As we did with the GT200, we’ll start off nice and easy, using a version of RightMark 3D with only Pixel Shader version 2.0. The HD 4850 actually beat the GTX 280 on the PS2.a and 2.0a tests, but was less at ease with the PS2.0 ones. While it’s easy to understand that the RV770 is better suited to advanced lighting models, we expected it to do better in the procedural shader tests, where it should be able to take advantage of its enormous processing power.