Memory Bandwidth: Analysis And Summary
First, we plotted the frames per second achieved by each card. Then, we took each card's benchmark result with no anti-aliasing as its 100% reference point and expressed its results at the other settings as percentages of that baseline. This illustrates how much of their original performance the cards lose when anti-aliasing is applied and no CPU or GPU limitation is in play.
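The normalization described above is straightforward; a minimal sketch, using made-up FPS figures purely for illustration (not measured values from our benchmarks):

```python
# Hypothetical FPS results for one card (illustrative values only)
fps = {"No AA": 60.0, "2x MSAA": 52.0, "4x MSAA": 43.0, "8x MSAA": 31.0}

# The no-AA result serves as the 100% reference point
baseline = fps["No AA"]

# Express every setting as a percentage of the baseline
relative = {setting: round(100 * value / baseline, 1)
            for setting, value in fps.items()}

print(relative)
# {'No AA': 100.0, '2x MSAA': 86.7, '4x MSAA': 71.7, '8x MSAA': 51.7}
```

Because each card is normalized against its own no-AA result, the chart isolates how steeply performance drops as anti-aliasing increases, independent of each card's absolute speed.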
Performance at 1920x1080
Performance at 2560x1440
The varied benchmark results published by different sites show that, in many games, more demanding presets can largely conceal the GeForce GTX 660 and 660 Ti's bandwidth disadvantage. Because its GPU is inherently less powerful, though, the GeForce GTX 660's memory bottleneck is less likely to be felt.
How about the relationship between the GK104- and GK106-based GeForce GTX 660s? Both slot in somewhere behind AMD's Radeon HD 7870. The gap between them grows with increasingly intensive levels of anti-aliasing.
Four observations surprised us. First, we witnessed very linear scaling among AMD's Radeon boards as we increased AA settings. Second, it's impressive to see the GeForce cards hold their own, even at higher resolutions, provided they're not forced to contend with 4x MSAA or more. The third surprise was that two Radeon HD 7750s in CrossFire were able to outpace the GeForce GTX 660 and 660 Ti. Finally, we were surprised to dig up an OEM card based on a pared-down GK104 GPU, although we weren't pleased with what we found. A higher shader count doesn't compensate for a lower core clock. So, without overclocking, the board Nvidia is selling to OEMs turns out to be slower than retail cards with the same name.
So what does this tell us?
Nvidia's GeForce GTX 660 is most competitive when its GPU is taxed, deemphasizing its narrower memory interface. At stock settings, the OEM-only GK104-based GeForce GTX 660 is considerably slower than the GK106-based retail model. Not surprisingly, then, we have to take serious issue with Nvidia's naming scheme: buying a tier-one box with a GeForce GTX 660 inside may stick you with a much less powerful product, with no intuitive way to tell the two apart.
Does the same handicap apply in compute-oriented workloads?