We’ve remained quiet on this issue until now because we wanted to be absolutely clear about the situation before weighing in. I love to use automotive metaphors, so I’m going to apply one here to describe this situation because I think it helps put the whole mess in perspective:
You’re a muscle-car buff and you decide to test drive the new 2015 Dodge Charger Hellcat. The car is advertised as having a supercharged 6.2-liter Hemi V8 with 24 valves that produces 707 horsepower at 6,000 RPM. It’s one of the most powerful cars you can buy for the dollar, sprinting from 0-60 MPH in the mid-three-second range and covering the quarter-mile in under 12 seconds. You take it for a test drive, you fall in love with the car, and you buy it. In the months that follow, you remain quite pleased with your purchase and the performance the car provides.
It later comes out that Dodge made a mistake on its marketing materials: the engine has 16 valves, not 24. It still produces 707 horsepower at 6,000 RPM though, and it still offers the same amazing road performance that it did the day you bought it. It’s still one of the fastest cars you could purchase for the dollar. But you can no longer say you own a 24-valve V8.
It’s upsetting. But does it make the Charger Hellcat a worse car than it was before you found out? Practically, no, as performance is unchanged. But it leaves a bad taste in your mouth. For Dodge, it would be a PR nightmare. Some Charger Hellcat owners would feel lied to, despite the fact that their car is every bit as fast as they expected it to be in the first place.
This is essentially the kind of misrepresentation scenario that Nvidia is dealing with. Introducing the actual GeForce GTX 970 specifications:
| | GeForce GTX 980 | GeForce GTX 970 (actual) | GeForce GTX 970 (originally reported) |
|---|---|---|---|
| GPU | GM204 (Maxwell) | GM204 (Maxwell) | GM204 (Maxwell) |
| L2 Cache | 2 MB | 1.75 MB | 2 MB |
| Core Clock | 1126 MHz | 1050 MHz | 1050 MHz |
| Memory Clock | 1750 MHz GDDR5 | 1750 MHz GDDR5 | 1750 MHz GDDR5 |
| Memory Bandwidth | 224 GB/s | 196 GB/s (3.5 GB) / 28 GB/s (512 MB) | 224 GB/s |
| Max. TDP | 165 Watts | 145 Watts | 145 Watts |
| Aux. Power Connectors | 2x six-pin PCIe | 2x six-pin PCIe | 2x six-pin PCIe |
This issue reared its head when some users noted that in certain cases the GeForce GTX 970 reported 3.5GB of graphics memory despite being sold as a 4GB card. It turns out that this is a symptom of memory segmentation, as the card splits the 4GB into a 3.5GB high-priority segment and a 512MB low-priority segment. Yes, the card has access to a full 4GB of RAM. It is accessed differently and with different theoretical bandwidth, but it's all there and available. It remains to be seen if this technique causes any notable performance detriment in a real-world scenario, but from what we've seen so far, it does not.
But why is there an odd split in memory resources in the first place? It turns out, this is a clue that hints at the real issue: Nvidia’s incorrect reporting of the GeForce GTX 970’s technical specifications and GM204 GPU resources.
Note that one of the four ROP partitions is not fully enabled; it is partially disabled. That partially disabled partition is responsible for the strange 3.5GB/512MB memory split, because memory controller resources are linked through the ROP partitions. With one-eighth of those resources disabled, one-eighth of the memory (512MB) must be accessed in a special way, through the working half of the affected partition. In this way, all 4GB of RAM is usable, although segmented: the 3.5GB portion can be accessed with 196 GB/s of bandwidth, while the 512MB portion has 28 GB/s of bandwidth available. There are probably particular situations that will expose this weakness, but it is surprisingly difficult to create such a scenario. Now that we know the actual specifications we'll keep an eye out for it, but none of this new information invalidates the benchmark results we've already collected.
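The split falls straight out of the one-in-eight disablement. Here's a quick back-of-the-envelope check (illustrative arithmetic only, using the figures quoted above, not any Nvidia tool):

```python
# One of eight 32-bit memory channels sits behind the partially disabled
# ROP partition, so the full-speed segment gets 7/8 of the card's memory
# and bandwidth, and the slow segment gets the remaining 1/8.
TOTAL_MEMORY_GB = 4.0
TOTAL_BANDWIDTH_GBS = 224.0   # the originally reported peak figure
CHANNELS = 8                  # eight 32-bit channels on the 256-bit bus
IMPAIRED_CHANNELS = 1         # the channel behind the disabled partition

fast_fraction = (CHANNELS - IMPAIRED_CHANNELS) / CHANNELS

fast_segment_gb = TOTAL_MEMORY_GB * fast_fraction         # 3.5 GB
slow_segment_gb = TOTAL_MEMORY_GB - fast_segment_gb       # 0.5 GB (512 MB)
fast_bandwidth = TOTAL_BANDWIDTH_GBS * fast_fraction      # 196 GB/s
slow_bandwidth = TOTAL_BANDWIDTH_GBS - fast_bandwidth     # 28 GB/s

print(f"{fast_segment_gb:.1f} GB @ {fast_bandwidth:.0f} GB/s")
print(f"{slow_segment_gb:.1f} GB @ {slow_bandwidth:.0f} GB/s")
```

The numbers land exactly on the 196 GB/s and 28 GB/s figures in the corrected spec table.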
In addition, an eighth of the L2 cache is not used. As a result, there are 56 functional ROPs in the GeForce GTX 970, and the chip has access to 1.75 MB of L2 cache. This is less than the 64 ROPs and 2 MB of L2 that Nvidia originally indicated in its press materials.
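The same one-eighth reduction accounts for the corrected ROP and L2 figures (again, just illustrative arithmetic from the numbers above):

```python
# Apply the one-in-eight disablement to the originally advertised
# ROP count and L2 cache size.
ROPS_ADVERTISED = 64
L2_ADVERTISED_MB = 2.0
ENABLED_FRACTION = 7 / 8   # one of eight slices is disabled

rops_actual = int(ROPS_ADVERTISED * ENABLED_FRACTION)   # 56 ROPs
l2_actual_mb = L2_ADVERTISED_MB * ENABLED_FRACTION      # 1.75 MB

print(rops_actual, l2_actual_mb)  # 56 1.75
```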
How Did This Happen?
The company claims this is the result of a misunderstanding on the part of the technical marketing team, which was not aware of the partially-enabled ROP cluster, and that the issue was not identified internally until this month when people began digging into the memory issue.
It’s times like these that conspiracy theories begin to fly. In this case it’s understandable: how does a highly technical team of GPU experts, one that works directly for the chipmaker, make such a huge mistake?
I’ve never been a conspiracy theorist: I believe human greed and stupidity are simple explanations for all that’s wrong in the world, and that we don’t need to concoct a shadowy Illuminati organization to take the blame. I also don’t see much motive for Nvidia to open itself up to this kind of PR nightmare in the first place. There is absolutely no advantage or benefit to doing so. Occam's razor points to a screw-up, plain and simple.
It does surprise me that none of Nvidia's engineers spotted this major snafu after reading a few launch reviews. But if you were that engineer, would you even want to report that a huge product launch suffered from flubbed specs? If you were aware of the inconsistency, would you offer the information to the world after a successful launch was behind you, or keep it to yourself and hope that the press would never find out?
Does It Matter?
I’m not omniscient. I don’t know if Nvidia knew about this and chose to keep it close to its chest, or if the company found out with the rest of us last Friday. But I have no good reason to believe the company is lying. To tell the truth, from a practical standpoint, I’m not sure if it matters.
That’s not to say I think it’s OK to be fed misinformation about GPU specifications. This is a subject I am particularly passionate about. It’s important for the technical press to have the right information at its disposal when analyzing hardware and deciphering what goes on under the hood. Nvidia needs to work hard to make sure this kind of mistake never happens again.
But from a purely practical standpoint, this doesn’t really change anything for the end user. The GeForce GTX 970 remains one of the best graphics card buys on the market. It performs the same way it did at launch — which is really good. As such, we will continue to recommend it until there is a better-performing option for the price.
We can empathize with buyers who feel betrayed, though. Nvidia definitely has some mind-share to earn back. But to us the price/performance ratio trumps everything else, and that is no different today than it has been since the GeForce GTX 970 was released.