
GF100: The Chip Based On Fermi

Nvidia’s GF100: Graphics Architecture Previewed, Still No Benchmarks

The dichotomy between AMD’s and Nvidia’s respective design philosophies persists in 2010.

The former stands by its “sweet spot” strategy, whereby a reasonably-sized GPU (if you can call a 2.15 billion transistor chip reasonable) serves to address what we’d call the high-end market, while derivatives cover the price segments below. Addressing the more demanding enthusiast community involves multi-GPU configurations—this generation’s example is the dual-GPU Radeon HD 5970.

Meanwhile, Nvidia has another behemoth on its hands. Though the two companies almost certainly count transistors differently, GF100 is said to consist of more than three billion of them, up from the GT200’s 1.4 billion. There’s no word yet on how Nvidia plans to implement lower-cost versions of its Fermi architecture—all of the details being released now center on one specific chip—but as you’ll see, the design is deliberately modular. So, whereas all of the GeForce GTX 200-series boards employed one (expensive) GPU, there’s a better chance that this time we’ll see Nvidia do some cutting on lower-end versions.

As with ATI’s Radeon HD 5000-series cards, Nvidia employs TSMC’s 40nm manufacturing process, which has thus far struggled to reach the yield levels needed for AMD to meet its demand. It’ll be interesting to see if the fab’s teething pains affect Nvidia similarly.

And given Nvidia’s cautionary note on power, it’s a fairly safe bet that dual-GPU versions a la GeForce GTX 295 will make way for dual-card SLI configurations instead. Not that we expect Nvidia to need a card with two GF100s on-board. Should the company achieve ~2x the performance of GeForce GTX 285 in today’s games (and it’d appear that, given improvements to texturing/AA, GF100 will see scenarios in excess of a 2x boost), it’ll already be competing against Radeon HD 5970 using one graphics processor.

The Building Blocks

So, why exactly might we suspect GF100 of outperforming its predecessor by such a compelling margin? It’s largely a matter of comparing architectures. Fortunately, the GF100 design is derived from GT200, which itself was derived from the almost-infamous G80/G92. If you’re already familiar with Nvidia’s previous-generation designs, understanding its latest should be somewhat straightforward.

The fundamental building block remains the stream processor, now marketed as a CUDA core. GF100 boasts 512 of these CUDA cores versus GT200’s 240. Thus, clock for clock, we’re looking at the potential for 2.13x the performance of GeForce GTX 285, assuming no other optimizations. However, Nvidia was aware of GT200’s weaknesses in designing GF100, and it claims those have been addressed here with a bit of architectural shuffling. In practice, Nvidia says it’s seeing performance in today’s titles roughly two times higher than GT200 with 8x AA enabled.
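That clock-for-clock figure is simple arithmetic; here is a minimal sketch using only the core counts cited above (variable names are our own):

```python
# Shader (CUDA core) counts cited in the article.
gt200_cores = 240  # GeForce GTX 285
gf100_cores = 512  # full GF100

# Potential clock-for-clock throughput ratio, assuming no other optimizations.
ratio = gf100_cores / gt200_cores
print(f"{ratio:.2f}x")  # 2.13x
```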

Four GPCs in a GF100, each with four SMs

The GPC

GT200 sports 10 of what Nvidia calls Texture Processing Clusters (TPCs), each armed with three Streaming Multiprocessors (eight stream processors apiece) and eight texture address/filtering units. That fundamental organization evolves this time around into a more elegant collection of resources, from a fixed-function raster engine to as many as four of those Streaming Multiprocessors.

These blocks of logic are divided into Graphics Processing Clusters (GPCs), displacing the TPC concept by integrating functionality that previously existed outside the TPC. Now, one GPC is armed with its own raster engine interfacing with up to four SMs, each SM sporting 32 CUDA cores and four dedicated texture units (along with what Nvidia claims as dual schedulers/dispatchers and 64KB of configurable cache/shared memory). GF100, in its fully-operational Death Star configuration, is equipped with four GPCs.

The SM: 16 of these in a full GF100

By the numbers, GT200 actually has more texturing units than GF100 (eight per TPC, up to 10 TPCs per GPU versus four texturing units per SM, with up to 16 SMs). However, the focus here is on augmented efficiency: each texture unit computes one address and fetches four samples per clock. As a result, GF100 achieves higher real-world performance, according to Nvidia.
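The raw counts work out as follows; a quick sketch based only on the per-unit figures above (GT200’s per-unit sample rate isn’t given here, so only GF100’s is computed):

```python
# Texture unit counts cited in the article.
gt200_tex_units = 8 * 10  # eight per TPC, up to ten TPCs = 80
gf100_tex_units = 4 * 16  # four per SM, up to sixteen SMs = 64

# GF100's stated per-unit rate: one address computed, four samples fetched per clock.
gf100_samples_per_clock = gf100_tex_units * 4  # 256

print(gt200_tex_units, gf100_tex_units, gf100_samples_per_clock)  # 80 64 256
```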

Scheduling Via GigaThread

The GPCs are fed by Nvidia’s GigaThread Engine. Made kid-friendly by Nvidia’s marketing team, the engine is GF100’s scheduler, responsible for assigning work to each of the chip’s 16 SMs. Yet it establishes itself as a significant component of the Fermi architecture due to its ability to create and dispatch thread blocks in parallel, rather than the one-kernel-at-a-time approach taken before.

Of course, the GigaThread engine fetches its data from the frame buffer. At first blush, the six 64-bit memory controllers (384 bits in total) seem narrower than GT200’s eight 64-bit controllers (512 bits in total). However, Nvidia is utilizing GDDR5 this time around, yielding a substantial bandwidth increase despite the less-complex interface. Assuming the same 1,200 MHz DRAMs AMD uses on its Radeon HD 5870, a GF100-based card would have access to 230.4 GB/s of throughput versus the Radeon’s 153.6 GB/s.
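Those bandwidth figures follow from the interface width and GDDR5’s quad data rate; a minimal sketch, assuming the 1,200 MHz DRAM clock cited above (the helper function name is our own):

```python
def gddr5_bandwidth_gb_s(bus_width_bits: int, dram_clock_mhz: int) -> float:
    """GDDR5 moves four bits per pin per DRAM clock (quad data rate)."""
    bytes_per_clock = bus_width_bits / 8 * 4
    return bytes_per_clock * dram_clock_mhz / 1000  # GB/s

print(gddr5_bandwidth_gb_s(384, 1200))  # 230.4 (hypothetical GF100 board)
print(gddr5_bandwidth_gb_s(256, 1200))  # 153.6 (Radeon HD 5870)
```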

ROP Performance

The back-end of GF100 is organized into six ROP partitions, each able to output eight 32-bit integer pixels at a time. This compares favorably to GT200’s eight partitions capable of four pixels per clock. Nvidia maintains one 64-bit memory controller per partition, but realizes an overall increase from 32 pixels per clock to 48. Perhaps you noticed, in our Radeon HD 5870 coverage, the improvements to ATI’s anti-aliasing performance over its previous-generation hardware. Meanwhile, the GT200-based GeForce GTX 285 took a more substantial hit as you cranked up AA.
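The per-clock pixel totals above are just the partition counts multiplied out; a quick sketch with the article’s numbers:

```python
# ROP throughput figures cited in the article.
gf100_rop_pixels = 6 * 8  # six partitions x eight 32-bit integer pixels = 48
gt200_rop_pixels = 8 * 4  # eight partitions x four pixels = 32

print(gf100_rop_pixels, gt200_rop_pixels)  # 48 32
```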

On the right: Nvidia's 32x CSAA with eight multi-samples and 24 coverage samples.

This is another area where Nvidia sought to improve with GF100. If you own a card like ATI’s Radeon HD 5870 or are planning to buy something based on GF100, and are running on one display, then you’re enabling whatever detail settings you can in order to utilize the GPU’s massive performance. To this end, GF100 supports a new 32x coverage sampling anti-aliasing (CSAA) mode that Nvidia demonstrated smoothing out banding issues in foliage generated using alpha textured billboards in Age of Conan. And as a result of its optimizations, Nvidia is claiming a less-than 10% performance hit in going from 8x multi-sampling to 32x CSAA.
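For reference, the “32x” in 32x CSAA is simply the sum of the two sample types named in the screenshot caption; a trivial sketch:

```python
multisamples = 8       # full color/depth samples
coverage_samples = 24  # coverage-only samples
csaa_mode = multisamples + coverage_samples
print(f"{csaa_mode}x CSAA")  # 32x CSAA
```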

Top Comments
  • 26 Hide
    decembermouse , January 18, 2010 3:34 AM
    I feel like you left some info out, whether you just never read it or didn't mention it for fear of casting doubts on GF100... I've heard (and this isn't proven) that they had to remove some shaders and weren't able to reach their target clocks even with this revision (heard the last one didn't cut the mustard which is why they're hurrying the new one along and why we have to wait till March). Also, be careful about sounding too partisan with Nvidia before we have more concrete info on this.

    And yes, it does matter that AMD got DX11 hardware out the gate first. Somehow, when Nvidia wins at something, whether that's being first with a technology, having the fastest card on the market, or a neato feature like Physx, it's a huge deal, but when AMD has a win, it's 'calm down people, let's not get excited, it's no big deal.' The market and public opinion, and I believe even worth of the company have all been significantly boosted by their DX11 hardware. It is a big deal. And it'll be a big deal when GF100 is faster than the 5970 too, but they are late. I believe it'll be April before we'll realistically be able to buy these without having to F5 Newegg every 10 seconds for a week, and in these months that AMD has been the only DX11 player, well, a lot of people don't want to wait that long for what might be the next best thing... all I'm trying to say is let's try not to spin things so one company sounds better. It makes me sad when I see fanboyism, whether for AMD, Intel, Nvidia, whoever, on such a high-profile review site.
  • 25 Hide
    cangelini , January 18, 2010 3:44 AM
    dingumf: Oh look, no benchmarks.


    *Specifically* mentioned in the title of the story, just to avoid that comment =)
  • 24 Hide
    randomizer , January 18, 2010 3:05 AM
    GF100 is entering the ranks of Duke Nukem Forever. We keep seeing little glimpses but the real thing might as well not exist.
Other Comments
  • 23 Hide
    duckmanx88 , January 18, 2010 3:29 AM
    dingumf: Oh look, no benchmarks.


    wth is he supposed to benchmark? Nothing has been released; it's just an article giving us details on what we can expect within the next two months.
  • 8 Hide
    megamanx00 , January 18, 2010 3:37 AM
    Well, not much new here. I wouldn't really be surprised if the 2x performance increase over the GTX285 was a reality. Still, the question is if this new card will be able to maintain as sizable a performance lead in DX11 games when Developers have been working with ATI hardware. If this GPU is as expensive to produce as rumored will nVidia be able to cope with an AMD price drop to counter them?

    I hope that 5850s on shorter PCBs come out around the time of the GF100 so they can drop to a price where I can afford to buy one ^_^
  • 20 Hide
    randomizer , January 18, 2010 4:10 AM
    cangelini: *Specifically* mentioned in the title of the story, just to avoid that comment =)

    You just can't win :lol: 
  • -5 Hide
    sabot00 , January 18, 2010 4:14 AM
    Finally some solid info on GF100.
  • 5 Hide
    tacoslave , January 18, 2010 4:27 AM
    Even though I'm a RED fan I'm excited, because it's a win-win for me either way. If AMD wins then I'm proud of them, but if Nvidia wins then that means price drops!!! And since they usually charge more than ATI for a little performance increase, I'll probably get a 5970 for $500 or less (hopefully). Anyone remember the GTX 280 launch?
  • 7 Hide
    Reynod , January 18, 2010 5:07 AM
    Chris your review was unusually kind.

    I'd rank it up there with Anand's on the first Phenom iteration - he had ES well before the others and there was mounting pressure to at least publish something ... and the AMD fanbois should consider that article very fair.

    I had heard Nvidia were booting some silicon and the clocks were low ... and in order to get within the power envelope it was likely some SPs would have to be shaved ... that's about all anyone can say.

    I imagine NVidia will also be concentrating on ensuring the die is securely attached to the substrate.

    They won't want to cheese off the OEMs like last time:

    http://www.theinquirer.net/inquirer/news/1050052/nvidia-chips-underfill
  • 5 Hide
    falchard , January 18, 2010 5:13 AM
    One thing I wonder is, if nVidia finally pushes forward with this architecture, does this mean developers will finally start utilizing some of the tech ATI has had in its cards for generations? For instance, will they utilize more efficient poly rendering, effectively making ATI cards perform 300% faster in drawing polies and making every consumer nVidia card before the GF100 moot?

    Also will they adopt a naming convention that finally makes sense? Up to 9000, reset, skip double digits and 100, go straight to 200. Now go back to 100. I mean seriously who comes up with these names?
    G80, G92, G200, GF100..
  • 12 Hide
    Kelavarus , January 18, 2010 5:27 AM
    One thing you didn't mention about the Supersonic Sled Tech Demo there is that it took three GF100s in a triple-SLI configuration to do that.
  • 2 Hide
    TheGreatGrapeApe , January 18, 2010 5:28 AM
    Chris, some 'leaked' 'internal' nV slides recently appeared with THG results from the HD5970 review. Since I can't ask the question I would like to about that (there's no way you could answer if true), I'll simply ask: were you aware of this?

    http://news.softpedia.com/newsImage/Alleged-GeForce-GTX-360-and-380-benchmarks-Surface-3.jpg/

    Slight tweaking of the RE:5 results (likely because they didn't point in the right direction for the existing cards) :evil: 

    And Charlie's recent 'Pro-nVidia' article is somewhat telling about the possibility of scaling downward, what's your opinion on it if you can say, other than "Charlie's just being Charlie". ;) 

    http://www.semiaccurate.com/2010/01/17/nvidia-gf100-takes-280w-and-unmanufacturable
  • 1 Hide
    aggrressor , January 18, 2010 5:31 AM
    Umm, guys, if you want benches - they are "kind of" available at Guru3D. I have just read their article, and while it's a bit too technical for my taste, they've recorded a Far Cry 2 bench at an Nvidia conference on a crappy camera. The end result was 50 FPS on the GTX 285 vs 84 FPS on a GF100-based product. Now I know it's not raw numbers or charts or anything like that, but at least it gives me a rough idea of what GT300 stuff will be like.
  • 7 Hide
    randomizer , January 18, 2010 5:32 AM
    dingumf: This is the end of the NDA. Do you even know what NDA is kid?

    Do you? The end of an NDA does not mean every detail has to be divulged. You can still only provide the details that have been given to you. If NVIDIA don't hand out the review samples, you can't benchmark them. It's not rocket science!
  • 2 Hide
    masterjaw , January 18, 2010 5:33 AM
    How long should we wait before we actually see an article like "Alas! Fermi has arrived (late?)"?

    If they claim that it is "significantly faster," then it had better be, or else..