RV770 has 800 SPs and 40 TMUs from diagram????

Last response: in Graphics & Displays
June 17, 2008 1:53:34 AM

Saw this on VR-Zone, may be fake.
http://forums.vr-zone.com/showthread.php?t=289696

I still think RV770 will have 480 SPs and 32 TMUs, especially considering the sales web pages that were 'accidentally' posted today for the 4850. I would be awed and mystified if ATi/AMD pulled the ol' switcheroo; I guess we'll see on June 23rd. Looking forward to the RV770 launch far more than I was to today's GT200 launch.

June 17, 2008 2:23:37 AM

I don't know. The vast majority of claims I've seen said 800, but that link made me wonder some too. If it is a real 800, NVidia better watch out.
June 17, 2008 5:31:07 AM

It's 800/5 according to how AMD counts its shaders, so it's really 160 SPs.
June 17, 2008 5:51:34 AM

iluvgillgill said:
It's 800/5 according to how AMD counts its shaders, so it's really 160 SPs.


Sort of. Based on R600/RV670, each 'super shader processing unit' contains 5 sub-units, which ATi calls shaders. These shader units aren't the same as, or as elegant as, a G80/G92/GT200 stream processing unit, which can perform more operation types and isn't specifically dedicated. You can roughly divide by 5 to compare the architectures; however, I believe ATi's shaders can do more calculations per cycle.

R600 architecture: http://www.beyond3d.com/content/reviews/16/8
G80 architecture: http://www.beyond3d.com/content/reviews/1/8

The question is, has ATi made any significant changes to their shader processing unit architecture. Probably only optimizations and tweaks (like G80 to GT200), but we can always hope for more :bounce: 
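To make the "divide by 5" comparison concrete, here is a minimal sketch of the counting only (it says nothing about performance). The 320 figure is the shipping R600/RV670 count; 800 is the rumored RV770 number from this thread, and the function name is made up for illustration.

```python
# Back-of-envelope shader counting; not a performance model.
ALUS_PER_UNIT = 5  # ATi groups 5 ALUs into one 5-wide shader unit (R600/RV670)

def five_wide_units(advertised_sps):
    """Number of 5-wide units behind an advertised SP count (hypothetical helper)."""
    if advertised_sps % ALUS_PER_UNIT:
        raise ValueError("SP count is not a multiple of 5")
    return advertised_sps // ALUS_PER_UNIT

r600_units = five_wide_units(320)   # R600/RV670 ship with 320 SPs
rv770_units = five_wide_units(800)  # rumored RV770 count, unconfirmed
print(r600_units, rv770_units)      # 64 160
```

Whether 160 or 800 is the "right" number to quote depends on how well those 5 slots get filled, which is the part this sketch leaves out.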
June 17, 2008 7:15:50 AM

iluvgillgill said:
It's 800/5 according to how AMD counts its shaders, so it's really 160 SPs.


No, it would be 800 SPs.

How they are utilized depends on the dispatcher and arbiter, but it's no more 160 SPUs than the GTX280 is 10 or 30 SPUs due to their sub-grouping.

Let me guess, you're going to start calling the GTX SPUs 'cores' now too. :sarcastic: 
June 17, 2008 7:28:07 AM

badgtx1969 said:

I still think RV770 will have 480 SPs and 32 TMUs, especially considering the sales web pages that were 'accidentally' posted today for the 4850.


That picture didn't show me anything that looked like 40 TMUs, and I think that number is a mistake by those reverse-calculating from a misprint (HD4870 fillrate swapped with 4850 frequency in the early published specs).

http://www.tomshardware.com/forum/251632-33-rv770-800sp

I'm not even sure this confusion helps ATi; it would matter more if they were closer competitors. Regardless, there's little else nV could do in response, as they're already fast-tracking the 55nm replacement because of the 65nm part's own issues. Only a failure would change that, maybe slowing down the rush a little.

June 17, 2008 5:08:27 PM

Ape, you are right, it does look like 10 TMUs. However, a similar architecture diagram from R600 only shows 4 Texture Units. So does ATi cluster the TMUs in 4s, or is a TMU to them actually one of the texture filter units within each unit (the 4 pink/red boxes)?

Interesting to see that, similar to R600, there are 80 ALUs for every 'Texture Unit'. Obviously there is no longer a ring bus w/ RV770, but there may be a crossbar/cache hierarchy that allows global data share?

R600: [image]

Texture unit: [image]

June 17, 2008 5:14:08 PM

Wonder if this presentation shows an actual RV770 die shot; it almost looks like 160 superscalar clusters.

June 17, 2008 5:17:06 PM

Nobody likes a troll, I know, but...

I approve of this discussion, and will read more when I get home.




June 17, 2008 5:25:37 PM

From EE Times: AMD confirms that RV770 has more than 500 'cores'.


'AMD will claim technology leadership in two areas. Its chips will use more than 500 cores, more than double the 240 cores on the new Nvidia parts. They will also use GDDR5 memory interfaces running at about 3.2 Gbits/s or more. Nvidia will use the existing GDDR3 protocol running at up to 1.1 GHz on a 512-bit interface to deliver memory bandwidth up to about 102 Gbytes/s on some versions.'


http://www.eetimes.com/news/latest/showArticle.jhtml?ar...
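For anyone wanting to check the bandwidth figures in that quote, peak memory bandwidth is just per-pin data rate times bus width. A minimal sketch follows; the 256-bit bus for the GDDR5 case is an assumption (it is not stated in the article excerpt or in this thread), and the GDDR3 case assumes the usual two transfers per clock.

```python
def bandwidth_gb_s(gbit_per_pin, bus_bits):
    """Peak bandwidth in Gbytes/s from per-pin data rate (Gbit/s) and bus width (bits)."""
    return gbit_per_pin * bus_bits / 8

# GDDR5 at 3.2 Gbit/s per pin; the 256-bit bus width is an ASSUMPTION.
gddr5 = bandwidth_gb_s(3.2, 256)

# GDDR3 at 1.1 GHz, double data rate -> 2.2 Gbit/s per pin, on a 512-bit bus.
gddr3 = bandwidth_gb_s(2 * 1.1, 512)

print(round(gddr5, 1), round(gddr3, 1))  # 102.4 140.8
```

Under those assumptions the GDDR5 case lands near the 102 Gbytes/s figure in the quote, while 1.1 GHz GDDR3 on a 512-bit interface works out higher.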
June 17, 2008 6:09:24 PM

JAYDEEJOHN said:
Hmmm, after reading your link, Ape (http://www.tomshardware.com/forum/251632-33-rv770-800sp), this leaves me still guessing. Is it 640 SPs? Did they change the ratio? Or even the composition/design of the SP?


Maybe the 4850 will be 640 SPs and 32 TMUs from disabling 2 rows of the diagram? I am still on the fence about whether or not said diagram is real.
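The row arithmetic behind that guess can be sketched as below. The layout (10 rows, each with 80 ALUs and 4 texture units) is only what the possibly-fake diagram appears to show, so treat every constant as an assumption; the function name is made up.

```python
# Layout read off the (possibly fake) diagram: all three constants are assumptions.
ROWS = 10
ALUS_PER_ROW = 80
TMUS_PER_ROW = 4

def config(disabled_rows=0):
    """Return (SPs, TMUs) with some rows fused off (hypothetical helper)."""
    if not 0 <= disabled_rows <= ROWS:
        raise ValueError("disabled_rows out of range")
    active = ROWS - disabled_rows
    return active * ALUS_PER_ROW, active * TMUS_PER_ROW

print(config(0))  # (800, 40)
print(config(2))  # (640, 32)
```

So a part with 2 rows disabled would give 640 SPs and 32 TMUs, which is at least consistent with the 32 TMU figure floating around for the 4850.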
June 17, 2008 6:15:53 PM

I guess the SP speculation on RV770 still hasn't crash-landed.
June 17, 2008 6:24:39 PM

Performance per die area.... 2.8x GTX200

The RV770 die is 250 mm^2.

How big is GTX260/280?
June 17, 2008 6:39:40 PM

^^ 576 mm^2, so RV770 ends up having over 20% more performance... yeah, right... the X2 maybe.
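The "over 20%" figure follows from the numbers quoted above if the 2.8x performance-per-area claim is taken at face value and applied to whole dies. A quick sketch of that arithmetic (the function name is made up, and reading the claim as whole-chip performance per mm^2 is an assumption):

```python
def relative_performance(perf_per_area_ratio, area_a_mm2, area_b_mm2):
    """Performance of chip A relative to chip B, given A's perf-per-area advantage."""
    return perf_per_area_ratio * area_a_mm2 / area_b_mm2

# 2.8x perf per mm^2, RV770 at 250 mm^2, GT200 at 576 mm^2 (figures from this thread)
rel = relative_performance(2.8, 250, 576)
print(f"{rel:.2f}")  # 1.22
```

That is roughly 22% ahead of GT200 from a single chip, which is why the claim looks more plausible for a dual-chip X2 board.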
June 17, 2008 7:14:11 PM

And now, after performance per watt, we get performance per mm.

What's next?
June 17, 2008 7:37:02 PM

Performance per TMU might look good, especially if it's 32 :pt1cable: 

That presentation rhetoric is typical of AMD, never a raw #s comparison.
June 17, 2008 7:56:06 PM

^Nice find, now how does the math work out for TMUs? Have to run, someone else can crunch the numbers.
June 17, 2008 8:12:36 PM

It's looking like 40.
June 17, 2008 8:52:04 PM

Can't reply, busy at work going to meeting, spent too much time in that 'optimize' thread.

Just a few things.

From my understanding of how it works, GPU-Z doesn't analyze the chip like Rivatuner, it simply reads the information being provided and displays it graphically. It's not detecting 800 cores, it's detecting a product code and reporting 800 cores, it reports the frequency and such from other inputs, but it's not actually detecting cores.

Edited out the ring bus comment, as I was looking at the R600 pic when commenting (thought it was a reply to a reply). Yeah, I don't know about that one; it seems strange they would abandon it, but since GDDR5 doesn't have a wire length worry you are a little freer in some respects, though the crossbar to PCB interface would be packed and tiny on the 55nm chip. Doesn't look right.

Anywhoo, we'll see what's what, but it's definitely 100% that it's 32 TMUs (even the early 3DD/Vantagemarks confirm it). I don't know about the SPUs; I just feel that 480 or a similar 2^X number makes more sense than 800, which doesn't make sense without a major redesign in how it does things (like co-issuing, or having something similar to the missing MUL, or more transcendental/branch/flow units). More interesting would be making RBE components effective parts of the equation.

However IMO, people are confusing parts of this equation and reporting numbers and then people are assuming they mean the same thing.

I'll comment more later.
June 17, 2008 9:48:34 PM

TheGreatGrapeApe said:
Can't reply, busy at work going to meeting, spent too much time in that 'optimize' thread.

Just a few things.

From my understanding of how it works, GPU-Z doesn't analyze the chip like Rivatuner, it simply reads the information being provided and displays it graphically. It's not detecting 800 cores, it's detecting a product code and reporting 800 cores, it reports the frequency and such from other inputs, but it's not actually detecting cores.

Edited out the ring bus comment, as I was looking at the R600 pic when commenting (thought it was a reply to a reply). Yeah, I don't know about that one; it seems strange they would abandon it, but since GDDR5 doesn't have a wire length worry you are a little freer in some respects, though the crossbar to PCB interface would be packed and tiny on the 55nm chip. Doesn't look right.

Anywhoo, we'll see what's what, but it's definitely 100% that it's 32 TMUs (even the early 3DD/Vantagemarks confirm it). I don't know about the SPUs; I just feel that 480 or a similar 2^X number makes more sense than 800, which doesn't make sense without a major redesign in how it does things (like co-issuing, or having something similar to the missing MUL, or more transcendental/branch/flow units). More interesting would be making RBE components effective parts of the equation.

However IMO, people are confusing parts of this equation and reporting numbers and then people are assuming they mean the same thing.

I'll comment more later.


Thanks TGGA, I agree, it's very confusing w/ so much differing speculation out there.

June 17, 2008 10:34:15 PM

Here's something: "No, just 16 ROPs w/ 4Z/clk @ >100GB bandwidth". Does this explain it all, and how would that look and perform?
June 17, 2008 10:44:33 PM

So AMD claims to be more scalable and efficient, where the RV770 has 800 SPs compared to 240 in the GTX 280; that's more than 2.5 times as many. So should it perform 2.5 times better? From the reviews so far, it's not. So I guess it's not more efficiently scalable compared to GT200 then. Am I right?
June 17, 2008 11:09:49 PM

It isn't like that. The SPs actually aren't the same. It's sorta like HT and QPI: they both improve or do what they're supposed to do, but they go about it differently.
June 17, 2008 11:10:20 PM

It depends on the activity.

Look at the Tech Report's review of the R600 to see the worst/best case scenario for both:

http://techreport.com/articles.x/12458/3

If you're crunching general math it's fine; if you're trying to do things with variable characteristics and dependencies, etc., then you go from more parallel to more serial.

Also remember that's just the shader math perspective, for games and such you have textures and other concerns that are also involved.

Still don't have time to look at the diagram yet, but will later tonight after work.
PS, if the diagram is real, that's a big change in architecture, since it totally abandons the standard way of doing things and the logical grouping by anything even remotely resembling a quad.
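The parallel-versus-serial point can be illustrated with a toy packer. This is only a sketch of the general idea of filling 5-wide issue slots, assuming a simple greedy scheme; it is not ATi's actual shader compiler, and the op names and function are invented.

```python
def pack_bundles(deps, width=5):
    """Greedily pack ops into bundles of at most `width` slots.

    `deps` maps each op name to the set of ops it depends on. An op may only
    issue once everything it depends on has issued in an EARLIER bundle.
    """
    done = set()
    remaining = dict(deps)
    bundles = []
    while remaining:
        ready = [op for op, needs in remaining.items() if needs <= done][:width]
        if not ready:
            raise ValueError("unsatisfiable dependency")
        bundles.append(ready)
        done.update(ready)
        for op in ready:
            del remaining[op]
    return bundles

# Best case: 10 fully independent ops fill every slot.
independent = {f"op{i}": set() for i in range(10)}
# Worst case: each op needs the previous one's result, so one slot per bundle.
chain = {f"op{i}": ({f"op{i - 1}"} if i else set()) for i in range(10)}

for name, ops in (("independent", independent), ("chain", chain)):
    n = len(pack_bundles(ops))
    print(name, n, "bundles,", f"{10 / (n * 5):.0%} slot utilization")
```

With independent math the 10 ops fit in 2 bundles (every slot used); with a strict dependency chain they take 10 bundles and only 1 slot in 5 does work, which is the "divide by 5" worst case.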


June 17, 2008 11:19:38 PM

Jay, I know they have obvious differences in what they do. But AMD is advertising it as 800 shaders, whereas Nvidia advertises 240 for the GTX 280. According to AMD's claim they should compare at the same level, since AMD is saying they have the same thing but in greater number.

AMD is playing the same trick as those dodgy sellers on eBay selling quad-core systems with a 'Q6600: 2.4 x 4 = 9.6 GHz' claim.
June 17, 2008 11:22:27 PM

I remember hearing a rumor that ATI was going to use a split clock speed (where the shaders are faster than the core like NVidia). Does anyone know if that's going to happen? It might shed some light on all this confusion.
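Clock domains matter here because raw shader issue rate is unit count times shader clock, not unit count alone. A rough sketch follows; both clock values are illustrative placeholders rather than figures from this thread, and the function name is made up.

```python
def peak_issue_rate(units, shader_clock_ghz, ops_per_clock=1):
    """Peak ops per nanosecond: units x ops per clock x shader clock (GHz)."""
    return units * ops_per_clock * shader_clock_ghz

# PLACEHOLDER clocks, chosen only to show the effect of a split shader clock.
nv = peak_issue_rate(240, 1.3)    # 240 SPs on a fast, separate shader clock
ati = peak_issue_rate(800, 0.75)  # 800 SPs running at core clock

print(round(800 / 240, 2))  # unit-count ratio
print(round(ati / nv, 2))   # raw issue-rate ratio under the assumed clocks
```

With those placeholder clocks a 3.3x advantage in unit count shrinks to under 2x in raw issue rate, before utilization is even considered.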
June 17, 2008 11:30:06 PM

Don't understand it all like some do... but I noticed the lack of L2; could this be the GDDR5? It shows going I/O... is this it? Is this where the RBEs would come in?
June 18, 2008 5:17:27 AM

JDJ, the L2 is still at the end of the TU's operation; it's just beside the RBEs.

Well, after seeing a picture of the naked die (without the flip chip cap), it's looking more plausible that this diagram may match this pic:
http://www.tweaktown.com/news/9691/amd_rv770_in_hand/in...

It looks like 10 x 4 in the pic, so if each of them contains a quad, with 5 SPUs each, then you would have 10 x 16 x 5 = 800. But it also seems that the diagram cannot be taken as is and is more of a 'flow chart' of the design. So those 4 memory controllers could still be stops on the ring bus, but to me anything is possible, in both directions.

So assuming that's right, then it is showing 10 x 4 TUs, which would give us 40 TUs.

This is a very different departure from the previous design: a bit of the RV670 in it, a bit of R500, and a little all new.

So while still shader heavy, that's 2.5 times as many SPUs and 2.5 times as many TUs as the R600/RV670... and all still hooked up to 16 ROPs. :D 
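Spelling out that arithmetic as a quick sanity check: the R600/RV670 numbers are the shipping figures, while the RV770 numbers simply follow the 10 x 16 x 5 reading of the die photo above and are unconfirmed.

```python
# Unit counts; the RV770 entries are the unconfirmed reading of the die photo.
r600 = {"sps": 320, "tus": 16, "rops": 16}  # shipping R600/RV670 figures
rv770 = {"sps": 10 * 16 * 5, "tus": 10 * 4, "rops": 16}

def ratio(key):
    """RV770 count relative to R600 for one unit type."""
    return rv770[key] / r600[key]

def sps_per(chip, key):
    """Shader ALUs per texture unit or per ROP on a given chip."""
    return chip["sps"] / chip[key]

print(ratio("sps"), ratio("tus"), ratio("rops"))      # 2.5 2.5 1.0
print(sps_per(r600, "tus"), sps_per(rv770, "tus"))    # 20.0 20.0
print(sps_per(r600, "rops"), sps_per(rv770, "rops"))  # 20.0 50.0
```

So the ALU-to-texture ratio would stay where it was on R600, while the number of shader ALUs feeding each ROP would go from 20 to 50.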

Well, this is going to be interesting, because this is really going to change things. Even without potentially adding traditional hardware AA resolve, ATi has improved their texture issues while adding so much shader power that even running shader-based AA would likely be fine, especially since it needs much less optimization.

The problem now is now that I have this much info I want more.

I don't even care about the benchmarks, success or fail, I just wanna know more about why this design was chosen because it seems pretty much a split from tradition and also a change from what we thought their future was. I'm interested in why the split, efficiency or a functional benefit related to the task at hand?

Man I've been reading GTX280 reviews for the past 2 days and I have a feeling none will compare with the amount of new in the RV770 reviews.

May not be faster but man it's definitely more interesting IMO.
June 18, 2008 5:30:40 AM

iluvgillgill said:
Jay, I know they have obvious differences in what they do. But AMD is advertising it as 800 shaders, whereas Nvidia advertises 240 for the GTX 280. According to AMD's claim they should compare at the same level, since AMD is saying they have the same thing but in greater number.


Well, the true test is whether you can get them to do the math. It's a pretty simple test, and unlike nV, AMD got the R600 to run near theoretical max at launch, while nV took a very long time to come even close. Likely with this RV770 you will have access to most of that from the start again, although supposedly the GTX280 is starting from a high level of efficiency this time.

Quote:
AMD is playing the same trick as those dodgy sellers on eBay selling quad-core systems with a 'Q6600: 2.4 x 4 = 9.6 GHz' claim.


Nah, that would be more like nV's missing MUL issue, where reality didn't come close to what was advertised.
June 18, 2008 5:38:48 AM

So instead of 2 1/2 x everything, they leave the ROPs, which I've heard from elsewhere too, at 16. This is exactly the layout w1zzard was talking about waaay back over at TPU, except the ROPs.
June 18, 2008 3:35:40 PM

TheGreatGrapeApe said:
Well, this is going to be interesting, because this is really going to change things. Even without potentially adding traditional hardware AA resolve, ATi has improved their texture issues while adding so much shader power that even running shader-based AA would likely be fine, especially since it needs much less optimization.


I was thinking this when I first heard the rumors of 800 SPs weeks back. I think it is very likely a significant number of those shaders could be dedicated to AA.

TheGreatGrapeApe said:
Man I've been reading GTX280 reviews for the past 2 days and I have a feeling none will compare with the amount of new in the RV770 reviews.

May not be faster but man it's definitely more interesting IMO.

True that, looking forward to the architecture analyses (beyond3D, tech report, etc.).
June 18, 2008 4:05:21 PM

Yep those 2 are the ones I like the most as they are usually the most in depth. The THG one is usually up there too, with some good technical analysis , but B3D usually goes much deeper. The surprising thing to me was the light on info nature of the Tech Report's GTX280 review compared to their previous ones, THG's was more technical and expository this time IMO. Hopefully they're just waiting for the A vs B vs Last Gen technical comparison once the RV770s come out.
June 18, 2008 4:13:59 PM

B3D, home of wonders and (at least for me) headaches heheh
June 18, 2008 4:15:54 PM

It is true. I wonder why I'm just so much more intrigued by the RV770 arch than the GT200; maybe because GT200 is just more of everything. The RV770 is doing the same thing, but it seems more... efficient, shall I say.
June 18, 2008 5:01:59 PM

Well there are some minor tweaks to the design like adding double precision support, etc, but overall it's like G80+ 75%, even bringing back the NVIO (which increases costs for the gaming boards but reduces costs for the Tesla boards/chips [not sure if that makes sense when they already pay a premium and are about 10-20% of the overall market]).

The RV770 really is going to make it interesting for buffer usage, branching, granularity, and so many aspects which could be completely revamped with such a non traditional design that changes even the cache arrangement and TU placement.

I kinda feel like I'm going to need a few days/weeks before getting a grasp on the RV770. It's not totally new like the G80 and R600, but it's new enough to make us stop and rethink some of our beliefs about the ATi unified design and see how they apply (or don't) this time.