Saw this on VR-Zone, may be fake.
http://forums.vr-zone.com/showthread.php?t=289696
I still think RV770 will have 480 SPs and 32 TMUs, especially considering the sales web pages that were 'accidentally' posted today for the 4850. I would be awed and mystified if Ati/AMD pulled the ol' switcheroo, I guess we'll see on June 23rd. Looking forward to the RV770 launch far more than I was for today's GT200 launch.
I don't know. The vast majority of claims I've seen said 800, but that link made me wonder some too. If it is a real 800, NVidia better watch out.
It's 800/5 according to how AMD counts its shaders, so it's really 160 SPs.
| iluvgillgill wrote : It's 800/5 according to how AMD counts its shaders, so it's really 160 SPs. |
Sort of. Based on R600/RV670, each 'super shader processing unit' contains 5 sub-units, which ATi calls shaders. These shader units aren't the same as, or as elegant as, a G80/G92/GT200 stream processing unit, which can perform more operation types and isn't dedicated to a specific one. You can roughly divide by 5 to compare the architectures; however, I believe ATi's shaders can do more calculations per cycle.
R600 architecture: http://www.beyond3d.com/content/reviews/16/8
G80 architecture: http://www.beyond3d.com/content/reviews/1/8
The question is, has ATi made any significant changes to its shader processing unit architecture? Probably only optimizations and tweaks (like G80 to GT200), but we can always hope for more.
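To make the 'divide by 5' rule of thumb concrete, here is a minimal sketch. It assumes each ATi shader unit groups 5 ALUs, as on R600/RV670; the RV770 figures are only the rumored counts from this thread, and `vliw_units` is a made-up helper name.

```python
# Rough "divide by 5" comparison of headline shader counts.
# Assumption: 5 ALUs per shader unit, as on R600/RV670.
# The RV770 numbers are rumors from this thread, not confirmed specs.
ALUS_PER_UNIT = 5


def vliw_units(total_alus: int, alus_per_unit: int = ALUS_PER_UNIT) -> int:
    """Return how many 5-wide shader units a headline ALU count implies."""
    return total_alus // alus_per_unit


headline_counts = {
    "RV770 (800 rumor)": 800,
    "RV770 (480 rumor)": 480,
    "R600/RV670": 320,
}

for name, alus in headline_counts.items():
    print(f"{name}: {alus} ALUs -> {vliw_units(alus)} five-wide units")
```

This only converts between the two ways of counting; it says nothing about how much work each unit gets done per clock.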
| iluvgillgill wrote : It's 800/5 according to how AMD counts its shaders, so it's really 160 SPs. |
No, it would be 800 SPs.
How they are utilized depends on the dispatcher and arbiter, but it's no more 160 SPUs than the GTX280 is 10 or 30 SPUs due to their sub-grouping.
Let me guess, you're going to start calling the GTX SPUs 'cores' now too.
| badgtx1969 wrote :
|
That picture didn't show me anything that looked like 40 TMUs, and I think that number is a mistake made by people reverse-calculating from a misprint (the HD4870 fillrate swapped with the 4850 frequency in the early published specs).
http://www.tomshardware.com/forum/ [...] v770-800sp
I'm not even sure this confusion helps ATi; it would matter more if they were closer competitors. Regardless, there's little else nV could do in response, as they're already fast-tracking the 55nm replacement because of the 65nm part's own issues. Only a failure would change that, maybe slowing down the rush a little.
Hmmm, after reading your link, Ape, http://www.tomshardware.com/forum/ [...] v770-800sp this leaves me still guessing. Is it 640 SPs? Did they change the ratio? Or even the composition/design of the SP?
Ape, you are right, it does look like 10 TMUs; however, a similar architecture diagram from R600 only shows 4 Texture Units. So does ATi cluster the TMUs in 4s, or is a TMU to them actually one of the texture filter units within each unit (the 4 pink/red boxes)?
Interesting to see that similar to R600, there are 80 ALUs per every 'Texture Unit'. Obviously there is no longer a ring bus w/ RV770 but there may be a crossbar/cache hierarchy that allows global data share?
R600:
Texture unit:
Wonder if this presentation shows an actual RV770 die shot; it almost looks like 160 superscalar clusters.
Nobody likes a troll, I know, but...
I approve of this discussion, and will read more when I get home.
From EE Times: AMD confirms that RV770 has more than 500 'cores'.
'AMD will claim technology leadership in two areas. Its chips will use more than 500 cores, more than double the 240 cores on the new Nvidia parts. They will also use GDDR5 memory interfaces running at about 3.2 Gbits/s or more. Nvidia will use the existing GDDR3 protocol running at up to 1.1 GHz on a 512-bit interface to deliver memory bandwidth up to about 102 Gbytes/s on some versions.'
http://www.eetimes.com/news/latest [...] 063&pgno=1
| jaydeejohn wrote : Hmmm, after reading your link, Ape, http://www.tomshardware.com/forum/ [...] v770-800sp this leaves me still guessing. Is it 640 SPs? Did they change the ratio? Or even the composition/design of the SP? |
Maybe the 4850 will have 640 SPs and 32 TMUs from disabling 2 rows of the diagram? I am still on the fence about whether said diagram is real.
I guess the SP speculation on RV770 still hasn't crash landed.
Performance per die area.... 2.8x GTX200
The RV770 die is 250 mm^2.
How big is GTX260/280?
^^ 576mm^2, so RV770 ends up having over 20% more performance.... yeah.. right.... X2 maybe..
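For anyone checking the 'over 20%' figure, here is the arithmetic as a quick sketch. It takes the slide's 2.8x performance-per-area claim and the two die sizes quoted above at face value; none of these are measured numbers, and `relative_performance` is just an illustrative name.

```python
# Turn a performance-per-area claim into an implied whole-chip ratio.
# Inputs are the figures quoted in this thread (marketing claim plus
# die sizes as posted), not benchmark results.
def relative_performance(perf_per_area_ratio: float,
                         die_a_mm2: float,
                         die_b_mm2: float) -> float:
    """Implied performance of chip A relative to chip B."""
    return perf_per_area_ratio * die_a_mm2 / die_b_mm2


ratio = relative_performance(2.8, 250.0, 576.0)
print(f"Implied RV770 vs GTX 280 performance: {ratio:.2f}x")
print(f"That is about {(ratio - 1) * 100:.0f}% more")
```

So the 2.8x claim only works out if the whole chip is roughly a fifth faster than the GTX 280, which is why it reads more like an X2 comparison.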
And now, after performance per watt, we get performance per mm.
What's next???
Performance per TMU might look good, especially if it's 32.
That presentation rhetoric is typical of AMD, never a raw #s comparison.
Maybe this'll help http://my.ocworkbench.com/bbs/show [...] post432681
^Nice find, now how does the math work out for TMUs? Have to run, someone else can crunch the numbers.
It's looking like 40.
Can't reply, busy at work going to meeting, spent too much time in that 'optimize' thread.
Just a few things.
From my understanding of how it works, GPU-Z doesn't analyze the chip like Rivatuner, it simply reads the information being provided and displays it graphically. It's not detecting 800 cores, it's detecting a product code and reporting 800 cores, it reports the frequency and such from other inputs, but it's not actually detecting cores.
Edited out the ringbus comment, as I was looking at the R600 pic when commenting (thought it was a reply to a reply). Yeah, I don't know about that one; it seems strange they would abandon it, but since GDDR5 doesn't have a wire-length worry you are a little freer in some respects. Still, the crossbar-to-PCB interface would be packed and tiny on the 55nm chip. Doesn't look right.
Anywhoo, we'll see what's what, but it's definitely 100% 32 TMUs (even the early 3DD/Vantagemarks confirm it). I don't know about the SPUs, though; I just feel that 480 or a similar 2^X number makes more sense than 800, which doesn't make sense without a major redesign in how it does things (like co-issuing, or having something similar to the missing MUL, or more transcendental/branch/flow units). More interesting would be making the RBE components effective parts of the equation.
However, IMO people are confusing parts of this equation and reporting numbers, and then others are assuming they mean the same thing.
I'll comment more later.
My head hurts...
| TheGreatGrapeApe wrote : Can't reply, busy at work going to meeting, spent too much time in that 'optimize' thread.
|
Thanks TGGA, I agree, it's very confusing w/ so much differing speculation out there.
Here's something: "No, just 16 ROPs w/ 4Z/clk @ >100GB bandwidth". Does this explain it all, and how'd that look and perform?
So when AMD claims to be more scalable and efficient, where the RV770 has 800 SPs compared to 240 in the GTX 280, that's more than 2.5 times as many. So should it perform 2.5 times better? From the reviews so far, it doesn't. So I guess it's not more efficiently scalable compared to G200 then. Am I right?
It isn't like that. The SPs actually aren't the same. It's sorta like HT and QPI: they both do what they're supposed to do, but they go about it differently.
It depends on the activity.
Look at the Techreport's review of the R600 to see the worst/best case scenario for both;
http://techreport.com/articles.x/12458/3
If you're crunching general math it's fine; if you're trying to do things with variable characteristics and dependencies, etc., then you go from more parallel to more serial.
Also remember that's just the shader math perspective; for games and such you have textures and other concerns involved as well.
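The parallel-versus-serial point can be put into rough numbers with a deliberately simplified model. It assumes 5-wide shader units where a fully dependent chain of operations keeps only 1 lane busy; it ignores clocks, texturing and scheduling entirely, and `effective_alus` is a hypothetical helper, not anything from a real tool.

```python
# Simplified best/worst case for a 5-wide shader unit design.
# Assumption: independent math can fill all 5 lanes, while a chain of
# dependent operations fills only 1. Real workloads land in between.
def effective_alus(total_alus: int, lanes_filled: int,
                   lanes_per_unit: int = 5) -> int:
    """ALUs doing useful work when each unit has `lanes_filled` busy lanes."""
    if not 1 <= lanes_filled <= lanes_per_unit:
        raise ValueError("lanes_filled must be between 1 and lanes_per_unit")
    units = total_alus // lanes_per_unit
    return units * lanes_filled


for chip, alus in (("R600", 320), ("RV770 (rumored)", 800)):
    best = effective_alus(alus, 5)
    worst = effective_alus(alus, 1)
    print(f"{chip}: best case {best} busy ALUs, worst case {worst}")
```

Under this model the rumored 800 drops to 160 busy ALUs in the fully serial case, which is where the '800/5' argument comes from, but it is only the floor, not the typical case.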
Still don't have time to look at the diagram yet, but will later tonight after work.
PS, if the diagram is real, that's a big change in architecture, since it totally abandons the standard way of doing things and the logical grouping by anything even remotely resembling a quad.
Jay, I know they have obvious differences in what they do. But AMD is advertising it as 800 shaders, whereas Nvidia has 240 for the GTX 280. According to that claim by AMD, they should both be compared at the same level, since AMD says they have the same thing but in greater number.
AMD is playing the same trick as those dodgy sellers on eBay selling a quad core system with a 'Q6600 got 2.4x4=9.6GHz' claim.
I remember hearing a rumor that ATI was going to use a split clock speed (where the shaders are faster than the core like NVidia). Does anyone know if that's going to happen? It might shed some light on all this confusion.
Don't understand it all like some... but I noticed the lack of L2; could this be the GDDR5? It shows going I/O.... is this it? Is this where the RBEs would come in?
JDJ, the L2 is still at the end of the TU's operation; it's just beside the RBEs.
Well after seeing a picture of the naked die (without the flip chip cap) it's looking more plausible that this diagram may match this pic;
http://www.tweaktown.com/news/9691 [...] index.html
It looks like 10 x 4 in the pic, so if each of them contains a quad, with 5 SPUs, then you would have 10x16x5 = 800. But it also seems that the diagram cannot be taken as is and is more of a 'flow chart' of design. So those 4 memory controllers could still be stops on the ringbus, but to me anything is possible, in both directions.
So assuming that's right, then it is showing 10 x 4 TUs, which would give us 40 TUs.
This is a big departure from the previous design: a bit of the RV670 in it, a bit of R500, and a little all new.
So while still shader heavy, that's 2.5 times as many SPUs and 2.5 times as many TUs as the R600/RV670... and all still hooked up to 16 ROPs.
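The counts above can be double-checked with a short sketch. The 10 x 4 layout, the quad and the 5 SPUs per element are the reading of the die shot given in this post, so treat them as assumptions; the R600/RV670 baseline of 320 SPUs and 16 TUs is the shipping spec.

```python
# Arithmetic behind the "10x16x5 = 800" and "2.5 times" figures.
# The layout numbers are one reading of a die shot, not confirmed specs.
COLUMNS = 10             # columns seen in the die shot
BLOCKS_PER_COLUMN = 4    # the "10 x 4" arrangement
QUAD = 4                 # each block assumed to hold a quad
SPUS_PER_ELEMENT = 5     # 5 SPUs per quad element

rv770_spus = COLUMNS * BLOCKS_PER_COLUMN * QUAD * SPUS_PER_ELEMENT
rv770_tus = COLUMNS * BLOCKS_PER_COLUMN

R600_SPUS = 320
R600_TUS = 16

print(f"RV770 SPUs: {rv770_spus} ({rv770_spus / R600_SPUS:.1f}x R600)")
print(f"RV770 TUs:  {rv770_tus} ({rv770_tus / R600_TUS:.1f}x R600)")
```

Both ratios come out the same, so under this reading the shader-to-texture balance is unchanged from R600/RV670 and only the ROP count stays put.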
Well, this is going to be interesting, because this is really going to change things. Even without potentially adding traditional hardware AA resolve, ATi has improved its texture issues while adding so much shader power that even running shader-based AA would likely be fine, especially since it needs much less optimization.
The problem now is now that I have this much info I want more.
I don't even care about the benchmarks, success or fail, I just wanna know more about why this design was chosen because it seems pretty much a split from tradition and also a change from what we thought their future was. I'm interested in why the split, efficiency or a functional benefit related to the task at hand?
Man I've been reading GTX280 reviews for the past 2 days and I have a feeling none will compare with the amount of new in the RV770 reviews.
May not be faster but man it's definitely more interesting IMO.
| iluvgillgill wrote : Jay, I know they have obvious differences in what they do. But AMD is advertising it as 800 shaders, whereas Nvidia has 240 for the GTX 280. According to that claim by AMD, they should both be compared at the same level, since AMD says they have the same thing but in greater number. |
Well, the true test is whether you can get them to do the math. It's a pretty simple test, and unlike nV, AMD got the R600 to run near theoretical max at launch, while nV took a very long time to come even close. Likely with this RV770 you will have access to most of that from the start again, although supposedly the GTX280 is starting from a high level of efficiency this time.
| Quote : AMD is playing the same trick as those dodgy sellers on eBay selling a quad core system with a 'Q6600 got 2.4x4=9.6GHz' claim. |
Nah, that would be more like nV's missing MUL issue, where reality didn't come close to what was advertised.
So instead of 2 1/2 x everything, they leave the ROPs, which I've heard from elsewhere too, at 16. This is exactly the layout w1zzard was talking about waaay back over at TPU, except for the ROPs.
| TheGreatGrapeApe wrote : Well, this is going to be interesting, because this is really going to change things. Even without potentially adding traditional hardware AA resolve, ATi has improved its texture issues while adding so much shader power that even running shader-based AA would likely be fine, especially since it needs much less optimization. |
I was thinking this when I first heard the rumors of 800 SPs weeks back. I think it is very likely a significant number of those shaders could be dedicated to AA.
| TheGreatGrapeApe wrote : Man I've been reading GTX280 reviews for the past 2 days and I have a feeling none will compare with the amount of new in the RV770 reviews.
|
True that, looking forward to the architecture analyses (Beyond3D, Tech Report, etc.).
Yep, those 2 are the ones I like the most, as they are usually the most in depth. The THG one is usually up there too, with some good technical analysis, but B3D usually goes much deeper. The surprising thing to me was the light-on-info nature of the Tech Report's GTX280 review compared to their previous ones; THG's was more technical and expository this time IMO. Hopefully they're just waiting for the A vs B vs Last Gen technical comparison once the RV770s come out.
B3D, home of wonders and (at least for me) headaches heheh
It is true, and I wonder why I'm just so much more intrigued by the RV770 architecture than the GT200. Maybe because GT200 is just more of everything. The RV770 is doing the same thing, but it seems more..... efficient, shall I say.
Well there are some minor tweaks to the design like adding double precision support, etc, but overall it's like G80+ 75%, even bringing back the NVIO (which increases costs for the gaming boards but reduces costs for the Tesla boards/chips [not sure if that makes sense when they already pay a premium and are about 10-20% of the overall market]).
The RV770 really is going to make it interesting for buffer usage, branching, granularity, and so many aspects which could be completely revamped with such a non traditional design that changes even the cache arrangement and TU placement.
I kinda feel like I'm going to need a few days/weeks before getting a grasp on the RV770. It's not totally new like the G80 and R600, but it's new enough to make us stop and rethink some of our beliefs about the ATi unified design and see how they apply (or don't) this time.

