RV570: 256-bit mem, 10% faster than X1800 XT mainstream part

Wow. That's a helluva 'mainstream' part Ati has coming up...

Told ya, look out for the X1700 (although the GF7600 is damn fine too). Can't wait 'til they put that thing in a laptop. The X1700 seems perfect for my laptop and desktop needs.

Of course on the desktop side there's always the G80, R600, etc.

But for a 'mid-range' part that's really sweet!
 
Because I feel it's the R9500Pro of this generation. :lol:

You're right, it could be anything, but it seems to match the specs of the rumoured X1700, although it sounds like more than the rumoured tweaks.

Either way I hope it makes a nice notebook part; paired with a Merom or Turion X2 it would be perfect for me.
 

Action_Man

Splendid
Jan 7, 2004
3,857
0
22,780
It probably has a 128-bit external bus; the internal one is probably 256-bit. Though if it does have a 256-bit bus and lives up to its claim, it'll sell like hot cakes that come with free diamonds or something. I don't know.
 

cleeve

Illustrious
Well, it ain't going to beat the X1800 with a 128-bit bus.

It's time for mainstream cards to get in the 256-bit game, and Ati has been taking it up the rump on the mainstream side recently... so why force a crippled 256-bit 12-pipeline X1800 GTO into service when you could just start with a 256-bit part?

Makes perfect sense to me.
 

Action_Man

Splendid
Well, it is the Inquirer.

I just always thought they'd get a 512 bus happening first for the high end then update the midrange with a 256. Seems like the mainstream would eat into the high end market.
 
Yeah, I think they're talking about the memory interface, not the ring bus. I have a feeling it's going to be like the GF6800GS: a process-shrink card with specs similar to a crippled card (think X1800GTO on 80nm), where the speed difference and shader complement (it would have X1600/X1900-style ratios) give it that extra oomph despite slightly lower 'pipeline' numbers and maybe even lower vertex engine and ROP counts.

Just a thought as to what 'might be', and I agree (and have been saying it for a while): I'm surprised how they keep moving to ever more exotic memory on the 256-bit bus, to the point where they are running short on supply (X850XT PE / GF7800GTX-512), whereas they could achieve similar or better results on far more easily available current or past memory.
The Realizm uses a 512-bit memory bus/interface. Now, it's a very expensive card, but would the increased cost of 900MHz-spec GDDR3 memory really be cheaper on a 256-bit bus versus adding the support and traces for 512-bit GDDR2/3 at 450MHz? With a 512-bit bus and 500MHz GDDR you would be top of the hill now, and using even the 1.26ns stuff found on the X1800XLs and GF7800s would give you the equivalent of 1600MHz GDDR3/4 memory.
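The equivalence being argued here is just arithmetic: peak bandwidth is bus width times effective data rate, so doubling the bus width while halving the memory speed lands on the same figure. A quick sketch (the bus/speed pairings are illustrative examples from this discussion, not actual card specs):

```python
# Peak memory bandwidth = bus width (bytes) x effective transfer rate.
# The pairings below are the hypothetical trade-off discussed above.
def bandwidth_gbs(bus_bits: int, effective_mt_s: float) -> float:
    """Bus width in bits x effective rate in MT/s -> peak GB/s."""
    return (bus_bits / 8) * effective_mt_s / 1000

# 256-bit bus with 900MHz (1800MT/s effective) GDDR3
print(bandwidth_gbs(256, 1800))  # 57.6 GB/s

# 512-bit bus with 450MHz (900MT/s effective) memory hits the same peak
print(bandwidth_gbs(512, 900))   # 57.6 GB/s
```

Whether the wider, slower option is actually cheaper comes down to the board-level costs (traces, memory mounts, VPU pins) raised just below.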

It seems so easy, but there is a cost from the additional traces, memory mounts, and VPU support. I'd really love to see a chip simulation of the performance differences; it might be worth it. It's definitely been a long time coming, moving the mid-range to 256-bit. Heck, the X700 Pro had more processing power than the R9800XT, yet its biggest drawback was memory starvation, and even the R9600 series saw far, FAR more improvement from overclocked memory than core.

If they were going to make anything more powerful than the X1600XT, they HAD to move it to 256-bit, because otherwise I doubt it'd ever be able to exploit that extra power at higher resolutions or with AA.
 

Action_Man

Splendid
Now that I've thought about it some more (thinking is hard work! :p ), being a 256-bit card it'll have to be pretty big to fit all the pins, about 200mm2 (?), maybe higher. The smallest chip I can think of with a 256-bit bus was the R300 at 218mm2.

Since such a chip would be expensive, I'm wondering what they could do to get higher yields. Perhaps go for extra execution hardware and lower the clock?
 

ltcommander_data

Distinguished
Dec 16, 2004
997
0
18,980
I don't suppose anyone knows whatever happened to all the other mid-range designs ATI has going? There's the RV570 that we've been talking about, but there's also the RV535, which is an 80nm die shrink of the current X1600s, and then there's the mysterious RV560. On the low end there's also supposed to be an RV505, also an 80nm part, that's supposed to truly replace the X300 since the prices of the X1300 are too high. Obviously ATI can't bring all those parts to market. They don't even have enough X1xxx product numbers left. Why they decided to start with X13xx, X16xx, and X18xx instead of the more spacious X12xx, X15xx, and X18xx is beyond me.

The thing is, I don't see how ATI expects to make a mid-range part that can perform 10% faster than the X1800XT. The 80nm process offers them more room, but not that much room for more shaders. With the X1800XT already clocking the core at 625MHz, I don't see the 80nm process offering much more clock speed with sufficient yields for the higher-volume mid-range market. Going to 256-bit is nice, but with the X1800XT having its RAM at 1500MHz, I don't see a mid-range card having more memory bandwidth than that.

What ATI is probably doing is designing the RV570 to compete around the $200 mark where the 7600GT is, but in the $250-$300 high-mid range sector to replace the X1800XL. The higher-end market gives them more money to play with, allowing faster memory and more die space. Something like 6 vertex shaders, 36 pixel shaders, 12 texture units, 12 ROPs, and 16 Z compare units would be reasonable. To really offer 10% more performance than the X1800XT, the $300 part would need to be clocked upwards of 600MHz; 650MHz is probably reasonable given the 80nm process. Coupled with comparable 1500MHz RAM, the RV570 would probably still be weaker than the X1800XT in texture ability, but obviously faster in pixel shader ability, which is what ATI really cares about. It of course has less vertex power. A $250 part could tone down the core to 550MHz and the memory to 1200MHz.

To really compete against the 7600GT, ATI will probably have to use the RV560, whatever it is. However the RV570 is designed, it doesn't seem possible for it to offer 10% more performance than the X1800XT yet still fit a $200 price tag. The RV535 die shrink of the X1600 might offer higher clock speeds, but I don't see how much higher than 590MHz you can go with sufficient yields to overcome the fact that the RV530 has a third the texture units of the 7600GT and half the ROPs. The real issue is of course memory bandwidth. Like the RV570, the RV560 would probably also need a 256-bit memory interface. Something like 5 vertex shaders, 24 pixel shaders, 8 texture units, 8 ROPs, and 12 Z compare units should work out nicely. With a decent clock speed like 600MHz and 1200MHz 256-bit RAM, the RV560 should compete quite nicely with the 7600GT even if nVidia decides to enable that supposed hidden extra quad. With the 80nm process it should still fit the $200 price tag. A model with lower clock speed, maybe 500MHz and 1000MHz 256-bit RAM, could then fit the $150 price range.
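The "weaker in texturing, faster in pixel shading" claim can be eyeballed with a crude units-times-clock proxy. Note the RV570 figures below are the *speculated* specs from this post, not confirmed numbers; only the X1800XT's 16 pixel shaders, 16 texture units, and 625MHz clock are real:

```python
# Back-of-the-envelope throughput: execution units x core clock (MHz).
# RV570 numbers are this thread's speculation, not a real spec sheet.
def throughput(units: int, clock_mhz: int) -> int:
    """Crude proxy: units x clock = millions of ops per second."""
    return units * clock_mhz

x1800xt = {"pixel_shaders": 16, "texture_units": 16, "clock_mhz": 625}
rv570_guess = {"pixel_shaders": 36, "texture_units": 12, "clock_mhz": 650}

for metric in ("pixel_shaders", "texture_units"):
    real = throughput(x1800xt[metric], x1800xt["clock_mhz"])
    guess = throughput(rv570_guess[metric], rv570_guess["clock_mhz"])
    print(f"{metric}: X1800XT={real}, speculated RV570={guess}")

# pixel_shaders: 36 * 650 = 23400 vs 16 * 625 = 10000 -> over 2x shader rate
# texture_units: 12 * 650 =  7800 vs 16 * 625 = 10000 -> weaker texturing
```

Of course raw units-times-clock ignores per-unit efficiency differences, but it matches the direction of the argument above.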

If the RV570 needs a 256-bit interface to offer sufficient bandwidth, that means ATI is probably sticking with GDDR3 for that part. Personally, I'd like to see the RV560 stick with the 128-bit interface, but move to higher clock speed GDDR4. Some 2.4GHz or 2.8GHz stuff would certainly relieve the strain just as well as a 256-bit interface but allow ATI to keep die size down and introduce GDDR4 to the market and ensure viability before integrating it into the RV580.

ATI could of course also improve performance through optimization much like nVidia did with the 7900, but I don't think that would lead to a substantial increase in performance. Most of the 7900's performance gains were through extra clock speed anyways. The RV530 core also already includes the Fetch4 feature that the X1900 uses to try to relieve its texture bottleneck. I'd be interested to know whether the RV570 has an expanded 512-bit ring bus like the R520 and R580 to complement its expanded 256-bit memory interface, or whether it will just stick with the 256-bit ring bus. The latter may be more likely, to save die space.

I wonder if ATI also plans to convert the R580 to 80nm? It'll certainly help them control costs and temperatures. Some modest increases in core clock (hopefully to 700MHz) and the addition of GDDR4 would increase its performance nicely. Although 24 texture units would be nice, I don't see ATI backing down from their 3:1 implementation, and it would require too much work when they're focusing on the R600. A simple die-shrink refresh to the X1950 shouldn't be particularly time-consuming and would allow them to definitively put themselves on top of the 7900GTX.

In any case, this is mostly speculation on my part, but I kind of just felt like going on a tangent. I hope I'm making some sense.
 
Now that I've thought about it some more (thinking is hard work! :p ), being a 256-bit card it'll have to be pretty big to fit all the pins, about 200mm2 (?), maybe higher. The smallest chip I can think of with a 256-bit bus was the R300 at 218mm2.

True, although they could use an external memory interface, but that's not practical either and loses them any benefit of the ring bus IMO.

Since such a chip would be expensive, I'm wondering what they could do to get higher yields. Perhaps go for extra execution hardware and lower the clock?

Well, I would think they'll pick optimal speeds/yields as a given, but you are right that the increased die size becomes an issue for a mid-range part. Then again, the X800XL was 240mm2, so it's possible if you're trying to capture a mature market looking forward. There are likely savings to be had from a stripped-down, efficient design splitting functions, but I agree with your assessment of about 200mm2, so costs may be close to the X800XL. That should still be a savings relative to the X1800 and X1900, so if you can make this part instead of crippling X1800s into GTOs it makes sense, especially if it outperforms the X1800 while being cheaper.

IMO ATi won't be looking at current competition, but future competition. I suspect/expect the X1700 to come out around the same time as the G80, unfortunately (I wish it were sooner), in order to have something to launch and keep themselves in the news. This card is likely meant to have a long lifespan despite the new features in Vista/DX10, like the X700, and as such you'd want a card that holds its value in the face of the new cards. The R6xx series likely won't come until the fall regardless of what the competition brings, and they will likely go 80nm with that, although 65nm has been rumoured too (I find it unlikely, but anything can happen). This gives them a test run of 80nm with a relatively simple design.

Granted, it's far from the profitable part a smaller chip could be, but it may be a question of finding that upper-middle niche with all the re-pricing that's been going on and the push to an unblinking $299+US mid-ground (my R9600P cost $299CDN when the dollar was in the 70c area). I think the flex region is beginning to spread between basic functionality (X300), occasional gaming (X1300), light to moderate gaming (X1600), and heavy gaming (X1800/1900).
I think there's a gap between the X1600 and X1900, and if it takes a slightly more costly part on the die end to give savings on the board end, then filling that marketing gap may be beneficial.
That little sentence makes me realize why neither ATi nor nV have much incentive to push hard for the 256-bit midrange: they don't care as much about the cost of memory as the board makers do (ATi sells some cards, but is chip-focused for sure), and the redesign just adds transistors and increases costs / shrinks margins. Hopefully the advantages pay off, or this might be a failed experiment in an area that would be beneficial to the consumer IMO.
 

Action_Man

Splendid
Another thing they could do is disable pixel shaders and/or vertex shaders if they don't work and sell it as an X1400 or whatever it's going to be called. So they can still sell the chip and make some money.

With the clock speed thing I didn't go into enough detail. Perhaps they could take a route similar to Xenos: it has about the same shader power as the X1800, but the X1800 has the higher clock where Xenos has more hardware.
 
I don't suppose anyone knows whatever happened to all the other mid-range designs ATI has going? There's the RV570 that we've been talking about, but there's also the RV535, which is an 80nm die shrink of the current X1600s, and then there's the mysterious RV560. On the low end there's also supposed to be an RV505, also an 80nm part, that's supposed to truly replace the X300 since the prices of the X1300 are too high.

But what's the motivation to launch a replacement for the X300 now? The people a replacement would target don't care about gaming 3D features, so I doubt there's any demand/need for ATi to replace it. A die shrink to reduce costs would make a lot of sense, though, but would the savings be that great on a mature chip? Likely skipping 90nm and going straight from 110nm to 80nm, or even better 65nm, is what they are considering. Wisest choice IMO.

Obviously ATI can't bring to market all those parts. They don't even have enough X1xxx product numbers left. Why they decided to choose X13xx, X16xx, and X18xx to start with instead of the more spacious X12xx, X15xx, and X18xx is beyond me.

There's lotsa room left. You forget that there's room for X1x50 parts and even X1x25/75 parts too. I don't see it as a big deal.

The thing is, I don't see how ATI expects to make a mid-range part that can perform 10% faster than the X1800XT. The 80nm process offers them more room, but not that much room for more shaders. With the X1800XT already clocking the core at 625MHz, I don't see the 80nm process offering much more clock speed with sufficient yields for the higher volume mid-range market.

But if their yield ratio is the same @ 650-700MHz, then the die-size savings offer something additional. Add to that the fact that the X1800 no longer matters other than as a legacy part (since the X1900 is much larger and better performing); it's like so many other parts where it gets abandoned. The most important factors for this new chip are cost to produce vis-à-vis the X1900 and X1600, and performance level with respect to both the X1600 and X1900 (which it may or may not replace [it might only shift them down the ladder]).

Going to 256-bit is nice, but with the X1800XT having its RAM at 1500MHz, I don't see a mid-range card having more memory bandwidth than that.

The X1800XT RAM is spec'ed for 800/1600 and the X1900XTX for 900/1800, so some cheaper memory like that found on the X1600XT (1.4ns) would allow it to operate near the same speeds @ 700MHz. For 16x12 resolutions and settings, the 36 shaders might help it outperform despite that slight memory penalty. And in the memory business, the difference between 1.1/1.0ns GDDR3 or GDDR4 (still too young to offer great prices IMO) and 1.4 or even 1.6ns memory is likely enough to warrant the difference for board makers, who bear the brunt of those costs.

What ATI is probably doing is designing the RV570 to compete in the around $200 mark where the 7600GT is,

I doubt that; I'd suspect they are looking slightly higher, with a $249-299 price tag like they initially had in mind for the X700XT. With a card that warrants that price (only achievable with the specs we mention here IMO) it's doable. Bring out another GF7600GT and you would be FORCED to price it at that $199. So if the cost is $5 more to produce and you can leverage that into $50-100 more in MSRP, that would be a worthwhile boost IMO.

but in the $250-$300 mark high-mid range sector to replace the X1800XL.

OK, maybe we're agreeing here. Did you mean they AREN'T targeting the GF7600GT @ $200, but the higher XL (and GTO) market?

The higher end market gives them more money to place with allowing faster memory and more die-space.

the $300 part would need to be clocked upwards of 600MHz, probably 650MHz is reasonable given the 80nm process. Coupled with comparable 1500Mhz RAM, the RV570 would probably still be weaker than the X1800XT in texture ability, but obviously faster in pixel shader ability which is what ATI really cares about.

Not sure if that's 'all they care about', but it is the focus of their design direction.

It of course has less vertex power.

A $250 part could tone down the core to 550MHz and the memory to 1200MHz.

To really compete against the 7600GT, ATI will probably have to use the RV560, whatever it is. However the RV570 is designed, it doesn't seem possible for it to offer 10% more performance than the X1800XT yet still fit a $200 price tag.

Except for the fact that the X800 (R430-based) series had similar pricing for a larger chip. Also remember the chips for that area of the market would be cast-off cripples or overproduction parts, in the same manner as the plain X800.

The RV535 die shrink of the X1600 might offer higher clock speeds, but I don't see how much higher than 590MHz you can go with sufficient yields to overcome the fact that the RV530 has a third the texture units as the 7600GT and half the ROPs. The real issue is of course memory bandwidth. Like the RV570, the RV560 would probably also need to have a 256-bit memory interface.

And therein lies the problem: even with all the memory in the world, the X1600 will still not perform well against the GF7600GT, as shown by its low-res/low-settings performance. Memory alone will not bridge the gap, and they need to beat that gap, since the X800GTO and GF6800GS are already in that segment and will remain for a short while. And if you're paying to add the 512-bit ring bus and the 256-bit support, you might as well pay the premium. IMO, the X1600 replacement is not going to go after the above-X1600XT performance crowd, but the X1600XT down to plain X1600 crowd. The process shrink saves them money; they don't need to increase performance.

Personally, I'd like to see the RV560 stick with the 128-bit interface, but move to higher clock speed GDDR4. Some 2.4GHz or 2.8GHz stuff would certainly relieve the strain just as well as a 256-bit interface but allow ATI to keep die size down and introduce GDDR4 to the market and ensure viability before integrating it into the RV580.

That's true, but the GDDR4 memory you're talking about is too expensive; GDDR4 @ 1066 speeds would be far cheaper and make sense for what you're describing. Moving to 1200+ is too much money for such a cheap part, and availability is unlikely to be high early on, so for such a large-volume part I wouldn't think it wise to start near the top of the speed spectrum. If anything, top-end cards like the G80 will be the ones worthy of such exotic/expensive fast memory, especially when the X1800 and X1900 already support it (supposedly the GF7900 not only doesn't support it, but also cut low-end DDR support [I kinda doubt that, but it would save transistors]). So using GDDR4 on the X1900 as a refresh (à la R9800Pro-256) or on the G80 makes more sense due to likely low volumes at such an early stage. The 800, 900, and 1066 GDDR4 seem more likely to me.

ATI could of course also improve performance through optimization much like nVidia did with the 7900, but I don't think that would lead to a substantial increase in performance.

No, because the optimization wasn't about performance but about saving transistors (cost and power), not speed (other than the small MHz increase due to lower thermals).

The RV530 core also already includes the Fetch4 feature that the X1900 uses to try to relieve its texture bottleneck. I'd be interested to know whether the RV570 has an expanded 512-bit ring bus like the R520 and R580 to complement its expanded 256-bit memory interface, or whether it will just stick with the 256-bit ring bus. The latter may be more likely, to save die space.


And that might be the case, because as much as the ring bus itself is an efficient way to transfer memory, it offers less of a performance boost than the 256-bit interface to the chips. I would hope they wouldn't cripple it, but to save transistors they might, and it would be an acceptable compromise IMO if that's what's required to squeeze the pennies.

[quote]I wonder if ATI also plans to convert the R580 to 80nm? It'll certainly help them control costs and temperatures. Some modest increases in core clock (hopefully to 700MHz) and the addition of GDDR4 would increase its performance nicely. Although 24 texture units would be nice, I don't see ATI backing down from their 3:1 implementation, and it would require too much work when they're focusing on the R600.[/quote]

And that focus IMO is why they wouldn't bother with another refresh regardless of what the G80 brings to the table. If anything, it makes more sense to accelerate the R600 (if it's already ready to go as they say) than to simply bring another R580 to market. That to me would be a mistake, since the G80 alone will have so many paper features as attractors that even a 10-20% performance boost from a new core wouldn't do it. A low-ball-priced R580 with faster memory is the most economical way to approach it IMO: they'd have a proven part with added performance for the time being, possibly priced below the G80 (even as a loss leader), but they have to know that unless the G80 is a flop of some kind (unlikely), they can't compete head to head with it with just a refresh.

A simple die shrink refresh to the X1950 shouldn't be particularly time-consuming and would allow them to definitively put themselves on top of the 7900GTX.

I don't think there's any worry of that right now; simply OC'ing the XTX's exact same memory chips to GTX levels pretty much does that already. A re-shrink would cost a minimum of $4-5 million, which would be a huge waste of money in light of the real competition at that time, the G80, unless they can spin off the benefits to another market like mobile chips, which to me would be the only thing that would justify the spin. Giving away the crown uncontested for a few months and accelerating the R600 to arrive just in time for the fall/Christmas buying season makes much more sense IMO.

In any case, this is mostly speculation on my part, but I kind of just felt like going on a tangent. I hope I'm making some sense.

Lotsa sense, and these are the tangents most of us like: something with no meat, but it makes us all think. I'm sure at least both Cleeve and Action_Man find this interesting; I know I like thinking about what 'might be'.
Of course we could all be wrong, but no one gets points for being right, even if I did notice the SLI bridge and call that one long before they announced it (and kept using those stupid "engineering test connector" excuses). 8)

As always, only time will tell.
 
Another thing they could do is disable pixel shaders and/or vertex shaders if they don't work and use it as a X1400 or whatever its going to be called. So they can still sell the chip and make some money.

Oh definitely. I expect that from every part, really. An eventual X1600SE, X1800SE, etc., if the past is any indicator.

With the clock speed thing I didn't go into enough detail. Perhaps they could take a route similar to xenos, it has about the same shader power as the x1800 but it has the higher clock where xenos has more hardware.

Oh yeah, sorry, they could go that route too, but we're still stuck with a price/performance situation where, barring any true performance issue, adding more complexity would tend to increase size and reduce yield. I think their target would be the smallest die possible. For lower clock frequencies you would target an existing design, since you usually try to sell cards near the max frequency/yield ratio; then, if maturation gives you more chips above the set frequency, you could launch a refresh. But building in lower clocks means a less competitive part than it could be (regardless of competition) that may yield a bit more. Considering the performance levels we're talking about, I don't think the advantage of reducing the bins for a potential 50MHz savings would be good for anything but a part that didn't hit their target.
The problem with this segment is that those added transistors for 256-bit support are almost a bare minimum IMO, so it's almost like a sunk transistor cost where you reach the stage where you have to commit to that transistor/cost bump. I suppose both approaches could work, especially if yields for high-frequency parts were so low (look at the X700XT as an example) that it made more sense to make the larger, more complex part at lower frequencies and sell it for the same price as the part you weren't going to be able to ship enough of to break even.

For the Xenos the requirements are different for two reasons: it has no competition, therefore no need to be the fastest possible, and it's a long-term, non-enthusiast consumer part that they don't want to need expensive cooling, etc. Of course, it looks like the Xbox's overall thermal envelope is higher than they wanted, since there have been so many overheating issues.

I guess in this segment you hope for the maturity of the fab, since you should achieve higher yields as the chip/process matures, and this should have the effect of bringing down costs just when you need to cut them in the cycle. If your yields are already high from the start, you don't have that benefit to look forward to. Of course, even that's a gamble: if the frequencies are high and the maturity doesn't yield much, you may never benefit no matter how long you continue. That would be the worst-case scenario IMO.

Oy, I kinda lost myself there. Hope it made sense; just trying to finish up at work.
 

ltcommander_data

Distinguished
Thanks for your comments. The feedback was appreciated.

For interest's sake, it seems that the RV560 is the same as the RV570, just that the former will be produced at UMC and the latter at TSMC. It may be 80nm, but this really makes me wonder how ATI plans to price their new mainstream part. I think we agree that the RV570/RV560 is probably priced around the $249-299 range, but that once again leaves a gap in the $149-249 segment where the 7600GT sits. Even with the 80nm shrink, crazy clock speeds, and faster RAM, I can't see the RV535 making up much ground against the 7600GT when the latter has a 3x advantage in TMUs, a 2x advantage in ROPs, and a 2x advantage in Z compare units.

http://www.theinquirer.net/?article=30292
 

Action_Man

Splendid
Well, the X1600 is 157 million transistors and ~150mm2, so they're going to need at least another 50 million to get to 200mm2, and since it's on 80nm, throw on 20% more. My point being, perhaps they could take the nVidia route and go for more work done per clock rather than more clocks per second, since they have a lot of transistors to use. Just a possibility I thought of. It sure will be interesting to see what they deliver.