T-bred... for the record

texas_techie

Distinguished
Oct 12, 2001
466
0
18,780
Look, I'm an AMD fan. But I will always try to disseminate the truth. And the truth is, the T-bred will not likely overclock well.
I would take a few pages to explain why. But I hope the folks that know me will trust me on this one. Don't expect too much. Hell, AMD may surprise me. But from everything I've heard, I'm not holding my breath.
If I'm proven wrong, then great. But let's end the speculation about why it hasn't overclocked well. None of you were even close. The core is NOT at its end (like I speculated). There's still some speed left in Athlon.

Now for the good news. Hammer does not have any probs at all. Though it will not debut at the speed I hoped, I am assured it will kick some serious a$$. Cache levels are still being debated.
Finally, I'm really itching to see an 800MHz bus in action. HyperTransport is spec'd to run at that speed. So it should be hella fast.

-later

Benchmarks are like sex, everybody loves doing it, everybody thinks they are good at it.
 

rcj187

Distinguished
Mar 20, 2002
574
0
18,980
Well come on then... what is the reason behind the poor OC ability of the T-bred? You say none of us were even close, which means you must know the reason. I for one am itching to know what the problem is. I had put it down to the core being at the end of its life, but you say it's not... so what is the issue with the damn thing??

I'm out of my mind, but feel free to leave a message.
 
Guest

Guest
the T-bred will not likely overclock well.
no, the TBred is not likely overclocking well.

I would take a few pages to explain why.
i would like as well.

And the truth is, the T-bred will not likely overclock well....
...The core is NOT at its end (like I speculated). There's still some speed left in Athlon.
What do you mean exactly here? Is there a misconception?


:smile: i like toasted cpus but not AMD-inside. :smile:
 

slvr_phoenix

Splendid
Dec 31, 2007
6,223
1
25,780
Finally, I'm really itching to see an 800MHz bus in action. HyperTransport is spec'd to run at that speed. So it should be hella fast.
Well, I'm not sure that I can entirely agree on 'hella fast' there. For home-use systems, sure. So far though, it's just sticking right to spec. When you actually sit down and examine it, an 800MHz HyperTransport ring *sounds* good, but turns out to be completely empty.

The HyperTransport technology, according to what I've read, is 16-bit upstream and 16-bit downstream. At 800MHz, that means that they're only using a DDR implementation to achieve the 6.4GB/s that the primary HyperTransport ring will be using. Which sounds just fine for a home use PC.

But you also have to keep in mind that this 6.4GB/s is actually 3.2GB/s upstream and 3.2GB/s downstream. So only in a best case scenario will 6.4GB/s be realized. Realistically, it's about as likely to be realized as a DSL connection maxing out both the upload and download streams. Usually the majority of data is only flowing one direction at a time. This will mean most data flow will be capped at 3.2GB/s, the maximum capacity of one direction in HyperTransport.
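
For anyone who wants to check that arithmetic, here is a minimal Python sketch. The function name and its parameters are just illustrative; the only inputs are the figures quoted above (16 bits each way, 800MHz, DDR).

```python
def link_bandwidth_gb_s(width_bits, clock_mhz, transfers_per_clock):
    """Peak one-way bandwidth of a parallel link, in GB/s (10^9 bytes/s)."""
    bytes_per_transfer = width_bits / 8
    transfers_per_second = clock_mhz * 1e6 * transfers_per_clock
    return bytes_per_transfer * transfers_per_second / 1e9


# Figures from the post: 16 bits each way, 800MHz, DDR = 2 transfers per clock.
per_direction = link_bandwidth_gb_s(16, 800, 2)
both_directions = 2 * per_direction

print(f"one direction:   {per_direction:.1f} GB/s")
print(f"both directions: {both_directions:.1f} GB/s")
```

That works out to 3.2 GB/s each way, and 6.4 GB/s only if both directions are saturated at the same time.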

Now add in the fact that the AGP card can no longer directly access memory. The memory controller is now on the CPU. AGP 8x, the new emerging standard, has a maximum bandwidth of 2.13GB/s. So for a 1-CPU system, the 3.2GB/s of a HyperTransport stream should be more than enough. But what happens when you have a dualie system? That AGP 'bus' might end up getting squeezed by communication between the two CPUs.
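
A rough sketch of where the 2.13GB/s comes from and how much of one HyperTransport direction it would leave free. The 32-bit bus width and the roughly 66MHz base clock are my assumptions about AGP, not numbers from this thread, and the 3.2GB/s is the one-way figure from above.

```python
AGP_WIDTH_BITS = 32          # assumption: AGP data bus width
AGP_BASE_CLOCK_MHZ = 66.67   # assumption: nominal AGP base clock
HT_ONE_WAY_GB_S = 3.2        # one-way HyperTransport figure used in this post


def agp_peak_gb_s(multiplier):
    """Peak AGP bandwidth in GB/s for a given mode (1x, 2x, 4x, 8x)."""
    bytes_per_transfer = AGP_WIDTH_BITS / 8
    return bytes_per_transfer * AGP_BASE_CLOCK_MHZ * 1e6 * multiplier / 1e9


agp8x = agp_peak_gb_s(8)
headroom = HT_ONE_WAY_GB_S - agp8x

print(f"AGP 8x peak:            {agp8x:.2f} GB/s")
print(f"left on one HT direction: {headroom:.2f} GB/s")
```

So a card running flat out at AGP 8x would leave roughly 1GB/s of one direction for everything else, which is the squeeze a second CPU could run into.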

Take that one step further down the AMD product line though. Picture a quad-CPU 64-bit Opteron running on a 6.4GB/s HyperTransport hub. Each processor has its own memory banks. So if ProcessB in Processor1 needs to access Processor2's memory for ProcessA, that means all that data needs to travel through the HyperTransport hub. Now imagine a massively-used workstation, running numerous processes simultaneously, each using their own memory but also having to access each other's memory. Imagine that this is also in 64-bits, where each 'object' in memory takes up twice as many bytes. Now put an AGP 8x card back into the picture, which is also trying to use up to 2.13GB/s on that 6.4GB/s ring. Is an 800MHz DDR 16-bit bi-directional hub going to cut it? (And does this sound strangely like RDRAM to anyone else here?) Or is it going to become a massive bottleneck for a 64-bit 4-way AGP graphics image render box running well-designed multi-threaded code? Going by specs, it would make image rendering software being run on a 4-way Opteron completely useless, that's for sure. You'd probably be better off trying to use only a 2-way Opteron because otherwise, that HyperTransport will become a massive bottleneck.
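
To put some shape on that worry, here is a toy Python model. It assumes the picture described above: one shared link, with the worst case of all traffic heading the same way, so capacity is the 3.2GB/s one-way figure. The 0.5GB/s of remote-memory traffic per CPU is a made-up number purely for illustration, and `link_utilization` is a hypothetical helper, not anything from AMD.

```python
def link_utilization(cpus, remote_gb_s_per_cpu, agp_gb_s, link_gb_s=3.2):
    """Fraction of a single shared one-way link consumed by the given demand."""
    demand = cpus * remote_gb_s_per_cpu + agp_gb_s
    return demand / link_gb_s


# Hypothetical load: every CPU pulls 0.5 GB/s from other CPUs' memory,
# while an AGP 8x card runs at its 2.13 GB/s peak.
for cpus in (2, 4):
    used = link_utilization(cpus, remote_gb_s_per_cpu=0.5, agp_gb_s=2.13)
    status = "oversubscribed" if used > 1.0 else "fits"
    print(f"{cpus} CPUs: {used:.0%} of the link ({status})")
```

With these invented numbers the dual box just about fits and the quad box does not. Different traffic figures, or a layout that is not one shared link, would change the answer.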

Let us hope that by the time AMD releases the 4-way Opterons, they come up with a better implementation of HyperTransport than just an 800MHz DDR 16-bit bi-directional ring. (Still sounds a lot like RDRAM to me...) Upping the clock would help, but more so than that, making it either QDR or dare I suggest, making it 64-bit (DDR SDRAM's datapaths anyone?) would really give those 4-way Opteron systems the HyperTransport bandwidth that they'll need. Until AMD does something like this for HyperTransport, for 4-way Opterons the 800MHz will be as empty as the 1.6GHz of an Intel 'Wilty' Celeron.
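
Here is what those two suggestions would buy on paper, as a short Python sketch. It holds the clock at 800MHz and only changes transfers per clock or link width. The option names are mine, and none of these configurations are announced AMD products.

```python
def one_way_gb_s(width_bits, clock_mhz, transfers_per_clock):
    """Peak one-way link bandwidth in GB/s (10^9 bytes/s)."""
    return width_bits / 8 * clock_mhz * 1e6 * transfers_per_clock / 1e9


# (width in bits, clock in MHz, transfers per clock); hypothetical options
options = {
    "16-bit DDR (as spec'd)": (16, 800, 2),
    "16-bit QDR": (16, 800, 4),
    "64-bit DDR": (64, 800, 2),
}

results = {name: one_way_gb_s(*cfg) for name, cfg in options.items()}
baseline = results["16-bit DDR (as spec'd)"]

for name, gb_s in results.items():
    print(f"{name:24s} {gb_s:5.1f} GB/s one way ({gb_s / baseline:.0f}x)")
```

QDR doubles the one-way figure to 6.4GB/s, and a 64-bit link quadruples it to 12.8GB/s.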


Tech support said take a screen shot.
Putting it down with my .22 was the humane thing to do.
 

Kelledin

Distinguished
Mar 1, 2001
2,183
0
19,780
But you also have to keep in mind that this 6.4GB/s is actually 3.2GB/s upstream and 3.2GB/s downstream. So only in a best case scenario will 6.4GB/s be realized. Realistically, it's about as likely to be realized as a DSL connection maxing out both the upload and download streams. Usually the majority of data is only flowing one direction at a time. This will mean most data flow will be capped at 3.2GB/s, the maximum capacity of one direction in HyperTransport.
3.2GB/sec is still an enormous amount of bandwidth for a PC, especially considering that it's only going to be used to communicate with AGP and the southbridge.

Now add in the fact that the AGP card can no longer directly access memory. The memory controller is now on the CPU. AGP 8x, the new emerging standard, has a maximum bandwidth of 2.13GB/s. So for a 1-CPU system, the 3.2GB/s of a HyperTransport stream should be more than enough. But what happens when you have a dualie system? That AGP 'bus' might end up getting squeezed by communication between the two CPUs.
Probably not. Decent 3D cards these days are designed so that they don't do much access to main memory. They mainly load the texture and bump maps into extremely fast onboard memory and then just pass basic wireframe geometric data back and forth across the AGP bus.

Now imagine a massively-used workstation, running numerous processes simultaneously, each using their own memory but also having to access each other's memory. Imagine that this is also in 64-bits, where each 'object' in memory takes up twice as many bytes.
This does become a problem if threads are switched back and forth between one CPU and another. Of course, threads don't get switched much between CPUs for reasons much like this--it complicates things. Usually this switching occurs only in two circumstances:

1) One CPU is pathologically loaded, another is spinning its wheels, and the process scheduler can't find enough new work to pass over to the idle CPU. How often does this happen? Not often under a heavy workload. It's most likely to happen when the workload is reaching a lull, in which case the slowdown might not even make a practical difference.

2) A process is being swapped back into memory from disk. In this case, the bottleneck is the disk I/O.
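
If software wants to rule out migration altogether rather than rely on the scheduler, it can pin itself to one CPU. A minimal Python sketch, assuming a Linux-style platform where `os.sched_setaffinity` is available (the helper name `pin_current_process` is made up for this example):

```python
import os


def pin_current_process(cpu):
    """Restrict the calling process to one CPU and return the resulting CPU set.

    Returns None on platforms that do not expose the affinity calls.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    os.sched_setaffinity(0, {cpu})  # pid 0 means the calling process
    return os.sched_getaffinity(0)


if hasattr(os, "sched_getaffinity"):
    # Pick a CPU we are already allowed to run on, then pin to it.
    first_allowed = min(os.sched_getaffinity(0))
    print("now restricted to CPUs:", pin_current_process(first_allowed))
```

Once pinned, the process keeps touching the memory next to the CPU it started on, at the cost of not being able to move when that CPU gets busy.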

Going by specs, it would make image rendering software being run on a 4-way Opteron completely useless, that's for sure.
Not really. Remember, rendering software is often designed to be run on rendering clusters, where each CPU is even more isolated than the CPUs in a 4-way Opteron system. The workload can be split up among several CPUs, and each one can work independently; video memory never has to be touched until the final raytraced image is ready for viewing.

We now return(-1) to an irregular program scheduler.
 

Kemche

Distinguished
Oct 5, 2001
284
0
18,780
None of you were even close.

Then it has to be mfgr problems. Everyone was trying to avoid saying it since there were valid speculations like packaging, core re-arrangement, EOL of Athlon core etc.

Anyways, I would love to know more about hammer and also what's wrong with the T-bred.

KG

"Artificial intelligence is no match for natural stupidity." - Sarah Chambers
 

Kemche

Distinguished
Oct 5, 2001
284
0
18,780
Wow.

I don't know much about HyperTransport, but I knew there is "hype" built into it. Anyways, I think you are forgetting that Claw will have a single-channel DDR controller and Sledge will have a dual-channel DDR controller.

KG

"Artificial intelligence is no match for natural stupidity." - Sarah Chambers
 

IIB

Distinguished
Dec 2, 2001
417
0
18,780
This is stupid.
The situation isn't any better with regular SMP systems, where 4 processors access the same memory directly, splitting the bandwidth further...


This post is best viewed with common sense enabled
 

slvr_phoenix

Splendid
Dec 31, 2007
6,223
1
25,780
3.2GB/sec is still an enormous amount of bandwidth for a PC, especially considering that it's only going to be used to communicate with AGP and the southbridge.
As I said, it's pretty good for a single CPU system. It'll probably even suffice for a dualie. I still have qualms about it in a quad configuration though.

Probably not. Decent 3D cards these days are designed so that they don't do much access to main memory. They mainly load the texture and bump maps into extremely fast onboard memory and then just pass basic wireframe geometric data back and forth across the AGP bus.
And when a scene doesn't change much, that works fine. However every time you have to load new textures, bump maps, vertex arrays, color arrays, etc. into the memory of the graphics card, it takes a massive main memory to AGP transfer. Maybe it's just my company, but our software reloads data like that rather frequently because the point of most of our 3D modelling software is to allow the user to edit data in 3D, so each time the user edits, you have to recalculate and reload at least some of it. Granted, we don't use nearly the memory that image rendering machines would because we have minimal textures and try to keep the vertices down because a lot of our users don't even have hardware acceleration. Still, from people I've talked to, scene changes can really max out that bandwidth.

This does become a problem if threads are switched back and forth between one CPU and another. Of course, threads don't get switched much between CPUs for reasons much like this--it complicates things. Usually this switching occurs only in two circumstances:

1) One CPU is pathologically loaded, another is spinning its wheels, and the process scheduler can't find enough new work to pass over to the idle CPU. How often does this happen? Not often under a heavy workload. It's most likely to happen when the workload is reaching a lull, in which case the slowdown might not even make a practical difference.

2) A process is being swapped back into memory from disk. In this case, the bottleneck is the disk I/O.
Again, maybe my company's software is unique or something, but some of our multi-threaded code really generates a lot of short-lived processes and threads which all contain data taken from a single source (so that they're always using up to date data). Not that our software needs a 4-CPU system (in fact we still debate if it even needs a 2-CPU system) but if our software got into heavy calculations, it would totally kill a 4-way Opteron. And we're toying with the idea of 64-bit for improved accuracy. So far, AMD's HyperThreading has disappointed us. We were hoping for a solution that wouldn't need a massive software rewrite to be usable.

Not really. Remember, rendering software is often designed to be run on rendering clusters, where each CPU is even more isolated than the CPUs in a 4-way Opteron system. The workload can be split up among several CPUs, and each one can work independently; video memory never has to be touched until the final raytraced image is ready for viewing.
A single 4-way Opteron system might do better than a 4 single CPU system render farm, but a 4 dual-Opteron system render farm will probably do just as well or better than a 4 quad-Opteron system render farm.

My point though is that HyperTransport's bandwidth can cause 4-way Opteron systems to scale very badly in at least some (if not many) of the applications that will run on 4-way Opterons. It would be sad to see such a thing happen because the Opteron sounds like a good chip. So I am hoping that AMD sees this possibility and addresses it before 4-way Opterons are released by increasing the bandwidth of HyperTransport in some way, at least on 4-way Opteron boards.


Tech support said take a screen shot.
Putting it down with my .22 was the humane thing to do.
 

slvr_phoenix

Splendid
Dec 31, 2007
6,223
1
25,780
Wow.

I don't know much about HyperTransport, but I knew there is "hype" built into it. Anyways, I think you are forgetting that Claw will have a single-channel DDR controller and Sledge will have a dual-channel DDR controller.
Um, no. I'm not forgetting that because it has nothing to do with HyperTransport.

HyperTransport is the mechanism for the CPUs to talk to each other, access each other's memory banks, and work with the AGP card. The memory controller of the chips has nothing to do with HyperTransport's performance.


Tech support said take a screen shot.
Putting it down with my .22 was the humane thing to do.
 

slvr_phoenix

Splendid
Dec 31, 2007
6,223
1
25,780
This is stupid.
The situation isn't any better with regular SMP systems, where 4 processors access the same memory directly, splitting the bandwidth further...
Except that we already know and expect that particular performance degradation. The combination of each CPU having its own separate memory bank along with HyperTransport however has the potential to alleviate a lot of that.

However, a 4-way Opteron system can be bottlenecked by HyperTransport if AMD doesn't improve HyperTransport's bandwidth. So the separate memory banks would solve one problem, only to recreate an equivalent of the same problem through too little bandwidth in HyperTransport for 4-CPU systems. It seems odd to me that AMD would even risk that. So I hope that by the time they release a quad-Opteron motherboard, they'll have a buffed-up HyperTransport specifically for it to ensure that there is no bottleneck. If they can do that, then 4-way Opteron systems will really kick arse.


Tech support said take a screen shot.
Putting it down with my .22 was the humane thing to do.
 

texas_techie

Distinguished
Oct 12, 2001
466
0
18,780
Well, just so you know. AMD is approaching the quad setup COMPLETELY different than any other platform.

They know the probs with on-die memory controllers and 4-way systems. Latency, access times etc etc. Anyway, they know the probs and limitations and are addressing it. Scratch that, they have already addressed it.

As for why T-bred has issues... 3 words. open forums=bad

Benchmarks are like sex, everybody loves doing it, everybody thinks they are good at it.
 

mbetea

Distinguished
Aug 16, 2001
1,662
0
19,780
yeah i know everyone here gets a little woody from ocing. but i think it's flipping hilarious that i keep seeing people saying amd messed up or whatever. weird, i could've sworn that the new tbred was running at the clockspeed/pr# that they specified, i could be wrong though.

[insert philosophical statement here]
 

IIB

Distinguished
Dec 2, 2001
417
0
18,780
Actually it's not completely different - it's A LOT like what Compaq did with their latest Alpha EV7 processor...
Not very surprising given who worked at Digital on EV7 and who they are working for now ;)


This post is best viewed with common sense enabled
 

eden

Champion
It depends how texas meant it. He might know more than what we know, since he has the AMD contacts, so maybe he's right, or maybe you are!

--
:smile: Intel and AMD sitting under a tree, P-R-O-C-E-S-S-I-N-G! :smile:
 

texas_techie

Distinguished
Oct 12, 2001
466
0
18,780
hey iiB,
do you know how AMD is going to lay out the chipsets for the 4-way systems?
If so, I would love to hear what you think it is. Cuz unless you're real sure about the 4-way architecture, how do you know it's the same as Compaq's?
So.. if you can guess how a 4-way Opteron system will get around the potential bottlenecks of HyperTransport, I'll give you $50. Good luck

Benchmarks are like sex, everybody loves doing it, everybody thinks they are good at it.
 

bront

Distinguished
Oct 16, 2001
2,122
0
19,780
Ultimately Slvr, all you've said here is speculation on your admittedly limited knowledge on how AMD will implement the Quad Opteron.

They may have a HT bus between each CPU in anything above a dual. They may increase the bandwidth. They might decide to build all 4 CPUs into one huge core. (Am I even close Texas?) I'm sure we'll find out in time.

I remember reading something about there being double the bandwidth in HT on Opteron systems, but I could be wrong.

"Meesa thinks that yousa gonna die" - Darth Darth Binks
 

eden

Champion
4CPUs in one core? Stacked or wide spread? If wide man that'd take pins... around 900 for one Opty, *4, Socket 3600 woohoo! :lol:

--
:smile: Intel and AMD sitting under a tree, P-R-O-C-E-S-S-I-N-G! :smile:
 

bront

Distinguished
Oct 16, 2001
2,122
0
19,780
I never said it was a GOOD idea :tongue:

It'd be a socket 3600 frying pan with a heat spreader :wink:

I don't think stacking would work either. You'd have to mount the HSF strangely, perhaps hanging out of the case.

"Meesa thinks that yousa gonna die" - Darth Darth Binks
 

slvr_phoenix

Splendid
Dec 31, 2007
6,223
1
25,780
Ultimately Slvr, all you've said here is speculation on your admittedly limited knowledge on how AMD will implement the Quad Opteron.
My admittedly limited knowledge? I readily admit that I don't know everything in the entire world. From that perspective I suppose I do admit that I work from a limited knowledge base. However, I think I'm pretty darn up to date on AMD's HyperThreading. Of course, I'm not an AMD engineer, so there may be unreleased information that no one has that I could be missing, but the same would be true of anyone else here since that knowledge would probably at *least* be under a Non-Disclosure Agreement anyway and therefore they couldn't talk about it even if they had that knowledge.

They may have a HT bus between each CPU in anything above a dual.
Perhaps you've missed the bus (so to speak) but HyperThreading *IS* the bus between each CPU in all Opteron systems.

They may increase the bandwidth.
That's *exactly* what I have been saying that I hope they do!

They might decide to build all 4 CPUs into one huge core. (Am I even close Texas?)
Somehow, I doubt it. Besides the fact that it would destroy their entire design layout for the platform and almost completely negate the point of even having HyperThreading, it would also be insane to put all of the CPUs right next to each other. Talk about hotter than hell...

I'm sure we'll find out in time.
Exactly! All I've been saying is that by looking at the specs, HyperThreading could be a small to massive bottleneck for quad-Opterons and I *hope* that AMD does something to fix that by the time they release quad-Opteron motherboards.

I remember reading something about there being double the bandwidth in HT on Opteron systems, but I could be wrong.
If there was (by implementing QDR instead of DDR for example) then that would solve the problem nicely and I would be glad to see it.

HyperThreading sounds cool, and *is* cool *for single CPU systems and probably most dualies*. (64-bit dual operation might find that HyperThreading is a minor bottleneck.) However, if nothing changes it has the *potential* to be a bottleneck for a quad-Opteron system. Just look at the data on HyperThreading so far and it's all there, clear as day.

Ultimately, all I'm saying is: I HOPE THAT AMD DOES SOMETHING TO IMPROVE HYPERTHREADING'S BANDWIDTH BY THE TIME THAT THEY RELEASE A 4-WAY OPTERON.

If anyone got *anything* else out of what I've said, then they're reading more into it than what I put there.


Tech support said take a screen shot.
Putting it down with my .22 was the humane thing to do.
 

slvr_phoenix

Splendid
Dec 31, 2007
6,223
1
25,780
Well, just so you know. AMD is approaching the quad setup COMPLETELY different than any other platform.
Gee, I'd have never guessed by their use of 4 memory banks and HyperThreading...

They know the probs with on-die memory controllers and 4-way systems.
I was *trying* to avoid even bringing up the massive difference in bandwidth between each individual memory controller and HyperThreading's bandwidth and how there's no way in hell that HyperThreading in its current design could ever hope to supply that kind of bandwidth between 4 CPUs if they all needed to access each other's memory banks simultaneously. But since you bring it up, do *you* have any words of wisdom on what AMD thinks about this without copping out with something akin to "I'm not at liberty to say."?

Latency, access times etc etc.
It is my understanding that the on-die controllers are to reduce latency and access times (being generally the same thing).

Anyway, they know the probs and limitations and are addressing it. Scratch that, they have already addressed it.
Well I sure hope so, because I can see these problems and limitations a mile away and I would severely hope that this means that AMD can see them too. I hope even more that AMD addresses these problems with reasonable solutions to provide us with a quality product.

So far, all I've said is that if you look at the specs, it is clear to see problems are there, and I *hope* that AMD addresses and/or fixes those problems before the quad-Opteron motherboards are released.

Am I wrong to think that AMD engineers might actually be smart enough to see this as well?

Am I wrong to *hope* that the engineers do something about it in the many months that they have before 4-way Opteron systems are released?


Tech support said take a screen shot.
Putting it down with my .22 was the humane thing to do.
 

ritesh_laud

Distinguished
Nov 16, 2001
456
1
18,780
However, I think I'm pretty darn up to date on AMD's HyperThreading
Apparently even more up to date than Jerry Sanders himself, since AMD doesn't plan to implement HyperThreading for a good long while. I think you meant *HyperTransport* through your last two posts :=)

Ritesh

Edited by ritesh_laud on 06/19/02 10:14 AM.
 

slvr_phoenix

Splendid
Dec 31, 2007
6,223
1
25,780
Oops. Oh yeah. Thanks. It's just already been a loooooooooong morning trying to debug some unstable multi-threaded code. I must have displaced my thoughts from one task to another. Sorry about the confusion everyone.

I'd change the posts, but they're good for a laugh. :)

(And at this point, I could really use something to laugh at.)

At least I think I've got the bug nailed though. Evil CStrings...

But anywho, again, sorry for the confusion. I did mean HyperTransport, not HyperThreading. Oh the joys of being an over-worked under-paid programmer working on someone else's poorly-documented code...

[drones into chant]
I love my job. I love my job. I love my job.
[end drone]

Tech support said take a screen shot.
Putting it down with my .22 was the humane thing to do.