Look, I'm an AMD fan. But I will always try to disseminate the truth. And the truth is, the T-bred will not likely overclock well.
I could take a few pages to explain why. But I hope the folks that know me will trust me on this one. Don't expect too much. Hell, AMD may surprise me. But from everything I've heard, I'm not holding my breath. If I'm proven wrong, then great. But let's end the speculation about why it hasn't overclocked well. None of you were even close. The core is NOT at its end (like I speculated). There's still some speed left in the Athlon.
Now for the good news. Hammer does not have any problems at all. Though it will not debut at the speed I hoped, I am assured it will kick some serious a$$. Cache levels are still being debated. Finally, I'm really itching to see an 800MHz bus in action. HyperTransport is spec'd to run at that speed. So it should be hella fast.
-later
Benchmarks are like sex, everybody loves doing it, everybody thinks they are good at it.
Well, come on then. What is the reason behind the poor OC ability of the T-bred? You say none of us were even close, which means you must know the reason. I for one am itching to know what the problem is. I had put it down to the core being at the end of its life, but you say it's not. So what is the issue with the damn thing?
I'm out of my mind, but feel free to leave a message.
And the truth is, the T-bred will not likely overclock well....
...The core is NOT at its end (like I speculated). There's still some speed left in the Athlon.
What exactly do you mean here? Is there a misconception?
Finally, I'm really itching to see an 800MHz bus in action. HyperTransport is spec'd to run at that speed. So it should be hella fast.
Well, I'm not sure that I can entirely agree on 'hella fast' there. For home-use systems, sure. So far though, it's just sticking right to spec. When you actually sit down and examine it, an 800MHz HyperTransport ring sounds good, but turns out to be completely empty.
The HyperTransport technology, according to what I've read, is 16-bit upstream and 16-bit downstream. At 800MHz, that means that they're only using a DDR implementation to achieve the 6.4GB/s that the primary HyperTransport ring will be using. Which sounds just fine for a home-use PC.
But you also have to keep in mind that this 6.4GB/s is actually 3.2GB/s upstream and 3.2GB/s downstream. So only in a best case scenario will 6.4GB/s be realized. Realistically, it's about as likely to be realized as a DSL connection maxing out both the upload and download streams. Usually the majority of data is only flowing one direction at a time. This will mean most data flow will be capped at 3.2GB/s, the maximum capacity of one direction in HyperTransport.
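For anyone who wants to check the arithmetic, here is a quick sketch in Python. It only uses the figures quoted above (16 bits each way, 800MHz, two transfers per clock); the variable names are mine, and GB here means 10^9 bytes.

```python
# Figures as quoted in the post: 16-bit link in each direction,
# 800MHz clock, DDR (two transfers per clock cycle).
width_bits = 16
clock_hz = 800e6
transfers_per_clock = 2

# Peak bandwidth of ONE direction, in GB/s (1 GB = 10^9 bytes).
per_direction_gb_s = (width_bits / 8) * clock_hz * transfers_per_clock / 1e9

# The headline figure adds upstream and downstream together.
aggregate_gb_s = 2 * per_direction_gb_s

print(f"per direction: {per_direction_gb_s:.1f} GB/s")
print(f"aggregate:     {aggregate_gb_s:.1f} GB/s")
```

So the 6.4GB/s number only shows up when both directions are busy at the same time; a one-way stream tops out at half of that.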
Now add in the fact that the AGP card can no longer directly access memory. The memory controller is now on the CPU. AGP 8x, the new emerging standard, has a maximum bandwidth of 2.13GB/s. So for a 1-CPU system, the 3.2GB/s of a HyperTransport stream should be more than enough. But what happens when you have a dualie system? That AGP 'bus' might end up getting squeezed by communication between the two CPUs.
Take that one step further down the AMD product line though. Picture a quad-CPU 64-bit Opteron running on a 6.4GB/s HyperTransport hub. Each processor has its own memory banks. So if ProcessB in Processor1 needs to access Processor2's memory for ProcessA, that means all that data needs to travel through the HyperTransport hub. Now imagine a massively-used workstation, running numerous processes simultaneously, each using their own memory but also having to access each other's memory. Imagine that this is also in 64-bits, where each 'object' in memory takes up twice as many bytes. Now put an AGP 8x card back into the picture, which is also trying to use up to 2.13GB/s on that 6.4GB/s ring. Is an 800MHz DDR 16-bit bi-directional hub going to cut it? (And does this sound strangely like RDRAM to anyone else here?) Or is it going to become a massive bottleneck for a 64-bit 4-way AGP graphics image render box running well-designed multi-threaded code? Going by specs, it would make image rendering software being run on a 4-way Opteron completely useless, that's for sure. You'd probably be better off trying to use only a 2-way Opteron because otherwise, that HyperTransport will become a massive bottleneck.
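To put rough numbers on that worry, here is a toy model in Python. It is a sketch only: the per-CPU remote-memory traffic figures are invented for illustration, and it assumes the worst case where all of that traffic plus the AGP stream has to share one direction of a single link. How the real 4-way boards will route traffic is not something this models.

```python
# One direction of the 16-bit, 800MHz DDR link discussed above, in GB/s.
LINK_CAPACITY_GB_S = 3.2

# AGP 8x peak: 32-bit (4-byte) bus, 66.67MHz base clock, 8 transfers per clock.
agp_8x_gb_s = 4 * 66.67e6 * 8 / 1e9   # about 2.13 GB/s

def utilization(demands_gb_s, capacity_gb_s=LINK_CAPACITY_GB_S):
    """Fraction of the link's capacity requested; above 1.0 means oversubscribed."""
    return sum(demands_gb_s) / capacity_gb_s

# HYPOTHETICAL workload: each of 4 CPUs pulls 0.5 GB/s from another CPU's memory.
remote_traffic_gb_s = [0.5, 0.5, 0.5, 0.5]

load = utilization(remote_traffic_gb_s + [agp_8x_gb_s])
print(f"requested / available = {load:.2f}")
```

With these made-up inputs the request exceeds what one direction can carry. Change the traffic list and the picture changes, which is really the point: whether the link is a bottleneck depends entirely on how much cross-CPU traffic the software generates.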
Let us hope that by the time AMD releases the 4-way Opterons, they come up with a better implementation of HyperTransport than just an 800MHz DDR 16-bit bi-directional ring. (Still sounds a lot like RDRAM to me...) Upping the clock would help, but more than that, they could make it QDR or, dare I suggest, make it 64-bit (DDR SDRAM's datapaths, anyone?) and really give those 4-way Opteron systems the HyperTransport bandwidth that they'll need. Until AMD does something like this for HyperTransport, for 4-way Opterons the 800MHz will be as empty as the 1.6GHz of an Intel 'Wilty' Celeron.
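A rough comparison of those options, sketched in Python. The helper function and the variant list are my own illustration; in particular the 1600MHz clock is just an example of "upping the clock", not a figure from any spec.

```python
def peak_gb_s(width_bits, clock_mhz, transfers_per_clock):
    """Peak one-direction bandwidth in GB/s (1 GB = 10^9 bytes)."""
    return (width_bits / 8) * clock_mhz * 1e6 * transfers_per_clock / 1e9

# (width in bits, clock in MHz, transfers per clock)
variants = {
    "as discussed (16-bit, 800MHz, DDR)": (16, 800, 2),
    "doubled clock (16-bit, 1600MHz, DDR)": (16, 1600, 2),  # hypothetical clock
    "QDR (16-bit, 800MHz, 4 per clock)": (16, 800, 4),
    "wide link (64-bit, 800MHz, DDR)": (64, 800, 2),
}

results = {name: peak_gb_s(*params) for name, params in variants.items()}
for name, gb_s in results.items():
    print(f"{name:40s} {gb_s:5.1f} GB/s per direction")
```

Doubling the clock or going QDR each doubles the per-direction figure, while widening to 64 bits quadruples it.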
Tech support said take a screen shot.
Putting it down with my .22 was the humane thing to do.
But you also have to keep in mind that this 6.4GB/s is actually 3.2GB/s upstream and 3.2GB/s downstream. So only in a best case scenario will 6.4GB/s be realized. Realistically, it's about as likely to be realized as a DSL connection maxing out both the upload and download streams. Usually the majority of data is only flowing one direction at a time. This will mean most data flow will be capped at 3.2GB/s, the maximum capacity of one direction in HyperTransport.
3.2GB/sec is still an enormous amount of bandwidth for a PC, especially considering that it's only going to be used to communicate with AGP and the southbridge.
Quote :
Now add in the fact that the AGP card can no longer directly access memory. The memory controller is now on the CPU. AGP 8x, the new emerging standard, has a maximum bandwidth of 2.13GB/s. So for a 1-CPU system, the 3.2GB/s of a HyperTransport stream should be more than enough. But what happens when you have a dualie system? That AGP 'bus' might end up getting squeezed by communication between the two CPUs.
Probably not. Decent 3D cards these days are designed so that they don't do much access to main memory. They mainly load the texture and bump maps into extremely fast onboard memory and then just pass basic wireframe geometric data back and forth across the AGP bus.
Quote :
Now imagine a massively-used workstation, running numerous processes simultaneously, each using their own memory but also having to access each other's memory. Imagine that this is also in 64-bits, where each 'object' in memory takes up twice as many bytes.
This does become a problem if threads are switched back and forth between one CPU and another. Of course, threads don't get switched much between CPUs for reasons much like this--it complicates things. Usually this switching occurs only in two circumstances:
1) One CPU is pathologically loaded, another is spinning its wheels, and the process scheduler can't find enough new work to pass over to the idle CPU. How often does this happen? Not often under a heavy workload. It's most likely to happen when the workload is reaching a lull, in which case the slowdown might not even make a practical difference.
2) A process is being swapped back into memory from disk. In this case, the bottleneck is the disk I/O.
Quote :
Going by specs, it would make image rendering software being run on a 4-way Opteron completely useless, that's for sure.
Not really. Remember, rendering software is often designed to be run on rendering clusters, where each CPU is even more isolated than the CPUs in a 4-way Opteron system. The workload can be split up among several CPUs, and each one can work independently; video memory never has to be touched until the final raytraced image is ready for viewing.
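The kind of split described above can be sketched like this in Python. The renderer here is a stand-in I made up (each pixel depends only on its own coordinates), and the bands are processed one after another just to keep the example short; in a real setup each band would go to its own CPU or machine.

```python
def split_rows(height, workers):
    """Split image rows into contiguous, near-equal bands, one per worker."""
    base, extra = divmod(height, workers)
    bands, start = [], 0
    for i in range(workers):
        size = base + (1 if i < extra else 0)  # first `extra` bands get one more row
        bands.append((start, start + size))
        start += size
    return bands

def render_band(band, width):
    """Placeholder renderer: needs nothing from any other band."""
    first_row, end_row = band
    return [[(x * y) % 256 for x in range(width)] for y in range(first_row, end_row)]

bands = split_rows(10, 4)   # [(0, 3), (3, 6), (6, 8), (8, 10)]

# Each band is rendered independently; results are only stitched together at the end.
image = [row for band in bands for row in render_band(band, 8)]
print(bands, len(image))
```

Because no band reads another band's data, the only cross-CPU traffic in this sketch is collecting the finished rows.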
We now return(-1) to an irregular program scheduler.
Then it has to be manufacturing problems. Everyone was trying to avoid saying it since there were valid speculations like packaging, core re-arrangement, EOL of the Athlon core, etc.
Anyway, I would love to know more about Hammer and also what's wrong with the T-bred.
KG
"Artificial intelligence is no match for natural stupidity." - Sarah Chambers
I don't know much about HyperTransport, but I know there is "hype" built into it. Anyway, I think you are forgetting that Claw will have a single-channel DDR controller and Sledge will have a dual-channel DDR controller.
KG
"Artificial intelligence is no match for natural stupidity." - Sarah Chambers
This is stupid. The situation isn't any better with regular SMP systems, where 4 processors access the same memory directly, splitting the bandwidth further...
This post is best viewed with common sense enabled
3.2GB/sec is still an enormous amount of bandwidth for a PC, especially considering that it's only going to be used to communicate with AGP and the southbridge.
As I said, it's pretty good for a single CPU system. It'll probably even suffice for a dualie. I still have qualms about it in a quad configuration though.
Quote :
Probably not. Decent 3D cards these days are designed so that they don't do much access to main memory. They mainly load the texture and bump maps into extremely fast onboard memory and then just pass basic wireframe geometric data back and forth across the AGP bus.
And when a scene doesn't change much, that works fine. However, every time you have to load new textures, bump maps, vertex arrays, color arrays, etc. into the memory of the graphics card, it takes a massive main memory to AGP transfer. Maybe it's just my company, but our software reloads data like that rather frequently, because the point of most of our 3D modelling software is to allow the user to edit data in 3D. Each time the user edits, you have to recalculate and reload at least some of it. Granted, we don't use nearly the memory that image rendering machines would, because we have minimal textures and try to keep the vertices down since a lot of our users don't even have hardware acceleration. Still, from people I've talked to, scene changes can really max out that bandwidth.
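A back-of-the-envelope way to see how reloads eat bandwidth, sketched in Python. The 64MB reload size is a number I picked purely for illustration; the 2.13GB/s figure is the AGP 8x peak quoted earlier in the thread, and real transfers will not hit peak.

```python
AGP_8X_GB_S = 2.13   # peak AGP 8x bandwidth as quoted in the thread
reload_mb = 64       # HYPOTHETICAL amount of data re-sent per scene change

# Best-case time to push one reload across the bus (1 MB = 10^6 bytes).
seconds_per_reload = reload_mb * 1e6 / (AGP_8X_GB_S * 1e9)

# Reloads per second at which the bus is completely saturated.
max_reloads_per_second = 1 / seconds_per_reload

print(f"one reload: {seconds_per_reload * 1000:.1f} ms at peak")
print(f"bus saturates at about {max_reloads_per_second:.0f} reloads per second")
```

At that assumed size, one reload per frame would already sit close to the bus limit at typical frame rates. Smaller edits that reload only part of the scene would leave far more headroom.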
Quote :
This does become a problem if threads are switched back and forth between one CPU and another. Of course, threads don't get switched much between CPUs for reasons much like this--it complicates things. Usually this switching occurs only in two circumstances:
1) One CPU is pathologically loaded, another is spinning its wheels, and the process scheduler can't find enough new work to pass over to the idle CPU. How often does this happen? Not often under a heavy workload. It's most likely to happen when the workload is reaching a lull, in which case the slowdown might not even make a practical difference.
2) A process is being swapped back into memory from disk. In this case, the bottleneck is the disk I/O.
Again, maybe my company's software is unique or something, but some of our multi-threaded code really generates a lot of short-lived processes and threads which all contain data taken from a single source (so that they're always using up-to-date data). Not that our software needs a 4-CPU system (in fact we still debate if it even needs a 2-CPU system), but if our software got into heavy calculations, it would totally kill a 4-way Opteron. And we're toying with the idea of 64-bit for improved accuracy. So far, AMD's HyperTransport has disappointed us. We were hoping for a solution that wouldn't need a massive software rewrite to be usable.
Quote :
Not really. Remember, rendering software is often designed to be run on rendering clusters, where each CPU is even more isolated than the CPUs in a 4-way Opteron system. The workload can be split up among several CPUs, and each one can work independently; video memory never has to be touched until the final raytraced image is ready for viewing.
A single 4-way Opteron system might do better than a 4 single CPU system render farm, but a 4 dual-Opteron system render farm will probably do just as well or better than a 4 quad-Opteron system render farm.
My point though is that HyperTransport's bandwidth can cause 4-way Opteron systems to scale very badly in at least some (if not many) of the applications that will run on 4-way Opterons. It would be sad to see such a thing happen because the Opteron sounds like a good chip. So I am hoping that AMD sees this possibility and addresses it before 4-way Opterons are released by increasing the bandwidth of HyperTransport in some way, at least on 4-way Opteron boards.
Tech support said take a screen shot.
Putting it down with my .22 was the humane thing to do.
I don't know much about HyperTransport, but I know there is "hype" built into it. Anyway, I think you are forgetting that Claw will have a single-channel DDR controller and Sledge will have a dual-channel DDR controller.
Um, no. I'm not forgetting that because it has nothing to do with HyperTransport.
HyperTransport is the mechanism for the CPUs to talk to each other, access each other's memory banks, and work with the AGP card. The memory controller of the chips has nothing to do with HyperTransport's performance.
Tech support said take a screen shot.
Putting it down with my .22 was the humane thing to do.
This is stupid. The situation isn't any better with regular SMP systems, where 4 processors access the same memory directly, splitting the bandwidth further...
Except that we already know and expect that particular performance degradation. The combination of each CPU having its own separate memory bank along with HyperTransport, however, has the potential to alleviate a lot of that.
However, a 4-way Opteron system can be bottlenecked by HyperTransport if AMD doesn't improve HyperTransport's bandwidth. So the separate memory banks would solve one problem, only to recreate an equivalent problem through too little HyperTransport bandwidth for 4-CPU systems. It seems odd to me that AMD would even risk that. So I hope that by the time they release a quad-Opteron motherboard, they'll have a buffed-up HyperTransport specifically for it to ensure that there is no bottleneck. If they can do that, then 4-way Opteron systems will really kick arse.
Tech support said take a screen shot.
Putting it down with my .22 was the humane thing to do.
Well, just so you know, AMD is approaching the quad setup COMPLETELY differently than any other platform.
They know the problems with on-die memory controllers and 4-way systems: latency, access times, etc. Anyway, they know the problems and limitations and are addressing them. Scratch that, they have already addressed them.
As for why T-bred has issues... 3 words. open forums=bad
Benchmarks are like sex, everybody loves doing it, everybody thinks they are good at it.
Yeah, I know everyone here gets a little woody from OCing. But I think it's flipping hilarious that I keep seeing people saying AMD messed up or whatever. Weird, I could've sworn that the new T-bred was running at the clockspeed/PR# that they specified. I could be wrong though.
Actually, it's not completely different. It's a lot like what Compaq did with their latest Alpha EV7 processor...
Not very surprising, given who worked at Digital on EV7 and who they are working for now.
This post is best viewed with common sense enabled
hey iiB,
Do you know how AMD is going to lay out the chipsets for the 4-way systems?
If so, I would love to hear what you think it is. Because unless you're really sure about the 4-way architecture, how do you know it's the same as Compaq's? So, if you can guess how a 4-way Opteron system will get around the potential bottlenecks of HyperTransport, I'll give you $50. Good luck.
Benchmarks are like sex, everybody loves doing it, everybody thinks they are good at it.
Ultimately, Slvr, all you've said here is speculation based on your admittedly limited knowledge of how AMD will implement the quad Opteron.
They may have a HT bus between each CPU in anything above a dual. They may increase the bandwidth. They might decide to build all 4 CPUs into one huge core. (Am I even close Texas?) I'm sure we'll find out in time.
I remember reading something about there being double the bandwidth in HT on Opteron systems, but I could be wrong.
"Meesa thinks that yousa gonna die" - Darth Darth Binks