
AMD to bring SMT/CMT Solution to Bulldozer

August 29, 2010 4:00:38 PM

I will not be bringing you any new information really, but the pieces are there. Just read.

AMD will likely implement their own form of SMT in Bulldozer, and it will be close to the CMT solution rumored early on in the year.

I'm sure those of you reading this are anxiously awaiting legit new information on Bulldozer that can give us a better guess on what kind of performance to expect, and there was a lot of talk about AMD finally implementing SMT, but in their own way. That way was rumored to be CMT (Cluster-based Multi Threading). I will assume that those reading here are at least a little familiar with SMT and CMT. (If not, google is your friend)
As I said, there were many rumors going on early this year with speculation about AMD using CMT, and how it would be done and perform. But with new information released, AMD hasn't given us a definitive answer, but they have strongly indicated that there will be 1 thread per core. I was disappointed when I read that. Until I realized something...

Well, it seems like BD will have a form of SMT, but it will not run more threads than cores. Oh, and the performance of the SMT solution should be much more scalable than Intel's when relating threads to performance. So, how will AMD do this?

It's very simple actually. Readers here know very little about BD really, but that's because AMD is very tight-lipped about this new arch, and for good reason. It will be a monster! (Imho)
BD will feature up to 8 cores per CPU for desktop, which means 4 modules with 2 cores per module. AMD has also stated outright that each core will run a single thread, no more. So how could they do SMT/CMT? The answer could easily be found in Thuban.

Thuban is the 6 core desktop chip in the Phenom II line of CPUs from AMD, and currently AMD's flagship line. Each 6 core CPU has some new features not found in Phenom II X4 CPUs. One of these features is an early version of Turbo Core (basically Cool'n'Quiet in reverse). This is where 3 or more cores basically shut down when not being used so that the few cores actually in use can be overclocked for better low-thread performance. Basically, if you need to run 2 threads, rather than waste 4 of those 6 by doing nothing, you shut idle cores to overclock the used cores, thus increasing performance inside the same power consumption and thermal levels. This is where the modular design of BD, along with the Turbo Core feature, come together to form AMD's very nice solution to SMT.
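
The Turbo Core idea described above can be sketched as a simple rule: if enough cores are idle, the busy ones get the higher clock. This is only a toy model of that description, not AMD's actual algorithm. The function name and every number in it (6 cores, 3.2/3.6 GHz, 3 idle cores as the threshold) are illustrative assumptions.

```python
def turbo_clock_ghz(busy_cores, total_cores=6, base_ghz=3.2,
                    turbo_ghz=3.6, min_idle_for_turbo=3):
    """Clock applied to the busy cores under a simple idle-count rule.

    All default numbers are illustrative assumptions, not vendor specs.
    """
    if not 0 <= busy_cores <= total_cores:
        raise ValueError("busy_cores must be between 0 and total_cores")
    idle_cores = total_cores - busy_cores
    # Boost only when something is running and enough cores are idle
    # to free up power and thermal headroom.
    if busy_cores > 0 and idle_cores >= min_idle_for_turbo:
        return turbo_ghz
    return base_ghz


for busy in (2, 3, 4, 6):
    print(busy, "busy cores ->", turbo_clock_ghz(busy), "GHz")
```

With these assumed defaults, 2 or 3 busy cores get the turbo clock, while 4 or more fall back to the base clock.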

As we all know, CPU's have hit a speed wall at about 4GHZ, so we then decided that the best way to achieve more performance, is to add multiple cores. The problem though, is that not all software is threaded to take advantage of the extra cores, so the performance is stuck at the number of cores/threads and speed of those cores.

This is where it gets exciting to me.
AMD is clearly aware that software for mainstream applications is not threaded beyond a few cores, and in most cases a quad core is more than you will ever need. This will be the case for some time, till higher core count CPUs become the norm and software vendors are forced to thread more to increase performance (no more relying on hardware alone to get performance). Bulldozer won't bring us 16 threads per 8 core CPU, but it could in theory run 8 threads at Turbo Core speeds. This is where it gets blurry in describing this to the average Joe. Sure, an 8 core CPU can run 8 threads anyway, so why should I care, right? Let's say there's a BD 8 core 2.8ghz CPU with a turbo rating of 3.4ghz. In standard practice the AMD CPU would use its 8 real cores to process 8 threads. AMD's SMT/CMT solution will, in a way, run 8 threads on 4 turbo'd cores. Huh?

Since each BD module shares its resources, and the resources allow for 2 threads, they can "turn off" one core in each module, turbo the core, and run 2 threads across (because each BD module will have the resources to accommodate 2 cores remember). This means that rather than run 8 threads on 8 cores at 2.8ghz, you'll basically be running 8 threads on 4 cores at 3.4ghz, in similar power consumption and thermal levels.

So with software playing catch-up in threading to accommodate higher core count CPUs, BD will launch at a time when quad core thread support should be on its way to becoming the norm. Software that uses more than 4 cores will likely be rare and used by far fewer people. AMD needs to target the maximum number of customers to maximize their profit and gain market share.

If benchmarking software is threaded the way it is seen today, benchmarks might favor SB noticeably, but in real world performance and in the 2-4 thread "sweet spot" BD's modular design and new form of Turbo Core should be more than a match for Intel. SB isn't doing anything major in terms of changes from Nehalem, so performance increases will likely be modest. Bulldozer on the other hand is a completely new arch with a very unique way of doing things. And if everything goes well for AMD, this should be the answer they were looking for.

And yes, I am aware that many of you have likely figured this out and I am not the first to post this, but to be honest, I haven't been able to come across anything myself that spells it out like this. Correct me if I am wrong. But for those of you who didn't see this info elsewhere, I hope this has been a good read.

No new info so no sources, all facts above are pretty common now, only my opinion of the BD SMT/CMT implementation is any different, and that's just what I think. So the only source for new info is my mind. Yay!

------------------------

Edit> After thinking about this I asked myself, "then why would they even run it at 8 core stock speeds?" Well I think that is due to power consumption/thermal levels. When I first wrote this post I had it in my head that both modes would use basically the same power and generate about the same heat. I don't think this will be the case however. After thinking about it I thought this: Maybe running at stock speeds they can maintain a lower TDP and produce less heat. But, by using the Turbo mentioned above, one core is almost turned off, only doing what it needs to do to allow the module to still process 2 threads. The other core which will be the main functioning core in the module in Turbo will bring in extra power and generate more heat. This would be a reasonable explanation for having 2 modes. Stock to pass 8 threads at lower speeds with lower TDP and heat (I'll use 95w as an ex.) And Turbo mode will be a much faster mode but will increase TDP and heat(125w for ex.). A CPU that can be tuned for the job at hand, sounds like the best of both worlds.

Bulldozer will be the Jack of all Trades, but the Master of None.
August 29, 2010 5:17:19 PM

you shut idle cores to overclock the used cores, thus increasing performance inside the same power consumption and thermal levels. This is where the modular design of BD, along with the turbo core feature come together to form AMD's very nice solution to SMT.
How is that remotely related to SMT?

Since each BD module shares its resources, and the resources allow for 2 threads, they can "turn off" one core in each module, turbo the core, and run 2 threads across (because each BD module will have the resources to accommodate 2 cores remember). This means that rather than run 8 threads on 8 cores at 2.8ghz, you'll basically be running 8 threads on 4 cores at 3.4ghz, in the same power consumption and thermal levels.
In standard practice the AMD CPU would use its 8 real cores to process 8 threads. AMD's SMT/CMT solution will, in a way, run 8 threads on 4 turbo'd cores. Huh?
I'd love to see how you explain that being able to increase performance with 8 threads. Sounds to me like it would decrease performance vs 8 threads on 8 cores. If I am wrong, then what is the point of having 8 cores and why doesn't BD just always run in the way you describe? Because according to you, it would increase performance in both multi-threading and single threading.
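
The objection above can be put in rough numbers. This is a minimal sketch, assuming throughput scales with clock speed only (which ignores IPC, contention for shared module resources, and memory). It uses the hypothetical 2.8ghz stock / 3.4ghz turbo figures from the original post. The function name is made up for illustration.

```python
def aggregate_clock_ghz(active_cores, clock_ghz):
    """Naive throughput proxy: number of active cores times their clock.

    Assumes perfect scaling and identical IPC, which real CPUs do not have.
    """
    return active_cores * clock_ghz


stock = aggregate_clock_ghz(8, 2.8)   # 8 threads on 8 cores at stock
turbo = aggregate_clock_ghz(4, 3.4)   # 8 threads squeezed onto 4 turbo'd cores
print(f"stock: {stock:.1f} core-GHz, turbo: {turbo:.1f} core-GHz")
print(f"turbo / stock = {turbo / stock:.2f}")
```

Under this crude model the 4 turbo'd cores deliver roughly 60% of the aggregate clock of the 8 stock cores, so the turbo configuration only helps when there are fewer runnable threads than cores.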

As we all know, CPU's have hit a speed wall at about 4GHZ
What are you talking about? The "wall" was about 2.5GHz when AMD came out with their first dual cores. About 3.0GHz in 2007 and about 4.0GHz today. That means there is no wall so far. Also, GHz is not really a measure of performance; it is only related. Don't forget IPC, which has been improved at a much faster rate than clockspeed.

Maybe I'm just misunderstanding what you are trying to describe.
August 29, 2010 6:52:01 PM

OP:
Quote:
AMD has also stated outright that each core will run a single thread, no more.


and

Quote:
Since each BD module shares its resources, and the resources allow for 2 threads, they can "turn off" one core in each module, turbo the core, and run 2 threads across (because each BD module will have the resources to accommodate 2 cores remember).


Assuming even AMD has not yet found a way to actually run a thread on a "turned off" core, then I don't follow how it can run 8 threads on 4 cores, given the first statement.

Admittedly I haven't paid a lot of attention to AMD's CMT vs. SMT statements, but from what I've gleaned here & there, it seems AMD is using 2 full integer pipes in each module, as opposed to Intel's solution of just adding extra hardware to enable a single integer core to switch to another thread. AMD's full extra pipe is an additional 12% die space per module, vs. Intel's <5% per core (module). As for performance, AMD is promising a lot more than 12% boost in multithreaded apps, whereas Intel's solution seems to get between -5% and+25%, depending on how many free clock cycles are in each thread being executed.

However in the future when more software is better threaded, I would think the solution would be to throw more complete cores (or AMD modules) at it. This is why AMD introduced the 6-core 1090T Thuban, and Intel the 6-core Westmere CPUs, for those who can use the extra full cores on the apps they run.
August 29, 2010 7:02:57 PM

Tho, this half ways solution is a good choice, for thermals and power usage, die size etc, plus it may be very versatile, for many solutions, and somewheres I saw a graph showing its FP to be climbing higher than previous graduations/generations, meaning more of a bump than even the Int
Anonymous
August 29, 2010 7:36:42 PM

For some reason Toms won't send me a conf email... had to make a new account to reply...

Ok, first off I don't claim this is fact, it's my wording of how I think AMD will address the SMT issue. I'll also say that it's obvious that I have a hard time putting my words on paper... that's why I rambled the same stuff over and over.

Enzo: When I said "turn off" I didn't mean it actually turned off literally... that's why I used the quotes. It'll be similar to Thuban I think, where the other cores underclock I guess...
I'm not saying that this is SMT either, but AMD's answer to Intel's SMT solution. As fazer stated, Intel's SMT adds only a small percentage of extra performance to each core that has SMT. For ex. let's say Intel's SMT (HT) makes each core work like 1.2 cores. I'm assuming that with AMD's solution using modules, they will have better scalability. Instead of using SMT for a 0.2 boost, AMD could OC 1 core in the module while underclocking the 2nd. That enables the OC'd core to have the resources needed to pass 2 threads. But since the core is overclocked, it's faster, and I'd bet that a single core running 2 threads at 2.8ghz in SMT is significantly slower than 1 module running 2 threads at 3.4ghz. (Again, both threads won't run at the given ghz, but the hardware is there to make this perform faster: 2 physical cores to handle 2 threads, with one core handling most of the resources in Turbo mode and the other core underclocked but providing the hardware to allow 2 fully functioning integer pipelines.)
This obviously won't translate to 2.0 scalability compared to 1.2 SMT, but I'd think it would be significantly better than 1.2.

I think you made your second point before I edited again, basically the Turbo mode would have the same threads, but essentially running at higher speeds, but drawing more power and causing more heat. Ideal for gaming or other intense proggys that need good performance for only a short period of time... Basically like how someone overclocks for gaming, but run lower clock for daily use like browsing... that sort of thing.

As to fazer's point, like I said, I didn't mean that 1 of 2 cores would literally go to 0 and shut off... I was just wording it that way for the reg joes... thought I made that clear, especially with the Thuban reference... in actuality, one core OC's and handles most of the module, the other core underclocks to save power/heat while leaving the ability for 2 threads.

I dunno, maybe I'm just confused as I've never been into this kind of thing before... never even knew what an integer pipeline was till a couple of months ago... learning as I go, I'm just addicted to learning about this BD arch, and CPU engineering in general.

I wrote this at work and now I'm trying to get myself back into the same mindset... was a long day.

Basically 2 cores per mod - 1 underclocks to save power/heat, the other overclocks to have a faster speed core while pulling the mod's resources - OC'd core runs the show, UC'd core allows 2 threads. So without enabling SMT in the literal sense, they both have an SMT solution, while using real cores. So I'm kinda hoping that the module is designed so that the OC'd core running most of the module will run everything at the higher clock rate while the UC'd core is simply there to enable 2 threads on physical hardware. My whole theory banks on that hope. As I said I'm not an expert by any means.
So I guess my question back at you guys would be: if the OC'd core took the module's resources and used the second core to hardware-enable a second thread, would both threads be able to run at the OC'd core's speed? Or is each individual thread tied performance-wise to its own physical core?

But yea, just wanna say I'm no expert and none of this is being offered as an explanation to the BD module design, more of my own curious idea of how it could work and me asking if this makes sense... I'm starting to think it might not heh
Anonymous
August 29, 2010 8:05:20 PM

Sorry, forgot to address this:

"As we all know, CPU's have hit a speed wall at about 4GHZ
What are you talking about? The "wall" was about 2.5GHz when AMD came out with their first dual cores. About 3.0GHz in 2007 and about 4.0GHz today. That means there is no wall so far. Also, GHz is not really a measure of performance; it is only related. Don't forget IPC, which has been improved at a much faster rate than clockspeed."

I didn't say or mean the 4GHZ wall was hit and then we went to dual core... but I'd use 4GHZ as my solid number as we currently have one heck of a time getting around that range per core. But as for there being no wall at all, there most certainly is. With current technology it's too difficult to run cores at such high speeds as it will require too much power and produce too much heat to be cost effective for mainstream use. Single cores could get to 3+ when duals were released IIRC, but duals weren't immediately available at those speeds. It took time and engineering to get the process down right and to improve performance. When duals were at 3+ and quads were released, they were at around 2GHZ to 2.5-ish... So we now have singles, duals, triples, quads and hexas on the market, but how many are rated above 4ghz out of the box?
That is essentially why AMD decided to go with multi cores while Intel tried to get us flaming hot 10ghz CPU's with a TDP of 1K. There are limits to how much power you can/should draw and how much heat is acceptable, this is what is causing the wall. Might be exaggerating a bit, but you know what I'm saying...
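
The power side of this "wall" can be sketched with the usual first-order rule for CMOS switching power, P ≈ C·V²·f. Higher clocks generally need higher voltage, so power grows faster than frequency. The clock and voltage figures below are made-up illustrations, not measurements of any real chip, and leakage power is ignored.

```python
def relative_dynamic_power(freq_ratio, voltage_ratio):
    """Relative switching power for a fixed design (constant capacitance).

    Uses P ~ C * V^2 * f, so the ratio is voltage_ratio^2 * freq_ratio.
    """
    return voltage_ratio ** 2 * freq_ratio


# Hypothetical example: push a 4.0 GHz part to 5.0 GHz (1.25x clock)
# and assume it needs 15% more voltage to stay stable.
scale = relative_dynamic_power(freq_ratio=5.0 / 4.0, voltage_ratio=1.15)
print(f"{scale:.2f}x the switching power for 1.25x the clock")
```

In this assumed case a 25% clock increase costs about 65% more switching power, which is the kind of trade-off that makes adding cores more attractive than chasing clock speed.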

And I haven't forgotten IPC, and I know there's more to CPU performance than the rated MHz. I simply tried to keep it simple, and not focus on every little detail. Cause I'm not writing a review/preview or anything like that, just voicing my ideas on what I think the main feature of BD's arch might be about... in a very simple nutshell.
August 29, 2010 8:22:20 PM

^ OK, that makes more sense now. However I thought AMD could only turbo or downclock whole modules, not parts of one. Seems to me there might be some sync issues with the decoders and out-of-order hardware if they have to service an integer core at 3+ GHz and another at 1GHz.
August 29, 2010 8:26:00 PM

I believe it's only the 1 Int core that shuts down; it doesn't slow down unless it's carrying its own thread, if I remember right (been gone for awhile)
Anonymous
August 29, 2010 8:44:26 PM

"^ OK, that makes more sense now. However I thought AMD could only turbo or downclock whole modules, not parts of one. Seems to me there might be some sync issues with the decoders and out-of-order hardware if they have to service an integer core at 3+ GHz and another at 1GHz. "

This is where I'm mostly lost at how it will work. I certainly get your points, and they potentially throw my whole theory out the window. But if the OC'd core is controlling the resources of the module, couldn't the second UC'd core just be there for the extra int pipeline while the performance aspect of the work is handled by the much faster OC'd core?

I'm JUST getting into CPU engineering as I said, so although I understand how having 2 real cores with 2 int pipelines is much faster than 1 with SMT, I don't know the exact science behind it, therefore I have no idea if I'm making sense.

So I'm back at my other question: if 1 core took control of the module's resources at a higher clock rate due to turbo, would the other core be able to just provide the second int pipeline, so the faster core essentially has 2 pipelines to work with? Or would underclocking the other core negatively affect the int pipeline performance?

And I thought I read somewhere that since each module shared resources, that each core could be clocked differently depending on the needs. Is this not the case? I could swear I read somewhere John Fruehe said individual core control is possible, just like how I can clock each of my 1090T's cores now. Maybe I mixed up a few diff things I read and am running in the totally wrong direction.... I'm starting to feel like I'm in Gr.10 Accounting again where on a major test I messed up the second number in a massive chart which threw off every single calculation I did... Although the tech told me if that number was right I would have gotten a 92%, instead I walked away with 0.05%. Thats right, not 1/2 a percent, but have a single decimal percentage. I should get that framed :) 
August 29, 2010 8:53:00 PM

I don't believe the original poster has it right.

If the module is only supporting one thread, then the other core is idle. This is not as much about Turbo CORE boost and more about resources. In this environment the first thread has access to 100% of the shared L2 cache, so this can bring a good bump in performance.

Each core runs at stock speeds but can move up due to Turbo CORE, but we are not giving away details on that yet; I anticipate that will be around launch because that is the kind of data that helps the competitor figure out the performance of the processor.
Anonymous
August 29, 2010 9:14:06 PM

So the module design is more about minimizing costs while enabling more cores to be added as they share resources? And that instead of all the mumbo jumbo I said, it's simply to increase lower thread performance?
I guess it runs the 8 threads at the estimated 1.8 rather than having Intel's SMT of 1.2.

So it's not that AMD is trying to make a full 8 core monster but an 8 thread monster due to running them through 8 real cores/8 int pipelines?
If that's the case, it makes a lot more sense, and I've been way, way off.

I think I get it now. Each module is not 2 cores sharing resources but rather a hardware enabled single core with another core/int pipeline for CMT. So the 8 core/8 thread BD CPU acts essentially how an Intel quad core would work if SMT worked like 1.8, while dropping the negative aspects of Intel's current SMT implementation that can actually "trip" the core and slow performance. The extra core/int pipeline is just to keep the flow going uninterrupted. If that's the case, this sounds impressive.

If it works similar to as its expected I would assume it would crush a Nehalem Quad. Maybe not crush, but be obviously in front. And with SB supposedly being a small improvement over Nehalem, I'm really excited about how these will compare in the real world.
August 29, 2010 11:45:52 PM

I have a feeling Bulldozer will have pretty high stock clocks. At this "Bulldozer 20 questions" thing by Fruehe on AMD's website, he specifically said that since BD's approach takes less die space and because of its power saving features, Bulldozer will have much higher clock speeds than current AMD offerings. That was close to word for word of what he said. And if he means higher stock clocks than chips like the 965, then that's going to be interesting.
August 29, 2010 11:59:56 PM

Add into that the new doping they'll be using, which offers 10%? or so over the top of that, because of lowered thermals
August 30, 2010 1:48:45 AM

Quote:

I didn't say or mean the 4GHZ wall was hit and then we went to dual core... but I'd use 4GHZ as my solid number as we currently have one heck of a time getting around that range per core. But as for there being no wall at all, there most certainly is. With current technology its too difficult to run cores at such high speeds as it will require too much power and produce too much heat to be cost effective for mainstream use.

Ahhh. I misunderstood what you meant by wall. I thought you meant that even with the next CPU architecture we'd still be at a 4GHz wall.

So this image is a single Bulldozer module (ie, two cores). I'm not sure, but from what I gathered, the two FP schedulers could act as a single 256 bit scheduler? Dunno, it's been a while since I read up on BD. I remember either anandtech or bit-tech having a very detailed article on the Bulldozer architecture not too long ago.
Here it is:
http://www.anandtech.com/show/3863/amd-discloses-bobcat...

August 30, 2010 1:54:29 AM

Yes, they can do 256 sets
September 6, 2010 12:41:59 AM

After reading a lot more stuff about the new Arch I think I know what BD actually is now... not what I posted in the orig. Not really sure how I came to think that way... thought I had a eureka moment when it was more like a highway pileup of info that somehow turned into what I thought was a real idea...
I know full well now that BD will have a turbo feature, not sure how its being implemented at all at this moment (haven't read any new articles in a week).
I also know it's basically targeted at a 4 core/8 thread Intel CPU, as BD's design is to group 2 integer cores together in a module with shared resources. The shared resources cut down on redundancy and allow you to add a full 2nd integer core per "traditional CPU core" to have the physical resources to push 2 threads, while only increasing the size by a fraction (more performance with less die space).
So instead of being like Intel and using a real core and a "pretend" core to run 2 threads, AMD will use one "traditional CPU core" with a second integer core attached to physically push a second thread. And while not scaling as well as 2 physical "traditional" cores, it should scale far better than Intel's 1 real/1 fake.
Intel - 1 "traditional" core / 2 threads : 1.2 - 1.3 performance (should be tweaked upward for SB)
AMD - 1 "traditional"core / 2 physical integer cores running 2 threads : 1.6-1.8 (speculation, but an educated guess on the limited info available)
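
To see what those two scaling guesses would mean for a whole chip, here is a small sketch. It assumes 4 "traditional" cores (Intel) against 4 modules (AMD), equal per-unit single-thread performance, and the speculative 1.2-1.3 and 1.6-1.8 factors quoted above. None of these are measured results, and the function name is made up for illustration.

```python
def two_thread_throughput(units, scaling_low, scaling_high):
    """Throughput range, in 'single-core equivalents', when each unit
    (a core with SMT, or a module) runs two threads.
    """
    return units * scaling_low, units * scaling_high


intel_low, intel_high = two_thread_throughput(4, 1.2, 1.3)
amd_low, amd_high = two_thread_throughput(4, 1.6, 1.8)

print(f"4 cores with SMT: {intel_low:.1f} to {intel_high:.1f} core equivalents")
print(f"4 modules (CMT):  {amd_low:.1f} to {amd_high:.1f} core equivalents")
```

The equal per-unit performance assumption is the weak point: if one design is faster clock for clock, the comparison shifts accordingly.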
Intel has boasted SMT in the form of HT for a while now and loves to throw it in AMD's face. But it's going to kick them in the butt big time cause AMD wasn't sitting there doing nothing while Intel continued to taunt AMD about lack of threading. So AMD watched, they examined and they found a better way, a much better way. And after a long wait we will see it.

AMD is always playing catch-up. BD should really be facing Nehalem, but AMD fell behind and now has to face SB. But AMD didn't just upgrade their arch, or go for the easy gains, cause for Intel, most of the low hanging fruit has been picked. AMD knew this, and came up with what I feel is a revolutionary way of designing CPUs. Modules are the future. Although each int core doesn't have its own full resources to call it a CPU core as we know it today, it really doesn't need them. AMD found a great middle ground between SMT and more cores, and it's a beautiful one at that.
Before BD came to be and changed the way we look at a CPU (even more so), we might have looked at it as an 8 core BD vs a 4 core SB with SMT. When you think about it from that perspective and think about the speculated 1.2-1.3 of Intel's SMT vs the 1.6-1.8 of AMD's extra int core per "CPU core", you get a very one-sided battle.
If it turns out anywhere close to AMD's favour as it looks to be, AMD will have a huge winner on its hands.
I HOPE this is another Athlon 64 X2.
For AMD to be even close in this fight is beyond me. Think about how Intel brought us x86 and how AMD was picked to be the #2 supplier, and how Intel did everything they could right from the start to handicap AMD using both legal and illegal methods, all of which were unethical given the agreement. Think about how much market share Intel illegally built up. That market share cannot just be given back to AMD, so it's become a permanent handicap that AMD has had to innovate its way out of.
For everything that has happened, with AMD being a generation behind, we're still looking at BD as being a contender against SB, if not better. We'll wait for real samples to say which is better though (I gotta say I'm leaning AMD to be honest, but so few facts are out there).
How can AMD even compete with so much holding them back? I don't know. But this to me demonstrates that AMD is a much better company. Not only do they have a smaller market share, they have a much smaller company, smaller R&D, fewer engineers, fewer fabs, less everything, yet they stay neck and neck with a much richer, resource-heavy Intel.
If you look at how much time and money each company spent and what they are producing for the price/time/resources, Intel should be years and years ahead, but they aren't. In fact they had to play catch-up on more than one occasion, and that is funny to me.
If BD is what we hope it will be, Intel SHOULD be releasing a 16 core/32 thread 6.5GHZ per core CPU. Not really, but I like to take shots at Intel. But to be clear, I'm not an AMD fanboy, I just hate Intel's business practices.
September 6, 2010 1:25:06 AM

To be honest, I don't think any of us fully understand Bulldozer, and therefore trying to guess its architectural makeup or performance is really fairly useless.
September 6, 2010 1:47:44 AM

I think most of the people on forums like this think that way, ares, but we do like to discuss the possibilities or try to analyze the small amount of info we do have. I might make a lot of assumptions or guesses like many others, but I fully disclose that it's pure speculation or opinion.

But just because we don't know all the facts doesn't mean we should just shut our traps :p  It's fun and even if you're totally off, the discussions help people better understand the engineering aspects. I've never even read about FPU or Int cores or int pipelines or anything till BD caught my eye. It just seemed so cool to me, I had to learn more.

I'll agree that most of the talking is speculation or guessing, educated guess or not, but it's far from useless.
September 6, 2010 1:57:57 AM

Oh don't get me wrong, it is fun and all. However, it seems every so often everybody changes their mind, like BD went from being praised as an architectural masterpiece to maybe not as good as Intel in just a week. Maybe because of the Sandy Bridge info and the news, more like gossip, of it being delayed. The best person to ask is JP, and he has said FP, while it may look weak, will be massively improved. From all info we have now, we know for more or less sure:

Quote:
FP will be a big jump. Keep an eye out for a Bulldozer blog about FP in the upcoming weeks. I am about halfway through it at this point. I have some engineers that are working with me on it because floating point micro ops are not my strong point, I am not an engineer.


  • It's 32nm
  • 4 modules, 8 threads, for more or less 6 full cores
  • Likely fairly high clock speed due to pipes
  • AM3+
  • Seems to be refined Turbo Core
  • Really quite good thermals and power consumption
  • Sampling begins 2010 Q4, release sometime in 2011
  • And we know we don't know too much

    Beyond those, we really don't know much at all, and we don't even know enough to make accurate predictions from just a photoshopped die shot and a few arch slides.

    September 6, 2010 2:56:00 AM

    I don't get the "more or less 6 full cores" Care to elaborate?
    September 6, 2010 3:15:53 AM

    Just like with Hyper-Threading. Sure you have 8 threads, but those 8 threads are sharing the 4 cores' resources. So while it is 8 threads, it's closer to 5-6 core performance, and only in the few things that utilize it and utilize it well. Here, you have 2 integer schedulers and 1 FP scheduler. With 4 modules you would think: 4 modules, 2 cores each, 8 cores. Not so. Each module isn't really 2 full cores, as it is just a different (and far better) take on HT. They share L2 cache, the front end, and FP. Therefore a "dual core" module is really sharing resources, and doesn't really equal 2 cores. 1 module looks to be no less than 1.5 real cores equivalent, and maybe getting up to 1.75. Do the math, and with 4 modules you get basically 6-7 core performance. However it is a much more efficient way of doing things. While it isn't exactly a true octo core, it cuts down on costs a lot, as well as heat and energy consumption. Other things can be done to get octo core performance. Like I said, we don't know enough about this to say. At first glance, you'd think it would be weak on FP, as it has 1 FP scheduler for 2 integer schedulers. Now Fruehe is saying it will actually be strong with the FP, which one would think makes no sense. It all depends on a lot of things. If you ask me, AMD hit this one out of the ballpark in theory; they just have to get it out there, and get it at a good price. All we can do is wait, see benchmarks, and wait for further explanation.
    September 6, 2010 3:22:31 AM

    Also, AMD was going into some fairly crazy talk, and I think this is what you were getting at. We all know Turbo Core shuts down x cores and bumps frequency up x amount. What AMD was talking about was how several parts of the CPU remain idle at points in time. AMD wanted to put these together, and do some complicated form of shutting down parts of CPUs, making a 4 module/8 thread setup transform into a super 4 core setup by disabling parts of the modules, then combining them together. I'll try to find the link to what AMD was saying, but it's a proposition that could dramatically increase efficiency.
    September 6, 2010 3:53:38 AM

    Ok I wasn't sure how you meant 6 cores. But I get what you're saying. Same thing I think, just different wording. I basically sum it up as a module = 1 core with physical SMT. That's how I'd explain it to someone who only knows the basics.
    The problem is AMD has changed the way we view the CPU with BD so much that it's hard to make a universal explanation of how this will work.

    I think it's closer to 8 cores than to 4 cores with SMT, if you had to pick one of the two. Although each module doesn't have the resources to run 2 fully independent cores, I don't think that's necessary. We all know that not every part of the core is in constant use; some parts get next to no use.

    Right now it's hard to identify what actually is a "core". Since most calculations are integer based, AMD decided the integer unit is the best thing to count. So AMD looked at what was needed resource-wise for integer "cores". With this approach you can merge the resources of 2 integer cores, because giving each integer core (IC) its own separate resources was redundant and wasted die space. Merging them dramatically decreases the die space required per IC while maintaining very high throughput. Each IC gets the resources it needs almost all the time, while working and acting much like 2 cores.
    2 ICs sharing resources lets 2 threads run simultaneously, behaving close to 2 separate cores.
    Intel uses a single IC with its own resources and forces 2 threads across it. Although that can give a decent gain, it can also be almost useless or even decrease performance.

    It's like trying to get 2 people from A to B using bikes. Sure, you can put two people on a single bike and get both to B quicker than taking 1 at a time, but one or both riders could fall off or slow down the trip.
    AMD decided that they don't need 2 bikes to get to B, so they designed a double bike. Now both riders get from A to B much faster, but not quite as fast as if each had their own bike.
    If you look at what's expected from both CPUs (SB and BD), it's hard to imagine a 4 core/8 thread SB competing with a 4 module/8 integer core BD.
    Based on that, BD should almost certainly thread better, while Intel may or may not have a decent lead in clock-for-clock performance. If AMD can stay close in c4c they should win. But that's just my opinion based on speculation from limited information.

    So for the record I THINK BD will beat SB soundly, but I'll hold my tongue till the facts are out.
    September 6, 2010 7:15:15 AM

    ares1214 said:
  • Likely fairly high clock speed due to pipes

    I've heard that rumour about pipelines in BD before. It worries me because I start to have Pentium 4 flashbacks.
    September 6, 2010 11:22:00 AM

    Saying 6 cores is false, off by a long long way. AMD has said numerous times that the 2nd BD core in a module is worth 80% of the first when resources are being shared.

    That means 8 cores is worth 7.2 cores, using up the die space of 5 cores. It's NOT hyperthreading and never will be.
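The arithmetic behind "8 cores is worth 7.2 cores" can be checked with a quick sketch. The 80% figure and the "die space of 5 cores" are simply the numbers quoted in this post, not measurements, and the function names are made up for illustration.

```python
# Rough check of the "8 cores is worth 7.2 cores" arithmetic.
# Assumption (from the post, not a measurement): within a module the first
# core counts as 1.0 and the second as 0.8 of a full core when sharing.

def module_equivalent(second_core_ratio: float = 0.8) -> float:
    """Full-core equivalents delivered by one two-core module."""
    return 1.0 + second_core_ratio


def chip_equivalent(modules: int, second_core_ratio: float = 0.8) -> float:
    """Full-core equivalents delivered by a chip with `modules` modules."""
    return modules * module_equivalent(second_core_ratio)


def equivalents_per_area(modules: int, area_in_cores: float,
                         second_core_ratio: float = 0.8) -> float:
    """Core equivalents per unit of die area, with area measured in 'full cores'."""
    return chip_equivalent(modules, second_core_ratio) / area_in_cores


if __name__ == "__main__":
    print(f"4 modules ~ {chip_equivalent(4):.1f} full cores")
    print(f"per 'core' of die area: {equivalents_per_area(4, 5.0):.2f}")
```

Plugging in a second-core ratio of 0.5 instead, `chip_equivalent(4, 0.5)`, reproduces the low end of the "6-7 core" estimate given earlier in the thread, so the two posters mostly differ in which ratio they assume.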
    September 6, 2010 12:27:07 PM

    Well, for one, I said 6-7 core like performance, not 6 cores. And what they said was that in the server environment they found 80% of the work is integer based. That's why they have a 2:1 ratio of integer to FP. There is a difference. I also never said it is Hyper-Threading; I said it's like Hyper-Threading. Hyper-Threading is 2 threads sharing 1 core's resources. This module system is 2 threads sharing 1.5-1.8 cores. They aren't really full cores, just like in Hyper-Threading, but it is a vastly better way of doing it.
    September 6, 2010 12:38:52 PM

    Fuell said:
    Ok I wasn't sure how you meant 6 cores. But I get what your saying. Same thing I think, just different wording. I basically sum it up as a module = 1 core with physical SMT. Thats how I'd explain it to someone who only knows the basics.
    The problem is AMD has changed the way we view the CPU with BD so much that its hard to make a universal explanation of how this will work.

    I think its closer to 8 cores than 4 cores with SMT. If you had to pick one of the two. Because although each module doesn't have the resources to run 2 independent cores, I don't think its necessary. We all know that not every part of the core is in constant use, some parts get next to no use.

    Right now its hard to identify what actually is a "core". So AMD decided that since most calculations are integer based, that is the best choice. So AMD looked at what was needed resource-wise for integer "cores". In using this approach you can merge the resources of 2 integer cores as having separate resources for each integer core (IC) was redundant and wasted die space. In merging the resources of 2 integer cores you decrease the amount of die space required per IC dramatically while maintaining very high throughput. This means that each IC is getting the resources it needs almost all the time, while working and acting similar to 2 cores.
    2 IC's sharing resources allows for 2 threads to be passed simultaneously with a close relation to 2 separate cores.
    Intel uses a single IC with its own resources to force 2 threads across. And although it can work at a decent gain, it can be almost useless or even cause decreases in performance.

    It's like trying to get 2 people from A to B using bikes. Sure you can put two people on a single bike and get both people to B quicker than 1 at a time, but one or both riders could fall off or slow down the trip.
    But AMD decided that they don't need 2 bikes to get to B, so they designed a double bike. Now both riders get from A to B much faster, but not quite as fast as if each had their own bike.
    But If you look at what's expected from both CPU's (SB and BD) its hard to think of a 4 core/8 thread SB competing with a 4 module/8 integer core BD.
    Based on that, BD should thread better almost guaranteed, while Intel may or may not have a decent lead in clock for clock performance. If AMD can stay close in c4c they should win. But thats just my opinion based on speculation from limited information.

    So for the record I THINK BD will beat SB soundly, but I'll hold my tongue till the facts are out.


    That is a very good analysis. Mind you, when I say "6 core" it's because 1 module technically has the main parts of 1.5 cores. It will likely perform closer to an 8 core equivalent, but when resources are shared the cores are cut in half, so it won't be a full-on octo core. HOWEVER, I commend AMD for doing this. If AMD just did a standard 6 or 8 core, they would be playing Intel's game for the most part. A 6 or 8 core on a different arch would also likely have been very expensive and would have drawn more power and put out more heat. This way, they almost take a shortcut to 8 cores. They save a LOT of money, especially with it being 32nm, and save heat and energy. I too think it will soundly beat SB. However, "Bulldozer" is starting to be used for things it isn't. The highest end "BD", which everybody thinks is called "BD", is actually called something to the effect of Orochi, Scorpius and Zambezi. That will almost definitely beat SB, but it's almost like comparing an i7 980X to a 9850 X4. If the rumors are true, which I doubt, that BD has been pushed back to later in 2011, then BD would be 1 generation behind; technically it then goes against Ivy Bridge. Also, SB isn't Intel's high end. On their roadmap, the 980X/990X stays their top end performer until Ivy Bridge. So comparing the highest end BD to SB is possibly comparing almost 2 different generations, and it is definitely comparing 2 different segments of the market. SB really goes against Lynx.
    September 6, 2010 8:09:44 PM

    normally i love amd. n m not ashamed to say m an amd fan. hell i even sold my c2d to get a x2 5600!!

    but honestly m ashamed oof this thread. not only r the info all wrong but the level of guestimate is just horrifyin....

    ares1214 said:
    The highest end "BD", which everybody thinks is called "BD" is actually called something
    to the effect of orochi, Scorpius and zambezi. That will almost definitely beat SB, but its almost like comparing a i7 980x to a 9850 X4. If the rumors are true, which i doubt, that BD has been pushed back to later in 2011, then BD would be 1 generation behind. Technically then it goes against Ivy Bridge


    u have redefined fanboyism.....

    anywho, does u crystall ball see any prostect of a reduction of global warming n implementation of world peace?

    n before u ask, n=and. i type feom hd2. so u use shortcuts
    September 6, 2010 9:35:02 PM

    sarwar_r87 said:
    normally i love amd. n m not ashamed to say m an amd fan. hell i even sold my c2d to get a x2 5600!!

    but honestly m ashamed oof this thread. not only r the info all wrong but the level of guestimate is just horrifyin....



    u have redefined fanboyism.....

    anywho, does u crystall ball see any prostect of a reduction of global warming n implementation of world peace?

    n before u ask, n=and. i type feom hd2. so u use shortcuts


    How exactly have I redefined fanboyism, and in what way? :heink:  Think. If Bulldozer is pushed back to Q4 2011 like some rumors and sites have said, or even Q3, it will be just in time to go against Ivy Bridge, which is Q3-Q4 2011. Also, Sandy Bridge is not Intel's high end. SB is actually the middle and low end, as the 980X is shown as their high end on roadmaps; Ivy Bridge replaces it as the high end. Comparing SB to BD, if BD comes out at that time, is like comparing the high end of 1 generation to the middle of the last. Hence, 980X vs 9850. Comparing Lynx to SB, if it's out on time, is the more accurate comparison. How is that fanboy? Next, this is a speculative thread, i.e. people speculate. What info is wrong? Any real info that is false has been pointed out. So tell me, what is the problem here, o wise one?
    September 7, 2010 12:29:11 AM

    ares1214 said:
    Hyperthreading is 2 threads sharing 1 cores resources.


    Except Nehalem's hyperthreading implementation added extra units to the cores solely to allow better utilisation when running two threads. If it hadn't, the hyperthreading performance wouldn't be so much better than P4's.

    I'll be interested to see how AMD's version compares, but I suspect that anyone who's expecting it to be vastly better than Intel's hyperthreading will be disappointed.
    September 7, 2010 12:47:29 AM

    I disagree.
    There's just too much more dedicated HW on the BD side, and CMT is generally regarded as superior to SMT.
    Does this mean BD will kill all things Intel? I'm not saying that, but relative to its single thread performance, it will do better in MT than Intel does.
    September 7, 2010 1:23:15 AM

    JAYDEEJOHN said:
    I disagree.
    Theres just too much more dedicated HW on the BD side, as CMT is generally regarded as superior to SMT.


    I actually agree, there does appear to be too much dedicated HW on the BD chip; having two sets of integer pipelines doesn't provide any benefit over a shared integer pipeline with fewer units, unless you can keep both sets of independent integer pipelines full of instructions. Three ALUs executing three instructions are no slower than two sets of two ALUs executing three instructions.
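To make the ALU argument concrete, here is a toy model, assuming every ready instruction is independent and takes one cycle on any ALU. Real schedulers are far messier, and the function names are invented for this sketch.

```python
from math import ceil

# Toy issue model. Assumptions (not from any AMD or Intel documentation):
# all ready instructions are independent and each occupies one ALU for one cycle.

def cycles_shared(ready_per_thread: list[int], alus: int) -> int:
    """One pool of ALUs that every thread can issue to (SMT-style)."""
    total = sum(ready_per_thread)
    return ceil(total / alus)


def cycles_clustered(ready_per_thread: list[int], alus_per_cluster: int) -> int:
    """Each thread owns a private cluster of ALUs (CMT-style)."""
    return max((ceil(n / alus_per_cluster) for n in ready_per_thread), default=0)


if __name__ == "__main__":
    for load in ([3, 0], [2, 1], [2, 2]):
        print(load, "shared x3:", cycles_shared(load, 3),
              "clustered 2+2:", cycles_clustered(load, 2))
```

In this model the shared pool of three wins when one thread has all the work (`[3, 0]`), while the two clusters of two win when both threads are busy (`[2, 2]`), which is the "keep both sets full" condition from the post.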

    But, as I said, I'll be interested to see how it actually performs. I'd expect it to be faster than an Intel core because it has more transistors, but 50% faster as suggested above seems extremely unlikely.
    September 7, 2010 1:29:50 AM

    Who said 50%? That's outrageous. I think AMD said 50% more throughput for their server parts, as they were comparing a 16 core to a 12 core, but that's the only place I heard 50%. BTW, BD will be a beast in servers, that's almost guaranteed. :lol: 
    September 7, 2010 1:30:18 AM

    @ MarkG
    Agreed, even where itll shine in FP
    September 7, 2010 3:21:04 AM

    Yea I dunno why people come into threads like this and complain that its all speculation... Well duh, we say as much :p 

    It's just fun to interact with other enthusiasts and get their opinions on the little information available.

    Think about it: if BD goes against SB, it's a 4 module/8 thread chip vs a 4 core/8 thread chip. Given that SMT seems to have its limits, and they are pretty low, I can't really say that SB will beat BD.
    Even if the info we have now is only partly true, it looks like AMD's implementation will surely be faster than Intel's. There's no information to the contrary right now, not that I can find (though, like I state very often, this is opinion with very little real info and much speculation). It's just hard to look at the two approaches and think Intel has the upper hand.
    That would make BD the better threaded CPU, but we don't really know anything about clock speeds yet. I've seen a few SB benches, not sure if they are real or not, but we have nothing on BD. Between Phenom II and i7 there's not much of a real world difference, though Intel has a noticeable advantage. So if AMD can nail that aspect or at least match Intel, they should win out.

    Now I'm worried BD will launch vs IB or even a 6 or 8 core SB, if thats in the plans, not sure...

    We won't know till we get them in our hands, but its fun speculating about it for sure.
    September 7, 2010 4:29:53 AM

    So, if they only disable one integer pipeline when 1 core is in use, then would that mean on an 8 core BD that if 4 cores are in use, they would be able to increase performance significantly for up to 4 cores?

    I read somewhere that BD would be 1/2 the size per core of Phenom II (I believe), and AMD has said that BD will have faster clocks than PII; if the 965 is already at 3.4GHz then we could probably be seeing a 4GHz stock clock.

    This sounds good to me, but there is also the downside that intel's 8 cores=~9cores while AMD's 8 cores=~7cores; if AMD says it should perform about 80% of a normal core then it probably won't perform that well under most conditions. I've never trusted a company enough to believe their number will be correct, since companies try to fudge it a little to make their product sound better ;)  .

    Wish I knew more about how processors worked, I would be able to at least get an idea on how each arch might perform.

    Edit: Is the FPU scheduler basically two FPU schedulers that can handle two "cores", but it isn't fully two schedulers?
    September 7, 2010 4:41:03 AM

    Has, its a little complicated thats for sure. But just dive in and start reading, wiki and google are your friends for sure to help with certain things. I was never into technical stuff like this either but I'm starting to. And its really fun to read about imo.

    It's not really half the size of a Phenom II core, because the way the cores are designed has changed. Rather than each core having 100% of its own physical resources, the resources are merged and shared between 2 integer cores. Windows will see 8 cores on a 4 module BD CPU. IIRC the second integer core adds only 5% or 12% to the die space but adds 80% of the performance (likely 80% at best, so I'd be safe and say 60-70%, which is still a massive increase in threaded performance).
    September 7, 2010 11:45:13 AM

    Here is the simple question to ask yourself:

    If power consumption and price were exactly the same, would you rather have 8 physical cores or 4 cores that had SMT?

    I am guessing that with power and price equal, you'd rather have physical cores.

    The only arguments in favor of SMT gloss over the inefficiency and bottlenecks to focus on the "savings" that customers get from the tradeoff on these shared pipelines.

    But what you really need to do is compare the actual finished product. Compare it on performance, price and power consumption. That will tell you the better design.
    September 7, 2010 2:30:01 PM

    Fuell said:
    1.When I think about the fact that if BD goes against SB that its a 4 module/8 thread vs a 4 core/8 thread. And given the fact that SMT seems to have its limits, and they are pretty low, I can't really say that SB will beat BD.
    2.Even if the info we have now is only partly true, the way it looks like is that AMD's implementation will surely be faster than Intels, theres no information to the contrary right now, not that I can find (Though like I state very often, this is opinion based, with very little real info and much speculation). It's just hard to look at the two approaches and think Intel has the upper hand.
    3.Now that would make BD a better threaded CPU but we don't know anything about clock speeds yet really, I've seen a few SB bench's, not sure if they are real or not but we have nothing on BD. I think between Phenom II and I7's theres not much of a real world difference thought Intel has a noticeable advantage. So if AMD can nail that aspect or at least match Intel, they should win out.

    5.We won't know till we get them in our hands, but its fun speculating about it for sure.


    1. do you have any hard tangible data/reasoning to back up your claims
    2. there is enough technical stuff in the bulldozer info from AMD to support IPC improvements (which im sure u cant list, given your 1st post). by how much, it cant be told. so how is it u claim it will "surely be faster"!!!! u remind me of the pre-phenom 1 launch period, when all were going nuts over the architecture without any real solid performance info. n if things don turn out as you expect, m sure you will not be seen anymore, like the users from that time.
    3. u need to be better friends with google. amd said clocks will be similar to PII, if not better. given its amd who's making the chips, i would say similar at best. y do you doubt the SB benchies? they were done by one of the most reputed reviewers out there!! and one who takes amd's side from time to time.
    for normal desktop users, i7 provides meaningless performance. there is not much real life difference bcoz the software you use cant make use of it. but start a number crunching multi-threaded application, and i7 leaves amd in the dust!

    4. i don know where it is fun for you, but posts like this makes my day! they just bring a big fat smile across my face, and i thank to God, that i am not as naive as you lot!
    ares1214 said:
    How exactly have i redefined fanboism, and in what way? :heink:  Think. If Bulldozer is pushed back to Q4 2011 like some rumors and sites have said, even Q3, it will be on time to go against Ivy Bridge, which is Q3-Q4 2011. Also, Sandy Bridge is not Intels high end. SB is actually the middle and low end, as the 980x is shown as their high end on roadmaps. Ivy Bridge replaces it as a high end. Comparing SB to BD is BD comes out at that time is like comparing the high end of 1 generation to the middle end of the last. Hence, 980x vs 9850. Comparing Lynx to SB if its out on time is a more accurate way of saying it. How is that fanboy? Next, this is a speculative thread. IE, people speculate. What info is wrong? Any real info that is false has been pointed out. So tell me, what is the problem here o wise one?


    apologies, i wasnt referring to release timeframe. like u mentioned, they are just rumors. im just refering to "That will almost definitely beat SB". those are some strong words, do you have any facts to back up. or do you have any knowledge to assume so? apart from new architecture about which we know nothing of performance...? im really anticipating a reply to this. because there are hints from amd's show at hotchips. but none of you mentioned it. so i would really like to know what your basing your assumption on.

    my apologies if i wrongly accuse you of being a no0b, but till now, you have not given me otherwise view.


    jf-amd said:
    Here is the simple question to ask yourself:

    If power consumption and price were exactly the same, would you rather have 8 physical cores or 4 cores that had SMT?

    I am guessing that with power and price equal, you'd rather have physical cores.


    but are we not forgetting performance? i mean, you can surely pack 32 cores, running on 1Ghz, hence lower TDP (presumably same as intels 6 core) and sell it on same price ($1000), but still, even if it lacks by 5% compared to core i7 2xxx, people will recommend it than?

    while talking about desktop, we are now at a time 8 cores is virtually pointless. but what we really need from AMD is improvement in IPC to compete with intel in high end desktop market. i know you are a server guy, but most of us here are interested in desktop CPUs.

    but i am confident, that no matter what the performance is, amd will still be a better buy, as always, which is the only reason i have been on AMD system for the last 15 years!

    but i really want to see amd return as performance leader. it has been long overdue and i for one would like to see intel fanboys go out! however, from your point of view, if amd can get the performance crown, it will surely get recommended from reviewers, which will again, increase your sale :) 

    so we need massive IPC improvements, specially from the FP.

    also i am looking forward to that article on FP. :D  hope its still not too far off :) 
    September 7, 2010 4:37:25 PM

    Ok sarwar here's the basic situation.

    Nobody gives a *** about desktop performance except forum fanboys. An athlon 2 will give anybody all the desktop performance they need and anything else is fapping over bars in a benchmark.

    We are talking about billion dollar companies making billion dollar decisions. That's more cores, more threaded performance. Trust me, nobody who matters gives a damn about super pi.
    September 7, 2010 5:38:47 PM

    bobdozer said:
    Ok sarwar here's the basic situation.

    Nobody gives a *** about desktop performance except forum fanboys. An athlon 2 will give anybody all the desktop performance they need and anything else is fapping over bars in a benchmark.

    We are talking about billion dollar companies making billion dollar decisions. That's more cores, more threaded performance. Trust me, nobody who matters gives a damn about super pi.


    so?
    and servers don need higher IPC?

    the higher the IPC, the faster the CPU can finish its work load and go to a low power state. plus, say u are using all 8 cores at fullest. a CPU with higher IPC will be faster than a cpu with lower one (both CPUs being 8 cores at the same clock)

    just cores will not let amd take the performance crown. amd has been good with multitasking. its the IPC letting them down.
    September 7, 2010 6:08:38 PM

    bobdozer said:
    Ok sarwar here's the basic situation.

    Nobody gives a *** about desktop performance except forum fanboys. An athlon 2 will give anybody all the desktop performance they need and anything else is fapping over bars in a benchmark.

    We are talking about billion dollar companies making billion dollar decisions. That's more cores, more threaded performance. Trust me, nobody who matters gives a damn about super pi.


    Yet in some games an Athlon II performs too low.

    While the server market is the bread and butter for Intel and AMD, the desktop market is there too. They will always need to increase performance in every market to continue. If AMDs desktop CPUs never got any better, they would lose money.
    September 7, 2010 6:46:12 PM

    Quote:
    apologies, i wasnt referring to release timeframe. like u mentioned, they are just rumors. im just refering to "That will almost definitely beat SB". those are some strong words, do you have any facts to back up. or do you have any knowledge to assume so? apart from new architecture about which we know nothing of performance...? im really anticipating a reply to this. because there are hints from amd's show at hotchips. but none of you mentioned it. so i would really like to know what your basing your assumption on.


    my apologies if i wrongly accuse you of being a no0b, but till now, you have not given me otherwise view.



    OK, I can see where you are coming from now. You have to remember we are in a speculative thread. I am speculating that "BD", more specifically Scorpius, will beat SB, although likely more in a server environment kind of way. Look at the arch. It looks like a server number crunching monster. Also, SB won't have anything above 4 real cores. Sure it has HT, but that's not terribly useful, maybe 10-20% of a real core at best, so the 4 cores will perform like 5, MAYBE 6 cores clock for clock. Scorpius on the other hand is 4 modules, so let's call it about 7 core performance, which is about where it will be. 7 > 5 or 6. That's just core count; you are right, AMD will need more than core count to beat Intel. Next is the pipelines. BD seems to be in the same situation as the Pentiums, whether you think that is good or bad: due to the long pipelines, BD should be set for a fairly high clock speed. 32nm also obviously puts the clock speed higher. So high clock rates, possibly higher than SB, are likely. Next, power consumption. Who cares :lol:  Finally, look at the arch: in theory, it is a GREAT way to get a decent amount of performance at a low price, as the cores aren't really full cores. Add all that together, and I think Scorpius will beat SB. BUT, like I said, when people say BD, they think high end, and really mean Scorpius, while SB is more likely to compete against Lynx.
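The core-count comparison in this post can be written out as a small sketch. It uses only the guesses from the post (HT worth 10-20% per core, 4 modules worth about 7 cores); none of it is measured data, and `effective_cores` is a made-up helper.

```python
# Toy comparison of "effective cores" using only the figures from the post.
# Assumptions: HT adds 10-20% per physical core; a BD module is worth about
# 1.75 full cores, so 4 modules come to roughly 7. Nothing here is measured.

def effective_cores(units: int, extra_per_unit: float) -> float:
    """Each unit counts as one full core plus `extra_per_unit` of a core."""
    return units * (1.0 + extra_per_unit)


sb_low = effective_cores(4, 0.10)   # 4 cores, HT worth 10% each
sb_high = effective_cores(4, 0.20)  # 4 cores, HT worth 20% each
bd = effective_cores(4, 0.75)       # 4 modules, second core worth 75%

print(f"SB: {sb_low:.1f} to {sb_high:.1f} effective cores")
print(f"BD: {bd:.1f} effective cores")
```

Taken literally, 10-20% per core works out to 4.4-4.8 effective cores, a little under the "5, MAYBE 6" above, so that rounding is if anything generous to SB. This is clock-for-clock core counting only and says nothing about per-core speed.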
    September 7, 2010 7:49:12 PM

    ^Um, SB won't have anything beyond 4 cores to start with for the desktop market. Server will have more cores, and in mid 2011 we will have 6 and 8 cores.

    Now if Bulldozer is delayed till Q4 2011, then Intels full line of SB will be out with Ivy Bridge around the corner.

    I find it funny that you mock Intels SMT yet a 6 core Xeon easily keeps up with a MCM 12 core Opteron.

    I still am waiting. SB looks decent and seems to have a pretty killer IGP for those wanting a low end PC. But I would probably wait till the 22nm stepping as it would benefit more in power savings. Plus Bulldozer will be out by then at least, so we can see which is better to go with.
    September 7, 2010 8:09:20 PM

    Quote:
    Also, SB wont have anything above 4 real cores. Sure it has HT, but thats not terribly useful, maybe 10-20% of a real core at best, so the 4 cores will perform like 5, MAYBE 6 cores clock for clock.


    I never said SB will have more than 4 cores. And I'm not mocking Intel, I'm just saying Hyper-Threading isn't a very effective way of multiplying cores. I know an Intel 6 core beats an AMD 12 core, but that has a lot more to do with architecture than with HT.
    m
    0
    l
    a b à CPUs
    September 7, 2010 8:49:31 PM

    ares1214 said:

    1. You have to remember we are in a speculative thread.

    I am speculating that "BD" more specifically Scorpius will beat SB, although likely in more of a server enviroment way.
    2. Look at the arch. It looks like a server number crunching monster. Also, SB wont have anything above 4 real cores. Sure it has HT, but thats not terribly useful, maybe 10-20% of a real core at best, so the 4 cores will perform like 5, MAYBE 6 cores clock for clock. Scorpius on the other hand is a 4 module, so lets call it about 7 core performance, which it about where it will be. 7>5 or 6.

    3.Next is the pipelines. BD seems to be in the same situation as the pentiums, wether you think that is good or bad, due to the long pipelines, BD should be set for a fairly high clock speed. 32nm also obviously puts the clock speed higher. So high clock rates, possibly higher than SB are likely.

    4.Next, power consumption. Who cares :lol: 

    5.Finally, look at the arch, and in theory, it is a GREAT way to get a decent amount of performance at a low price, as the cores arent really full cores. Add all that together, and i think Scorpius will beat SB. BUT, like i said. When people say BD, they think high end, and really mean Scorpius. Will SB is more likely to compete against Lynx.


    1. there is limit to the level of speculation buddy :D  also speculations need to have a ground. if i speculate that "ati 6800 series will so fast that you could emulate windows 7 to run off it and it will run faster"; my post will lose all its credibility

    2. intel servers will see upto 12-16 cores i believe. on the contrary, HT optimized code will run really good, and on server platforms there are quite a few of those apps these days. nonetheless, i do agree that the amd approach is better, as it requires minimum effort from a coding point of view to get it working. however, battling a 6 core Intel with an 8 core amd is not the best move. yes, if they are priced right, like they are now, it will fly off the shelf. but when we have to battle a 6 core intel with an 8 core amd, it means that amd cores are slower clock for clock and thereby less efficient, provided TDP is the same. if u ask me, thats NOT a win for amd. from amd's point of view, an 8 core cpu will need more transistors, therefore be more expensive to make and thereby less profitable. also when a no0b comes and compares a 6 core intel and an 8 core amd only to find that they perform the same, that no0b will draw the conclusion that intel is better coz it does it with fewer cores.. again NOT a win

    3. there are two effects of deeper pipelines: a. higher clocks can be achieved and b. it lowers IPC. this is why P4 was slower than p3 clock for clock. i keep saying IPC, coz ultimately this is what counts. to be on the same level, amd needs to battle a 3Ghz intel with a 3Ghz BD, coz running at a higher clock will increase heat and power consumption. remember all the problems with p4 overheating and sounding like a synchronous motor?? not healthy at all

    4. everyi1 does, specially the server ppl. also amd banked on it so much during its athlon64 era that it cant ignore it anymore. they started the whole energy efficiency thing, so they need a WIN here too.

    5. "it is a GREAT way to get a decent amount of performance at a low price, as the cores arent really full cores." makes no sense to me. we do not know what the price will be. i know amd will always undercut intel. amd already has that. there is nothing to win here. amd needs the performance crown to speak for the innovation that they bring us. otherwise, they are doing all the right things and not getting the right amount of appreciation.
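Point 3 above, the pipeline trade-off, comes down to per-core throughput being roughly IPC times clock. A minimal sketch, with all numbers invented purely for illustration (they are not specs of any P4, SB or BD part):

```python
# Per-core throughput modeled as IPC x clock.
# All numbers below are made-up illustrations, not specs of any real CPU.

def throughput(ipc: float, clock_ghz: float) -> float:
    """Billions of instructions per second for one core."""
    return ipc * clock_ghz


def clock_needed(target_ipc: float, target_clock_ghz: float, own_ipc: float) -> float:
    """Clock (GHz) a chip with `own_ipc` needs in order to match the target chip."""
    return throughput(target_ipc, target_clock_ghz) / own_ipc


# Hypothetical: a short-pipeline chip with IPC 2.0 at 3.0 GHz versus a
# deep-pipeline chip that only manages IPC 1.5.
needed = clock_needed(target_ipc=2.0, target_clock_ghz=3.0, own_ipc=1.5)
print(f"deep-pipeline chip needs {needed:.1f} GHz to match")
```

So a chip that gives up a quarter of its IPC has to clock a third higher just to break even, and that extra clock is where the heat and power come from.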

    u cant just say "look at the arch and see". that means nothing
    you missed the most important factors that increase the performance of BD. cache latencies are significantly lower compared to PII. L1 and L2 are at the same latencies as intel, but L3 is not known as AMD did not say anything about it. why do i mention latencies? the best example of their effect is the old athlon thunderbird. there were two versions of these CPUs with different cache latencies, and this resulted in the lower frequency athlon outperforming the higher frequency one.

    you also forgot the prefetching & branch prediction, which is basically what i7 has. every time data is missing in the CPU, it needs to go and get it back, which kills cpu cycles. the latest prefetch and branch prediction will help reduce that.

    you also forgot power gating, which will enable lower heat dissipation, hence higher overclock

    there are a lot more, and i can go on, but i need to sleep. find the rest as homework :p 

    jimmysmitty said:
    Now if Bulldozer is delayed till Q4 2011, then Intel's full line of SB will be out with Ivy Bridge around the corner.

    I find it funny that you mock Intel's SMT yet a 6 core Xeon easily keeps up with a MCM 12 core Opteron.

    I am still waiting. SB looks decent and seems to have a pretty killer IGP for those wanting a low end PC. But I would probably wait till the 22nm stepping as it would benefit more in power savings. Plus Bulldozer will be out by then at least so we can see which is better to go with.


    The 6-core Xeon does beat the 12-core Opteron on certain benchmarks and justifies its hefty price. However, you cannot compare those 6 cores with the current generation of Opteron; that would be unjust. :)

    Indeed, the SB IGP looks good, but I think we are far from the death of the IGP. SB is cutting edge, so it will be pricey. If a guy can get an SB, it's highly likely he will also want a discrete graphics card. Budget PCs with an IGP rarely use the latest CPUs, but there are exceptions of course.
    Putting the IGP on the CPU will have its effect on netbooks/notebooks, but we are still quite far away from the death of the IGP on the desktop.
    September 7, 2010 9:29:40 PM

    ^I doubt the IGP will die. But the new SB IGP will up the ante. I am sure that AMD and nVidia are scrambling to make a better IGP to compete with SB until their version of Fusion appears.

    I honestly think that people are jumping the gun with BD just like they did with Barcelona. SB already has a test out in the wild with Anand, so it's easy to say it's decent. But BD has yet to show its face, and until then it's all speculation. We can think that their version of CMT/SMT will be more efficient, but then there might be a downside, much like Intel's first attempt with HT.

    I guess we shall see.
    September 7, 2010 9:33:31 PM

    sarwar_r87 said:
    1. There is a limit to the level of speculation, buddy :D  Also, speculation needs to have some grounding. If I speculate that "the ATI 6800 series will be so fast that you could emulate Windows 7 on it and it would run faster", my post will lose all its credibility.

    2. Intel servers will see up to 12-16 cores, I believe. On the contrary, HT-optimized code will run really well, and on server platforms there are quite a few such apps these days. Nonetheless, I do agree that AMD's approach is better and requires minimal effort from a coding point of view to get it working. However, battling a 6-core Intel with an 8-core AMD is not the best move. Yes, if they are priced right, like they are now, they will fly off the shelf. But when we have to battle a 6-core Intel with an 8-core AMD, it means that AMD's cores are slower clock for clock and therefore less efficient, provided the TDP is the same. If you ask me, that's NOT a win for AMD. From AMD's point of view, an 8-core CPU needs more transistors, so it is more expensive to make and less profitable. Also, when a noob compares a 6-core Intel and an 8-core AMD only to find that they perform the same, that noob will conclude that Intel is better because it does it with fewer cores. Again, NOT a win.

    3. A deeper pipeline has two effects: (a) higher clocks can be achieved, and (b) IPC goes down. This is why the P4 was slower than the P3 clock for clock. I keep saying IPC because ultimately that is what counts. To be on the same level, AMD needs to battle a 3 GHz Intel with a 3 GHz BD, because running at a higher clock increases heat and power consumption. Remember all the problems with the P4 overheating and sounding like a synchronous motor? Not healthy at all.

    4. Everyone does, especially the server people. Also, AMD banked on it so much during its Athlon 64 era that it can't ignore it anymore. They started the whole energy efficiency push, so they need a WIN here too.

    5. "it is a GREAT way to get a decent amount of performance at a low price, as the cores arent really full cores." makes no sense to me. We do not know what the price will be. I know AMD will always undercut Intel. AMD already has that, so there is nothing to win here. AMD needs the performance crown to speak for the innovation they bring us. Otherwise, they are doing all the right things and not getting the right amount of appreciation.

    You can't just say "look at the arch and see". That means nothing.
    You missed the most important factors that increase the performance of BD. Cache latencies are significantly lower compared to PII. L1 and L2 are at the same latencies as Intel's, but L3 is not known, as AMD did not say anything about it. Why do I mention latencies? To understand the effect of latencies, the best example is the old Athlon Thunderbird. There were two versions of this CPU with different cache latencies, which resulted in the lower-frequency Athlon outperforming the higher-frequency one.

    You also forgot prefetching and branch prediction, which is basically what the i7 has. Every time the CPU doesn't have the data it needs, it has to go and fetch it again, which wastes CPU cycles. The improved prefetching and branch prediction will help reduce that.

    You also forgot power gating, which will enable lower heat dissipation and hence higher overclocks.

    There is a lot more, and I could go on, but I need to sleep. Find the rest as homework :p



    The 6-core Xeon does beat the 12-core Opteron on certain benchmarks and justifies its hefty price. However, you cannot compare those 6 cores with the current generation of Opteron; that would be unjust. :)

    Indeed, the SB IGP looks good, but I think we are far from the death of the IGP. SB is cutting edge, so it will be pricey. If a guy can get an SB, it's highly likely he will also want a discrete graphics card. Budget PCs with an IGP rarely use the latest CPUs, but there are exceptions of course.
    Putting the IGP on the CPU will have its effect on netbooks/notebooks, but we are still quite far away from the death of the IGP on the desktop.


    1. We are in a speculative thread. Nobody is necessarily wrong; we all have our opinions. This isn't an AMD fanboy thread of "Why BD Will Crush Intel". Keep that in mind. Not that you are acting like an AMD fanboy, just that this is all speculation based on what little info we have.

    2. I was talking about desktop. If you want to bring server into this, then that just makes things more confusing with AMD's 16 "core". So stick to desktop. ;)  Also, if an Intel 6 core and an AMD 4 module/8 "core" are in the same power envelope and cost the same, who cares? They are made differently, and since the Intel cores would be real cores, if the AMD beats it, it obviously isn't less efficient. Same price, same power envelope, more performance: that would be more efficient, no matter who has how many cores. Granted, we don't know Intel or AMD pricing.

    3. While I do agree, once again, if it can take the heat and both have the same overclocking potential, it doesn't matter. If they both run at, say, 30 C idle and 50 C load and both can OC by 1.2 GHz, then it doesn't matter. The P4 couldn't. Also, Intel's non-K chips seem to be dead for overclocking, so that might affect per-dollar efficiency in this area: even if BD loses to SB or IB at stock, if they can't OC and the AMD can, and it beats them handily when it does, well, I consider that a win for AMD.

    4. I only said this as not much info is out yet. Also, aside from the server market, CPU power consumption generally isn't a massively important aspect in the desktop market. Important, yes; as important as other things, not at all.

    5. OK, think. Is it cheaper to make 8 cores, or maybe 6? BD cut out some parts that a real 8 core would have, and therefore should cut down a lot on manufacturing cost, as well as heat and energy. Also think: if the parts they cut are supposedly only 20% of what most CPU processes are, then we are getting fairly good performance out of this module setup. Sure, it will lose to a real 8 core of the same arch, but it's also likely a decent bit better than a 6 core of the same arch.

    5.5 I didn't feel like naming all those things you just did, which is why I just said, look at the arch and see :lol:  All those things should give it a considerable advantage over Phenom II. Also, like I said before, an AMD worker was talking about them working on taking power gating to the extreme. Instead of shutting down 1 or 2 cores to bump clock speed up or energy consumption down, they would shut parts of the cores down and leave other parts, like the FP, up and running to link up with the other cores, to go from a crippled 8 core to, say, a super 4 core. While I don't understand this entirely, and am not sure how it's possible or if it's true, I still found it interesting.
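
    To make the "between a 6 core and a real 8 core" guess in point 5 concrete, here is a back-of-the-envelope sketch. The 0.8 scaling factor is purely an assumed placeholder for how much of a full core the second core in a module delivers. It is not a confirmed AMD figure.

```python
# Back-of-the-envelope throughput of a module-based chip, measured in
# "full core equivalents". The scaling factor is an assumption.

def core_equivalents(modules: int, second_core_scaling: float) -> float:
    """Each module counts as one full core plus a second core that
    delivers second_core_scaling of a full core's throughput."""
    return modules * (1.0 + second_core_scaling)


ASSUMED_SCALING = 0.8  # hypothetical value, for illustration only

bd_like = core_equivalents(modules=4, second_core_scaling=ASSUMED_SCALING)
print("4 modules at", ASSUMED_SCALING, "scaling ~", round(bd_like, 1), "full cores")
print("beats a real 6 core:", bd_like > 6)
print("loses to a real 8 core:", bd_like < 8)
```

    If the real scaling turned out to be below 0.5, the same 4 modules would fall behind a true 6 core of the same arch, so the whole argument hangs on that one number.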

    September 7, 2010 10:47:44 PM

    jimmysmitty said:
    ^I doubt the IGP will die. But the new SB IGP will up the ante. I am sure that AMD and nVidia are scrambling to make a better IGP to compete with SB until their version of Fusion appears.


    Nvidia is making a southbridge IGP that will "rival" SB.

    AMD will release the 6450 before SB is ready. The 6450 will be the lowest-end GPU AMD offers and will be at least 50% faster than SB's IGP, maybe even twice as fast.

    September 7, 2010 11:01:43 PM

    bobdozer said:
    Nvidia is making a southbridge IGP that will "rival" SB.

    AMD will release the 6450 before SB is ready. The 6450 will be the lowest-end GPU AMD offers and will be at least 50% faster than SB's IGP, maybe even twice as fast.


    You missed the IGP part.

    Integrated Graphics Processor. Not discrete. Of course though, AMD will push the HD6450 out as fast as possible. They can't possibly have an Intel IGP beat their low end discrete, can they?

    AMD will be pushing a NEW IGP. The market that IGPs target is pretty wide, since the majority of people who buy a PC with an IGP are not into hard core gaming, and SB will do what's needed while offering the ability to game at the low end.

    This also means that people looking for an HTPC will no longer have to worry about getting a discrete GPU if they go with an Intel setup, which will lower the system cost and heat output. So as of right now, even the HD6450 will be considered pointless in an HTPC.

    And don't get me wrong. I am a huge ATI fan. Love my HD4870. But facts are facts. SB's IGP is going to be able to wipe the GPU out from most HTPC solutions.