Bulldozer core confuzldness

August 28, 2010 11:26:09 AM

I was reading about the new Bulldozer architecture and I'm not quite sure about something. AMD said that they are going to combat their weakness in threaded apps by having their cores put into pairs. Does that mean that two integer clusters (one Bulldozer core) could work on a single thread, with, say, eight cores working on a quad-threaded app?
August 28, 2010 11:56:41 AM

No.

There may be some trickery that AMD hasn't told us about yet, but so far nobody can break a thread into two. It's more likely that Bulldozer will have insanely high clocks for single threads.
August 28, 2010 3:48:41 PM

Heh, IIRC the 'multiple cores working on a thread' is the old "reverse hyperthreading" scenario, sorta like macro-op fusing but spread across cores instead of just registers. I think the hardware requirements would be extreme.

I believe BD will be the first AMD 4-issue core, comparable to Intel's, since K8 through the current Phenoms are 3-issue cores which is one of the reasons why K8 and later has lower IPC. However, I'm not sure if the 4 decoders are per-core or per-module on BD. IIRC a BD module is 2 integer pipes with a shared FP unit. Since these are complete integer pipes, AMD calls them cores I think. In contrast, Intel's core is an integer pipe plus an FP unit plus extra hardware to let the core switch to another thread when it's not fully occupied with the current thread (hyperthreading). Intel's philosophy is that for lightly-loaded threads (i.e., <70% clock cycles used), it makes sense to switch to another thread and execute it for a while, to keep the core working as near 100% capacity as possible. AMD's BD is taking this one step further and making the second integer pipe much more complete.



August 28, 2010 4:22:40 PM

1 thread, 1 core, unless they have a technician called BM working for them :0
August 28, 2010 5:03:24 PM

JAYDEEJOHN said:
1 thread, 1 core, unless they have a technician called BM working for them :0


Ol' Baron Matrix?? :p . He hangs out over at AMDZone nowadays. And I haven't seen a reverse-hyperthreading post from him in a couple years now :D .
August 28, 2010 5:37:42 PM

fazers_on_stun said:
Heh, IIRC the 'multiple cores working on a thread' is the old "reverse hyperthreading" scenario, sorta like macro-op fusing but spread across cores instead of just registers. I think the hardware requirements would be extreme.

I believe BD will be the first AMD 4-issue core, comparable to Intel's, since K8 through the current Phenoms are 3-issue cores which is one of the reasons why K8 and later has lower IPC. However, I'm not sure if the 4 decoders are per-core or per-module on BD. IIRC a BD module is 2 integer pipes with a shared FP unit. Since these are complete integer pipes, AMD calls them cores I think. In contrast, Intel's core is an integer pipe plus an FP unit plus extra hardware to let the core switch to another thread when it's not fully occupied with the current thread (hyperthreading). Intel's philosophy is that for lightly-loaded threads (i.e., <70% clock cycles used), it makes sense to switch to another thread and execute it for a while, to keep the core working as near 100% capacity as possible. AMD's BD is taking this one step further and making the second integer pipe much more complete.


BARCELONA WILL HAVE REVERSE HT THAT WILL SPLIT ONE THREAD OVER TWO CORES MAKING IT TEH UBER LEETZ FASTER!!!!!

Nope. No reverse HT.

BD will have it!!!!!!!!!!!1

Probably not, since it's nearly impossible....
August 28, 2010 7:54:34 PM

If this were the case, we'd be seeing LRB already, and there wouldn't need to be the new push for multithreading in our schools and businesses, apps, workloads, etc.
GPUs would be ruling the PC market, etc.
August 28, 2010 7:56:42 PM

The way I understand it is that Bulldozer is similar to the processor in the PS3 but modified in a weird way where 2 cores will perform threads but aren't actual 'cores' - they work together but output information as a single core. Maybe I'm wrong.
August 28, 2010 8:18:08 PM

As was said earlier, the 2 "cores" share other chores with the rest of the module.
Think of it this way: each module is a true core, similar to Intel's hyperthreading but with more hardware dedicated to it, which makes it a better multithreading operator. On single-threaded work, though, it's unknown how much better it will be than prior gens.
It all depends on workloads.
August 28, 2010 8:37:52 PM

Each module is 2 real cores, or a dual core. It's nothing like hyperthreading, this is 2 actual real cores.
August 28, 2010 10:17:23 PM

bobdozer said:
Each module is 2 real cores, or a dual core. It's nothing like hyperthreading, this is 2 actual real cores.

No it isn't. A real core has both an ALU and an FPU, and in the BD modules, the ALU is duplicated, but they share an FPU. It's more like a core and a half than two cores.
August 28, 2010 10:33:25 PM

Exactly.
It's always been understood that both Int and FP make up a whole core, as workloads require both to some extent, depending on the workload.
fazers answered it most eloquently in his description. The only major unknown to me at this point is the advent of the GPU on die, and how it'll affect the FP calls.
August 28, 2010 10:42:19 PM

Wait a minute, BD has dual 128 bit FPUs per core, not one shared fpu. However they can be used together to act as a single 256 bit FPU.
August 28, 2010 10:45:03 PM

Really? I must be misremembering it then...

Let me look up some references about the bulldozer architecture.
August 28, 2010 10:50:24 PM

Bulldozer is a dual core architecture. It's different, but at its base level it is 2 cores instead of 1.

There is no "single core" with Bulldozer; the lowest part of it is a "module" which has 2 integer cores. There is no hyperthreading involved either, but there could be in the future, when it is possible for AMD to hyperthread both cores at the same time.
August 28, 2010 10:51:42 PM

yannifb said:
Wait a minute, BD has dual 128 bit FPUs per core, not one shared fpu. However they can be used together to act as a single 256 bit FPU.

You're right, though again, this will affect perf in many workloads.
August 28, 2010 10:52:40 PM

Hyperthreading FPU units? heheh
August 28, 2010 10:54:08 PM

bobdozer said:
Bulldozer is a dual core architecture. It's different, but at its base level it is 2 cores instead of 1.

There is no "single core" with Bulldozer; the lowest part of it is a "module" which has 2 integer cores. There is no hyperthreading involved either, but there could be in the future, when it is possible for AMD to hyperthread both cores at the same time.

Just having 2 ALUs doesn't mean that it has 2 cores. It has roughly a core and a half, as I described above. They wouldn't hyperthread it, as there are already shared resources between the two partial cores in each module. Attempting to hyperthread it would only cause the threads to compete for resources, causing all of them to slow down. Hyperthreading works when you have unutilized execution units, which is a problem the Bulldozer architecture is meant to fix. Hyperthreading is a completely different solution to the same problem, which involves less performance gain, but also less additional silicon.

(Actually, one way to look at it is that Bulldozer has two separate cores for integer use in each module, but it in effect "hyperthreads" floating point operations within each module, sending both threads to one FP scheduler)
August 28, 2010 10:55:27 PM

cjl said:
Looking at the core architecture, it does have a single FPU consisting of a single FP scheduler, and dual 128 bit FMAC pipes.

http://www.anandtech.com/Gallery/Album/754#9

Oh, oops. Well, at least it's a pretty beefy FPU being shared.

On another note, how much of an IPC increase over Phenom II and/or Nehalem do you think BD will bring? After all, it could possibly be significant, considering it isn't just a modified K8 like Phenom II was.
August 28, 2010 10:58:44 PM

Some say 17%, others higher
August 28, 2010 11:02:05 PM

yannifb said:
Oh, oops. Well, at least it's a pretty beefy FPU being shared.

On another note, how much of an IPC increase over Phenom II and/or Nehalem do you think BD will bring? After all, it could possibly be significant, considering it isn't just a modified K8 like Phenom II was.


I hope it's huge, honestly. I'd love to see it actually competitive with Sandy Bridge. More realistically, I'd think that it will be roughly competitive with Nehalem for single threaded on a clock for clock basis, and somewhat better (perhaps even competitive with Sandy Bridge) for multi threaded. That might be optimistic, but I really do hope AMD can get a truly competitive chip out.
August 28, 2010 11:06:46 PM

That's what I see coming too: not a killer, but depending on price and what you do, possibly a thriller.
August 29, 2010 12:47:15 AM

cjl said:
Just having 2 ALUs doesn't mean that it has 2 cores. It has roughly a core and a half, as I described above.


It's 2 cores. Both cores have their own ALUs and AGUs. There is nothing in Bulldozer that makes it a core and a half, a hyperthreaded core or anything else. You cannot make this anything except 2 cores.

http://www.anandtech.com/Gallery/Album/754#5

The green parts in the 8th slide are cores, integer cores = cores.

http://www.anandtech.com/Gallery/Album/754#8
August 29, 2010 12:56:51 AM

The basic building block is the Bulldozer module. AMD calls this a dual-core module because it has two independent integer cores and a single shared floating point core that can service instructions from two independent threads. The two thread machine is larger than a single core but smaller than two cores with straight duplication of resources.
http://www.anandtech.com/show/3863/amd-discloses-bobcat...
August 29, 2010 1:20:28 AM

bobdozer said:
It's 2 cores. Both cores have their own ALUs and AGUs. There is nothing in Bulldozer that makes it a core and a half, a hyperthreaded core or anything else. You cannot make this anything except 2 cores.

http://www.anandtech.com/Gallery/Album/754#5

The green parts in the 8th slide are cores, integer cores = cores.

http://www.anandtech.com/Gallery/Album/754#8

Nope. Nice try. The slide you show is two full cores, and it was AMD's starting point for Bulldozer, but it doesn't represent the finalized architecture. Note that in the slide you linked to, each core has its own integer scheduler and integer logic as well as its own floating point scheduler and FPU (with dual 128 bit FMACs).

In Bulldozer, as shown here, the integer logic is fully duplicated, but there is only one floating point scheduler with dual 128 bit FMACs (identical to the FPU in each independent core in your slide) per bulldozer module. This is why it isn't a true two-core scenario for each module.

http://www.anandtech.com/Gallery/Album/754#6

In effect, a bulldozer module will perform exactly like a true dual in a situation involving exclusively integer math, and it will perform exactly like a hyperthreaded single in a situation involving exclusively floating point. In most real scenarios, it will be somewhere in between, hence why I'm calling it effectively a core and a half.
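
To put rough numbers on the "core and a half" idea, here is a toy model. It is only a sketch: the function name `module_throughput`, the linear blend, and the default `smt_gain` of 0.2 are all assumptions made for illustration, not AMD figures or measurements.

```python
def module_throughput(fp_fraction, smt_gain=0.2):
    """Toy estimate of one module's two-thread throughput, in units of
    'one full core running one thread'.

    fp_fraction: share of the work that is floating point (0.0 to 1.0).
    smt_gain: assumed speedup from two threads sharing one FPU
              (a placeholder value, not a vendor figure).
    """
    if not 0.0 <= fp_fraction <= 1.0:
        raise ValueError("fp_fraction must be between 0 and 1")
    integer_only = 2.0        # two complete integer pipes: acts like a true dual
    fp_only = 1.0 + smt_gain  # one shared FPU: acts like a hyperthreaded single
    # Linear blend between the two extremes (a deliberate simplification).
    return (1.0 - fp_fraction) * integer_only + fp_fraction * fp_only


for share in (0.0, 0.1, 0.5, 1.0):
    print(f"FP share {share:.0%}: ~{module_throughput(share):.2f} cores' worth")
```

Under these assumptions a module lands at 2.0 for pure integer work, 1.2 for pure floating point, and somewhere in between for a mix, which is the shape of the argument above rather than a prediction.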
August 29, 2010 1:23:12 AM

Also allows for AVX
August 29, 2010 4:39:59 AM

I have a feeling that BD and SB are going to trade blows. First off, BD looks impressive as it is, and since it's a 100% new arch we may be underestimating it (or the opposite; let's hope that's not the case). Besides that, look at the link cjl put up. At the end they do mention that single-thread performance will be significantly higher, and they mention many new instruction sets, which will probably help too.
August 29, 2010 10:12:25 AM

cjl said:
Nope. Nice try. The slide you show is two full cores, and it was AMD's starting point for Bulldozer, but it doesn't represent the finalized architecture. Note that in the slide you linked to, each core has its own integer scheduler and integer logic as well as its own floating point scheduler and FPU (with dual 128 bit FMACs).

In Bulldozer, as shown here, the integer logic is fully duplicated, but there is only one floating point scheduler with dual 128 bit FMACs (identical to the FPU in each independent core in your slide) per bulldozer module. This is why it isn't a true two-core scenario for each module.

http://www.anandtech.com/Gallery/Album/754#6

In effect, a bulldozer module will perform exactly like a true dual in a situation involving exclusively integer math, and it will perform exactly like a hyperthreaded single in a situation involving exclusively floating point. In most real scenarios, it will be somewhere in between, hence why I'm calling it effectively a core and a half.


It's two cores per module. Really.

http://blogs.amd.com/work/2010/08/23/%E2%80%9Dbulldozer...

Quote:
When we talk about cores we will always be using the most agreed upon definition of cores – the integer logic. Today most workloads are integer with a much smaller portion being floating point. This is why we focused on integer cores as the most logical way to define a core.

Each integer core will be able to run one software thread, and these threads can all be done simultaneously, unlike an SMT-type technology that lets two threads share one core.
August 29, 2010 7:21:09 PM

That's because AMD would love to advertise as many cores as possible. All current x86 and x64 cores have their own, fully independent FPU. Therefore, AMD's modules do not have two cores. They do duplicate significantly more core logic than Intel does for hyperthreading, so one module will perform faster than a hyperthreaded core, but slower than two genuinely independent cores.
August 29, 2010 7:23:09 PM

^ Seems to me that's exactly what cjl has been saying all along - two full integer pipes means a BD module can execute 2 integer threads simultaneously, without switching from one thread to another like Intel does with SMT. But they're not full cores because they share resources like decoders, etc. in one module. So if you compare it to Intel, one AMD module is one Intel core.

I predict this new "module vs. core" naming scheme is going to cause much confusion and lots of flaming over the next 12-24 months :D .
Anonymous
August 29, 2010 7:53:35 PM

Yeah, but if that's the case, what's the point of the modular design? If a module runs faster than 1 core with SMT but slower than 2 full cores, and BD is 8 core/4 module, doesn't that translate to being like an Intel quad core with faster SMT? If so, an 8 core BD vs an 8 core SB would be bad, as the 8 core BD would act like a 4 core SMT CPU but with faster SMT. Why trade real core performance for beefier SMT that's slower overall?
Am I just misunderstanding the way you're explaining it?
August 29, 2010 8:03:52 PM

If one AMD module is one intel core, where is the double integer pipeline in the intel core?

If you think they aren't full cores, then that must mean that a single bulldozer "thread" is almost twice a normal intel core, as one thread has access to all the shared components.

There won't be any confusion because AMD won't be marketing modules, just cores.
Anonymous
August 29, 2010 8:13:42 PM

" All current x86 and x64 cores have their own, fully independent FPU. Therefore, AMD's modules do not have two cores. They do duplicate significantly more core logic than Intel does for hyperthreading, so one module will perform faster than a hyperthreaded core, but slower than two genuinely independent cores."

I think the shared resources and duplicated core logic are there to reduce redundancy and make core-to-core communication faster. There WILL be 2 cores, and they will be full cores. Like I said before, if each module acts as a core, then the 4 module BD, which is supposed to be 8 core, would be about the same performance as Intel's quad SB, and Intel plans to release SB with more than 4 cores.

I must have taken this wrong, because the way you make it sound, it would be like fighting Nehalem with a Phenom II X2 with SMT. That would just be plain stupid on AMD's part. So I hope I got your words mixed up.
August 29, 2010 8:16:48 PM

bobdozer said:
If one AMD module is one intel core, where is the double integer pipeline in the intel core?

There isn't one. Hence why I'm calling it a "core and a half".


(I'm starting to feel like a stuck record here)

I'll throw the question right back at you: If one AMD module is two Intel cores, then where is the second FPU in the AMD module? For that matter, where is the second instruction fetch and decode? (I'll ignore the shared L2, since you can have 2 independent cores with a shared L2)
August 29, 2010 8:20:01 PM

Quote:
" All current x86 and x64 cores have their own, fully independent FPU. Therefore, AMD's modules do not have two cores. They do duplicate significantly more core logic than Intel does for hyperthreading, so one module will perform faster than a hyperthreaded core, but slower than two genuinely independent cores."

I think the shared resources and duplicated core logic are there to reduce redundancy and make core-to-core communication faster. There WILL be 2 cores, and they will be full cores. Like I said before, if each module acts as a core, then the 4 module BD, which is supposed to be 8 core, would be about the same performance as Intel's quad SB, and Intel plans to release SB with more than 4 cores.

I must have taken this wrong, because the way you make it sound, it would be like fighting Nehalem with a Phenom II X2 with SMT. That would just be plain stupid on AMD's part. So I hope I got your words mixed up.



Nope. Full cores include an FPU. AMD's modules have two integer units, but only one FPU. Therefore, they are somewhat in between the performance of a full core and a dual core. So, I'd expect a 4 module BD to perform better than a 4 core chip with the same ALU and FPU as BD, and I'd expect it to perform worse than a true 8 core with the same FPUs and ALUs. I'm intentionally not comparing it to sandy bridge here because sandy bridge uses very different ALUs and FPUs than BD, so it's difficult to compare them without actual test numbers on each (which we don't have).
August 29, 2010 8:21:00 PM

Let's get to the core of this discussion here, so we won't have to repeat ourselves. Meanwhile, carry on...
Anonymous
August 29, 2010 8:33:00 PM

"If nothing else, Bulldozer should have very good floating-point performance. AMD claims that since the FPU is one of the shared parts of the machine, engineers could beef it up because the cost of the additional hardware is amortized over the two threads. Given enough memory bandwidth, this chip may be a floating-point monster. "
http://arstechnica.com/business/news/2010/08/evolution-...

Nothing definitive by any means, as they can only speculate, but it's possible. This makes the single FPU per module point not that important, don't you think? At least IF they can beef it up to balance it out.

BD is just so exciting to me, as there are so many things that we don't understand, and so many things that could be revolutionary or a huge flop. I guess we should just wait and see, but it's just so hard to hold your tongue, heh.
August 29, 2010 8:34:14 PM

Quote:
"If nothing else, Bulldozer should have very good floating-point performance. AMD claims that since the FPU is one of the shared parts of the machine, engineers could beef it up because the cost of the additional hardware is amortized over the two threads. Given enough memory bandwidth, this chip may be a floating-point monster. "
http://arstechnica.com/business/news/2010/08/evolution-...

Nothing definitive by any means, as they can only speculate, but it's possible. This makes the single FPU per module point not that important, don't you think? At least IF they can beef it up to balance it out.

BD is just so exciting to me, as there are so many things that we don't understand, and so many things that could be revolutionary or a huge flop. I guess we should just wait and see, but it's just so hard to hold your tongue, heh.

I agree that it's a fairly beefy FPU, but it still isn't as powerful as two independent ones. Still, I am quite optimistic about Bulldozer's performance, and I'd love to see it competitive against Sandy Bridge.
August 29, 2010 8:39:28 PM

cjl said:
I'll throw the question right back at you: If one AMD module is two Intel cores, then where is the second FPU in the AMD module? For that matter, where is the second instruction fetch and decode? (I'll ignore the shared L2, since you can have 2 independent cores with a shared L2)


They are shared amongst the 2 cores. The whole point of the architecture is to share the most common aspects while boosting them enough that hardly anything is lost.

It's all about the die space that can be saved doing this, while losing as little performance as possible. While a "4 core" Bulldozer chip should lose slightly to a 4 core Sandy Bridge chip (as it will still be 4 threads vs 8 threads), the Sandy Bridge chip will be almost double the size so the true metric would be an "8 core" (8 integer units) Bulldozer vs a 4 core Sandy Bridge.

That would be 8 threads vs 8 threads at similar die space and probably very similar TDP. The reason Bulldozer will slaughter Sandy Bridge is the use of 8 real integer units compared to Sandy Bridge's 4 real integer units and 4 "fake" ones from hyperthreading.

None of these are "cores and a half". The average performance increase from Hyperthreading is 20%, and Bulldozer's extra real int core supposedly gives an 80% increase. They are both as far from being a "core and a half" as each other: the Intel method is much closer to single core performance and the AMD method much closer to dual core performance.
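
The 20% and 80% figures above can be turned into a quick back-of-the-envelope comparison. This is a sketch only: `chip_throughput` is a hypothetical helper, and it assumes both designs have the same single-thread speed, which nobody outside AMD and Intel knows yet.

```python
def chip_throughput(units, scaling_per_unit, single_thread_perf=1.0):
    """Aggregate throughput with every hardware thread busy.

    units: number of modules (Bulldozer) or cores (SMT design).
    scaling_per_unit: two-thread throughput of one unit relative to a
        single thread on that unit (1.8 and 1.2 are the figures quoted above).
    single_thread_perf: per-thread speed relative to some baseline;
        unknown for both chips, so it defaults to 1.0 (an assumption).
    """
    return units * scaling_per_unit * single_thread_perf


bulldozer = chip_throughput(units=4, scaling_per_unit=1.8)  # 4 modules, 8 threads
smt_quad = chip_throughput(units=4, scaling_per_unit=1.2)   # 4 cores, 8 threads

print(f"4-module Bulldozer: {bulldozer:.1f}")
print(f"4-core SMT chip:    {smt_quad:.1f}")
# How much faster per thread the SMT cores would need to be to tie.
print(f"Break-even single-thread advantage: {bulldozer / smt_quad:.2f}x")
```

On these numbers the SMT quad would need roughly a 1.5x single-thread advantage to draw level, so the outcome hinges on per-thread speed at least as much as on how the threads are counted.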
August 29, 2010 8:41:56 PM

Quote:
Yeah, but if that's the case, what's the point of the modular design? If a module runs faster than 1 core with SMT but slower than 2 full cores, and BD is 8 core/4 module, doesn't that translate to being like an Intel quad core with faster SMT? If so, an 8 core BD vs an 8 core SB would be bad, as the 8 core BD would act like a 4 core SMT CPU but with faster SMT. Why trade real core performance for beefier SMT that's slower overall?
Am I just misunderstanding the way you're explaining it?


Actually, "modular" means something different than having modules. For example, Intel's Nehalem is said to be modular because its core/uncore architecture is easy to adapt to various scenarios, such as 2 or 4 or 6 core versions.

But to answer your question, adding a full extra core (or AMD module) takes an extra 100% of the die space that a full core uses (not including any uncore changes needed to support that extra core). Adding a full integer pipe uses just 12% extra die space according to AMD. And adding the extra registers and other hardware to a core to let it switch between threads and back again just takes 5% extra die space according to Intel. It's really just a difference in the amount of hardware that AMD and Intel want to throw at multithreaded apps. For some scenarios, such as lightly-loaded threads, SMT probably makes the most sense; for multiple heavy integer tasks (like some games), AMD's solution will probably be best. And for the ultimate performance (and most expensive) solution, just buy a CPU with more full cores, like Thuban or Westmere.
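
The area figures above can be combined with the throughput claims quoted earlier in the thread (1.8x for a module, 1.2x for SMT) to get a rough throughput-per-area number. Treat this as a sketch: the `designs` table and `throughput_per_area` are illustrative names, the 2.0x for two full cores assumes perfect scaling, and whether the 12% and 5% refer to core area or whole-die area is not settled here.

```python
# (relative die area, relative two-thread throughput); one plain core = 1.0
designs = {
    "two full cores": (2.00, 2.0),       # assumes perfect scaling (optimistic)
    "core + integer pipe": (1.12, 1.8),  # 12% area per AMD, 1.8x as claimed above
    "core + SMT": (1.05, 1.2),           # 5% area per Intel, 1.2x as claimed above
}


def throughput_per_area(area, throughput):
    """Throughput delivered per unit of die area (higher is better)."""
    return throughput / area


for name, (area, throughput) in designs.items():
    print(f"{name:20s} {throughput_per_area(area, throughput):.2f}")
```

If the vendor claims hold, the extra integer pipe gives the most throughput per unit of silicon of the three options, which would explain why AMD considers the trade worthwhile.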

Anyway, as was said before, this is all speculation until we have both shipping BD and SB CPUs to compare.
August 29, 2010 8:45:48 PM

JAYDEEJOHN said:
Let's get to the core of this discussion here, so we won't have to repeat ourselves. Meanwhile, carry on...


LOL, and here I was starting to think we were gonna beat this thread to a corpse...

Told ya, this "module vs. core" thing is gonna be a real headache and flamebait topic :p .
August 29, 2010 8:45:51 PM

bobdozer said:
They are shared amongst the 2 cores. The whole point of the architecture is to share the most common aspects while boosting them enough that hardly anything is lost.

It's all about the die space that can be saved doing this, while losing as little performance as possible. While a "4 core" Bulldozer chip should lose slightly to a 4 core Sandy Bridge chip (as it will still be 4 threads vs 8 threads), the Sandy Bridge chip will be almost double the size so the true metric would be an "8 core" (8 integer units) Bulldozer vs a 4 core Sandy Bridge.


I love die size proclamations and assumptions. We don't know any of the die sizes or pricing yet on BD, so let's wait and see there.

bobdozer said:

That would be 8 threads vs 8 threads at similar die space and probably very similar TDP. The reason Bulldozer will slaughter Sandy Bridge is the use of 8 real integer units compared to Sandy Bridge's 4 real integer units and 4 "fake" ones from hyperthreading.

None of these are "cores and a half". The average performance increase from Hyperthreading is 20%, and Bulldozer's extra real int core supposedly gives an 80% increase. They are both as far from being a "core and a half" as each other: the Intel method is much closer to single core performance and the AMD method much closer to dual core performance.

As I said before, they are cores and a half, in essence, since they duplicate some parts of the core while leaving others alone. I really don't know what's so difficult to understand about that. Also, as I said above, it will perform very close to a true dual on integer-heavy operations, and very close to a hyperthreaded single on floating point heavy operations.
Anonymous
August 29, 2010 8:57:39 PM

So because of the way AMD is implementing their "modules", even though they are sharing resources, they are doing it in a way that negates the loss of power, or trying to at least. (Is that it in a nutshell?)
OK, I know a true dual core does not mean 2x performance, but how close is it really? If the BD module is supposed to be 1.8, and 2 full cores aren't 2.0, it should be a very small difference indeed, if the 1.8 target can be attained.
Oh, I just noticed something: I've been thinking under the assumption that SB will be over quad at launch... but it won't. So I was thinking 8 cores with SMT (16 threads) vs BD with 8 cores/8 threads. So the thought of a module performing lower than 2 full cores freaked me out. Dunno why I was thinking 8 core/16 thread SB for desktop; I knew better than that. Although I've spent very little time reading about Intel products. (Not because I'm an AMD fanboy, I'm not, but because I despise Intel's business practices, which hurt us, the consumers.)
August 29, 2010 8:57:44 PM

cjl said:
No it isn't. A real core has both an ALU and an FPU, and in the BD modules, the ALU is duplicated, but they share an FPU. It's more like a core and a half than two cores.


Each integer core can have its own 128-bit FPU if it needs it.

The concept that a core must have ALU and FPU ignores years of processors without an integrated FPU.

Since ~90% of the workload is integer and ~10% is FP, I would think that most customers are more concerned about having more integer execution pipelines.
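
One way to picture "its own 128-bit FPU if it needs it" is as a split of the module's two 128-bit FMAC pipes between whichever threads are doing floating point at the moment. The sketch below assumes an even split and uses made-up names (`FMAC_PIPES`, `peak_fp_bits_per_thread`); the real FP scheduler is more dynamic than this.

```python
FMAC_PIPES = 2         # per module, as described earlier in the thread
PIPE_WIDTH_BITS = 128  # width of each FMAC pipe


def peak_fp_bits_per_thread(fp_active_threads):
    """Peak FP datapath width available to each thread in one module.

    fp_active_threads: how many of the module's two threads are issuing
    floating-point work right now (0, 1 or 2). Assumes the pipes are
    divided evenly, which is a simplification.
    """
    if fp_active_threads not in (0, 1, 2):
        raise ValueError("a module runs at most two threads")
    if fp_active_threads == 0:
        return 0
    return FMAC_PIPES * PIPE_WIDTH_BITS // fp_active_threads


print(peak_fp_bits_per_thread(1))  # one FP-heavy thread can use both pipes
print(peak_fp_bits_per_thread(2))  # two FP-heavy threads get one pipe each
```

So a lone FP thread sees the equivalent of a 256-bit unit, while two FP threads each see 128 bits, which matches both the "shared FPU" and the "own 128-bit FPU" descriptions depending on the load.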
August 29, 2010 9:12:54 PM

jf-amd said:
Each integer core can have its own 128-bit FPU if it needs it.

The concept that a core must have ALU and FPU ignores years of processors without an integrated FPU.

Since ~90% of the workload is integer and ~10% is FP, I would think that most customers are more concerned about having more integer execution pipelines.

Based on Sandy Bridge performance numbers, will Bulldozer be competitive with it?
August 29, 2010 9:14:17 PM

jf-amd said:
Each integer core can have its own 128-bit FPU if it needs it.

The concept that a core must have ALU and FPU ignores years of processors without an integrated FPU.

Since ~90% of the workload is integer and ~10% is FP, I would think that most customers are more concerned about having more integer execution pipelines.

It does ignore that, yes. It's completely true that cores without integrated fp have been done many times. However, modern x86 and x64 cores do have both integer and FP, and to have a core that you can compare with an Intel (or a previous AMD) core, you have to look at both integer and FP performance.

Now, I agree that for many workloads, this approach looks like it has a lot of potential, and I'm really looking forward to seeing its performance. I am very optimistic about BD, and I can't wait to see benchmarks and actual data. I just don't think you can call it an "8-core" processor completely accurately if it has 4 modules, since an 8-core of any other architecture has 8 FPUs in it while the BD would only have 4.

It all comes down to how you define a core. I'm trying to keep a definition that's as consistent as possible with other x64 processors.
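
Since the disagreement is about definitions, it can help to write them down and count. This is a sketch with hypothetical names (`Chip`, `core_count`); the unit counts are the ones discussed in this thread for a 4-module Bulldozer and a generic quad core with Hyper-Threading, not confirmed product specs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chip:
    name: str
    integer_units: int     # complete integer pipes
    fpus: int              # FP schedulers (one per Bulldozer module)
    hardware_threads: int  # threads the chip can run at once


def core_count(chip, definition):
    """Count 'cores' under one of several competing definitions."""
    if definition == "integer":  # AMD's stated definition
        return chip.integer_units
    if definition == "fpu":      # one core per FPU
        return chip.fpus
    if definition == "full":     # needs both an integer unit and an FPU
        return min(chip.integer_units, chip.fpus)
    if definition == "threads":  # advertise by thread count instead
        return chip.hardware_threads
    raise ValueError(f"unknown definition: {definition}")


bd = Chip("4-module Bulldozer", integer_units=8, fpus=4, hardware_threads=8)
quad_ht = Chip("quad core with Hyper-Threading", integer_units=4, fpus=4,
               hardware_threads=8)

for chip in (bd, quad_ht):
    counts = {d: core_count(chip, d) for d in ("integer", "fpu", "full", "threads")}
    print(chip.name, counts)
```

The Bulldozer part comes out as 8 or 4 depending on the definition chosen, while the thread count is the one figure where both chips agree.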
August 29, 2010 9:18:30 PM

cjl said:
It does ignore that, yes. It's completely true that cores without integrated fp have been done many times. However, modern x86 and x64 cores do have both integer and FP, and to have a core that you can compare with an Intel (or a previous AMD) core, you have to look at both integer and FP performance.

Now, I agree that for many workloads, this approach looks like it has a lot of potential, and I'm really looking forward to seeing its performance. I am very optimistic about BD, and I can't wait to see benchmarks and actual data. I just don't think you can call it an "8-core" processor completely accurately if it has 4 modules, since an 8-core of any other architecture has 8 FPUs in it while the BD would only have 4.

It all comes down to how you define a core. I'm trying to keep a definition that's as consistent as possible with other x64 processors.

But you have to admit that with BD the FPU is a bit... odd. I mean, like we said before, it does have 2 128-bit FMACs per module, so I guess 8 cores for four modules is alright.
Anonymous
August 29, 2010 9:20:58 PM

I forgot about the 2x 128-bit FPU or the 1x 256-bit. I get it now... So when I read that a BD module was essentially a quad core hardware enabled single core with CMT allowing performance of up to 1.8 per core rather than Intel's 1.2 per core SMT, it was true. If you had to put it in the simplest way possible, would that be somewhat accurate?
August 29, 2010 9:21:12 PM

yannifb said:
But you have to admit that with BD the FPU is a bit... odd. I mean, like we said before, it does have 2 128-bit FMACs per module, so I guess 8 cores for four modules is alright.

Agreed. I'm not sure how I would advertise it, since I do agree that calling a 4 module BD "4 core" would also be wrong.

You could just advertise by thread count...