tomjgum1

Distinguished
Sep 15, 2010
123
0
18,690
I have an i7 quad (Bloomfield) and my friend is getting an AMD FX 8-core.

He brags that it has 8 physical cores, not like mine with just 4 cores and 8 threads.

I said no. Am I right or wrong?
 
Show him this: http://www.anandtech.com/bench/Product/434?vs=47

That's the slowest i7 Bloomfield (920) against an 8150. The 8150 beats it in a good many things, but the i7 920 keeps up with it in a lot of things and even beats it in a few.

If you go up to the i7 940, things even out further. http://www.anandtech.com/bench/Product/434?vs=46

And a 950 pretty much takes away any advantage that the 8150 has. http://www.anandtech.com/bench/Product/434?vs=100

The fact is that an 8120/8150 is only really good for highly threaded uses.
 


AMD and Intel cores are not the same. In terms of integer throughput, an Intel Sandy Bridge core will execute and retire 2.5 times as many instructions per clock cycle as an equivalently clocked AMD Bulldozer core. This means that even in the best-case scenario, an 8-core Bulldozer CPU still falls behind a 4-core Intel CPU when clocked identically. Approximately 10 Bulldozer cores (5 modules) are necessary to match clock-for-clock performance, or the clock speeds must be increased.

Nehalem is a bit older than Sandy Bridge, but it still executes approximately twice as many integer instructions per core per clock cycle as Bulldozer. Since programming is context sensitive, most applications benefit more from a single beefy core than from a number of weaker cores in parallel.
 
It is technically true that his processor has 8 physical cores and yours has only four. Performance is all that matters, though (see above).
Pinhedd, can I see a source for that 2.5x IPC claim? I don't think it's accurate. Intel CPUs are definitely ahead in IPC right now, but not 2.5x.
 
The extra cores are great... if you are using programs that can make use of all of them.

However, the cores are not independent of each other. Every two cores form a module and share resources such as the L1 instruction cache, fetch, branch prediction, decode, and the FPU. Intel CPUs have independent cores that do not share these resources with other cores, so they avoid the bottleneck that sharing can create.
 


It's almost exactly 2.5 times.

i7-2600 @ 3.4 GHz (3.5 GHz with turbo, but Dhrystone is pretty hard on the TDP) = 128,000 MIPS

128,000 MIPS / 4 cores = 32,000 MIPS per core

32,000 MIPS per core / 3,400 MHz = ~9.4 instructions per core per clock

FX-8150 @ 3.6 GHz (up to 3.9 GHz with full turbo, but Dhrystone is pretty hard on the TDP) = 108,000 MIPS

108,000 MIPS / 8 cores = 13,500 MIPS per core

13,500 MIPS per core / 3,600 MHz = ~3.75 instructions per core per clock

I'm under the impression that AMD's turbo is more aggressive than Intel's, so I'll compute it at 3.9 GHz as well.

13,500 MIPS per core / 3,900 MHz = ~3.46 instructions per core per clock

Now we compare the IPC:

9.4 / 3.75 = 2.5x

9.4 / 3.46 = 2.7x

So it's arguably somewhere in the range of 2.5 times the IPC, which is in agreement with what we know about the architecture.

The MIPS page on Wikipedia has a bigger list and a similar conclusion

http://en.wikipedia.org/wiki/Million_instructions_per_second
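
For anyone who wants to check the arithmetic above, here is a minimal Python sketch that redoes it. The MIPS, core-count, and clock figures are simply the ones quoted in the post and are not independently verified, and the function name is made up for illustration. It also treats Dhrystone MIPS divided by clock speed as "instructions per clock", which is the post's own simplification.

```python
def per_core_per_clock(mips, cores, mhz):
    """Dhrystone MIPS per core per MHz (the post's 'instructions per core per clock')."""
    return mips / cores / mhz

# Figures as quoted in the post (assumed, not re-measured).
i7_2600 = per_core_per_clock(128_000, 4, 3_400)        # base clock
fx_8150_base = per_core_per_clock(108_000, 8, 3_600)   # base clock
fx_8150_turbo = per_core_per_clock(108_000, 8, 3_900)  # full turbo

ratio_base = i7_2600 / fx_8150_base
ratio_turbo = i7_2600 / fx_8150_turbo

print(f"i7-2600:            {i7_2600:.2f} per core per clock")
print(f"FX-8150 @ 3.6 GHz:  {fx_8150_base:.2f} per core per clock")
print(f"FX-8150 @ 3.9 GHz:  {fx_8150_turbo:.2f} per core per clock")
print(f"Ratio at base clock:  {ratio_base:.1f}x")
print(f"Ratio at turbo clock: {ratio_turbo:.1f}x")
```

The result depends entirely on dividing the i7 score by 4 and the FX score by 8, so the ratio is only as meaningful as that choice of divisor.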
 

noob2222

Distinguished
Nov 19, 2007
2,722
0
20,860

The question is which Intel CPU you use for your calculations. Sure, if you want to make AMD look stupid, go with the i7 and call its 8 threads 4 cores. You get an inflated value, since HT gets its biggest boost in synthetic testing.

[Image: sandra multimedia.png]


40% to be exact, for a synthetic test. How often does HT scale 40% in real-world apps? Never. Most often it's ~15%.

Using the i7 and calling it a "true quad core" is like saying the FX is a "true octo core". Both are wrong. After all, according to your information, at 2.5 times faster, AMD should never be faster than Intel.

[Image: photoshop.png]
Oops.
 


As a computer engineer myself I can't help but agree. MIPS is pretty worthless for comparing dissimilar architectures, but Bulldozer and Sandy Bridge are not dissimilar architectures. They both have the same high-level base instruction set and design objectives, and thus it's perfectly fair to compare them, especially since they run the same applications. Dhrystone was designed to saturate integer throughput, so it's a pretty good indicator of what each architecture is capable of, and the numbers provided are fairly close to what each manufacturer has claimed for its architecture. The differences between Sandy Bridge and Bulldozer are easily reflected in other benchmarks as well.



Hyperthreading only scales in suboptimal applications, which encompass well over 80% of real-world applications. SMT does not physically change the capabilities of the back end; it only allows the back end to be used for multiple contexts at the same time, so that execution resources go unused less often. A suboptimal application is going to be suboptimal whether it's run on a Sandy Bridge processor or a Bulldozer processor.

A 4-core, 8-thread processor with Hyperthreading enabled has the exact same back-end execution resources as the same 4-core, 4-thread processor with Hyperthreading disabled. The fact that a benchmark scales with Hyperthreading at all indicates that its code is leaving resources unused and is thus not an ideal way to test the theoretical limits of the architecture.

This is why I chose the Dhrystone benchmark. It's a very mature benchmark that is designed to saturate integer throughput without being constrained by anything else. It does not expose real-world application performance very well, but it does expose the theoretical limits of the architecture, and those limits are very close to the ones claimed by AMD and Intel in their design sheets.
 

ElMoIsEviL

Distinguished



Well this is how I look at it...

Core i7
Your Core i7 Quad is a 4-core part with Hyperthreading, allowing it to execute up to 8 threads at any given time. It does so by executing a second thread while the previous thread is still making its way through the pipeline. No execution hardware is physically replicated, so the 4 extra hyperthreads are nothing more than logical cores, not physical cores. The following image illustrates what Hyperthreading does:
[Image: 840XEDC.jpg]



AMD FX
His AMD FX 8-core CPU is also not an 8-core processor per se. The AMD FX series is made up of modules. Each module contains two execution units (integer units) that share the same floating-point unit. His AMD FX has four of those modules, for a total of 8 execution units but with shared FPU resources. The following image illustrates a single AMD FX module:
[Image: amd-bulldozer-module-2.jpg]
 

noob2222


Completely missed the point. You can't divide Intel's numbers by 4 on an 8-thread test and pretend that it's accurate to divide AMD's by 8 just because AMD decided to call it an 8-core CPU.

You either use the i5 or divide the i7 by 8, since it tested 4+4 cores vs 4 modules (8 cores) for AMD. But then Intel would only be 1.25 times better instead of 2.5. Then again, maybe we should just divide Intel's numbers by 2 and AMD's by 16; that way it's 10 times faster.

Either way, at the end of the day MIPS is just that: a Meaningless Indicator of Processor Speed, as only a synthetic test even cares.
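
The point about divisors can be made concrete with a short Python sketch. It reuses the Dhrystone figures quoted earlier in the thread (128,000 MIPS at 3.4 GHz for the i7-2600, 108,000 MIPS at 3.6 GHz for the FX-8150), which are taken as given rather than verified. The helper names are hypothetical.

```python
# Figures quoted earlier in the thread (assumed, not re-measured).
I7_MIPS, I7_MHZ = 128_000, 3_400
FX_MIPS, FX_MHZ = 108_000, 3_600

def per_unit_per_clock(mips, units, mhz):
    """MIPS per 'unit' per MHz, where a unit is whatever we decide to count."""
    return mips / units / mhz

def ratio(i7_units, fx_units):
    """Intel-to-AMD ratio for a given choice of divisors."""
    return (per_unit_per_clock(I7_MIPS, i7_units, I7_MHZ)
            / per_unit_per_clock(FX_MIPS, fx_units, FX_MHZ))

scenarios = [
    ("i7 as 4 cores, FX as 8 cores", 4, 8),
    ("i7 as 8 threads, FX as 8 cores", 8, 8),
    ("i7 as 4 cores, FX as 4 modules", 4, 4),
    ("i7 / 2, FX / 16 (the joke case)", 2, 16),
]

for label, i7_units, fx_units in scenarios:
    print(f"{label}: {ratio(i7_units, fx_units):.2f}x")
```

Whenever both chips are divided by the same number the ratio is the whole-processor, clock-normalized ratio of about 1.25x, so the 2.5x figure comes from counting 4 units on one side and 8 on the other. Whether that counting is fair is exactly what the thread is arguing about.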
 

Isn't real-world performance more important than theoretical limits? Intel isn't 2.5x faster than AMD in any real-world benchmarks, as far as I know, so it doesn't really make sense to brag about its performance as such.
 


You clearly have no idea what you are talking about. It is not tested as 4 + 4 cores, it is tested as 4 cores.

Intel's i7 processors have fully independent cores, each with its own front end, back end, L2 cache, instruction cache, and data cache. Disabling Hyperthreading doesn't physically change the execution capacity at all. The micro-ops will simply be decoded from one thread rather than two. The purpose of Hyperthreading is to allow suboptimal and orthogonal workloads from separate thread domains to share the same back-end resources. Since Dhrystone is a very optimized integer benchmark, it does not scale with Hyperthreading because no resources are left unused. Running two Dhrystone workloads on the same core via SMT will result in each workload taking on average twice as long to complete.

Dhrystone scales linearly with SMP; it does not scale with SMT. You will get almost the exact same results with 4C/4T as you will with 4C/8T. In fact, you might find that Dhrystone SMT scaling is even negative due to increased cache overhead, which would make 4C/4T ever so slightly higher. Go ahead and take a look at some benchmarks.

AMD's Bulldozer processors have 2 cores per module with a shared instruction cache and FPU. Disabling a Bulldozer core disables the front end, back end, and data cache of that core.
 


You're absolutely right, real-world performance is where it counts. There are a number of applications where the spread between the Bulldozer cores and Sandy Bridge cores is between a factor of 1.5 and 2.0. Most of the 2.0+ cases are synthetic. Skyrim, for example, shows a nearly 100% lead for an i7-2600 over an FX-8150. It's notorious for having very limited threading and is a heavily CPU-bound game.

Remember, this is a core/core comparison, not a processor/processor comparison.
 

tomjgum1

So in this case, would an FX be comparable to a Xeon because of its threading capabilities?

In that case a Xeon isn't as good for gaming,

and therefore gaming on an FX wouldn't be as good as on an i7, or even on an i5 (or only about equal to it) for that matter.
 


Gaming on an FX isn't bad if you're playing a game that can break down 2 threads into 4, or 4 threads into 8, without losing performance or increasing the frame pitch. If the game is stuck with 1, 2, or 4 threads, then the Sandy Bridge i5 and i7 processors will pull far ahead, because each core has substantially better integer capabilities within the same domain.
 

tomjgum1



Better off with Intel then?
 

A processor-to-processor comparison matters much more than a core-to-core, no? Since the cores are not equivalent, as has been discussed, the only benchmark is a fully-enabled 8150 vs. a fully-enabled 2600, and the difference, I think, is even smaller there.
 


Yes. Intel currently offers better solutions at all price points over $150. Hopefully AMD can fix the inefficiencies in their architecture and release something that's more competitive.



A core-to-core comparison is appropriate when comparing programs that are constrained by the number of cores they can effectively use at once (such as games). Sure, processor-to-processor paints a better picture for highly threaded workloads, but right now that really just encompasses media encoding and synthetic benchmarks. The overwhelming majority of everyday applications are still suited to 4 hardware threads or fewer, and in that case the second half of the Bulldozer processor might as well not exist.
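
As a rough illustration of why the thread count matters here, the following Python sketch applies a deliberately simple toy model to the Dhrystone figures quoted earlier in the thread. The model is an assumption made for illustration only: each busy core delivers its full per-core rate, threads beyond the core count add nothing, and turbo, shared module resources, and SMT gains are all ignored. The function name is hypothetical.

```python
# Per-core Dhrystone MIPS derived from the figures quoted earlier in the thread.
I7_PER_CORE = 128_000 / 4   # i7-2600, counted as 4 cores
FX_PER_CORE = 108_000 / 8   # FX-8150, counted as 8 cores

def toy_throughput(threads, cores, per_core):
    """Toy model: throughput grows with thread count until the cores run out."""
    return min(threads, cores) * per_core

for threads in (1, 2, 4, 8):
    i7 = toy_throughput(threads, 4, I7_PER_CORE)
    fx = toy_throughput(threads, 8, FX_PER_CORE)
    print(f"{threads} thread(s): i7-2600 {i7:,.0f} vs FX-8150 {fx:,.0f} "
          f"(ratio {i7 / fx:.2f}x)")
```

Under these assumptions the gap stays at roughly 2.4x for 1 to 4 threads and narrows to roughly 1.2x only once all 8 FX cores are busy. These are stock-clock figures rather than clock-normalized ones, and real applications will not follow such a clean model.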
 

noob2222


Really? So it's just an illusion that the i7 is faster than the i5?

[Image: sandra arithmetic.png]


http://www.tomshardware.com/reviews/fx-8150-zambezi-bulldozer-990fx,3043-14.html

Yeah... the i7 and i5 are the same... but I'm clueless.

It does not scale with Hyperthreading, it just magically makes the numbers higher.

Do your homework before you try to insult someone next time.