Testing OS with dual and Quad CPU, why does the performance not double

November 15, 2012 4:33:13 PM

Hi

I have been testing Windows XP, Vista, 7 and 8, encoding the same video file with Handbrake on both a Core 2 Duo E6600 and a Q6600, which are identical (cache, clock speed, etc.) apart from the core count (dual vs quad). Here are my results:

OS       Cores   Total time (secs)   Increase vs dual
XP       Dual    375                 -
Vista    Dual    394                 -
7        Dual    384                 -
8        Dual    383                 -
XP       Quad    267                 28.80%
Vista    Quad    272                 30.96%
7        Quad    269                 29.95%
8        Quad    267                 30.29%


As you can see, the increase in performance is around 30% on every OS. Could someone please explain why it didn't double or better (all cores were used in each test)? I don't understand.
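
For reference, this is how I worked out the percentages (a quick Python sketch using the times in the table):

    # How the percentage figures were calculated: the reduction in total
    # encode time going from the dual-core run to the quad-core run.
    dual = {"XP": 375, "Vista": 394, "7": 384, "8": 383}
    quad = {"XP": 267, "Vista": 272, "7": 269, "8": 267}

    for os_name in dual:
        reduction = (dual[os_name] - quad[os_name]) / dual[os_name]
        print(f"{os_name}: {reduction:.2%} less encode time on the quad")
    # XP: 28.80%, Vista: 30.96%, 7: 29.95%, 8: 30.29%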

Thanks
Dom
November 15, 2012 4:43:59 PM

Well, two things: system overhead and drive speed. If the drive cannot feed data to the CPU as fast as it can encode, then there is a bottleneck and the job will never get faster than that.

Thent
November 15, 2012 4:54:13 PM

thently said:
Well, two things: system overhead and drive speed. If the drive cannot feed data to the CPU as fast as it can encode, then there is a bottleneck and the job will never get faster than that.

Thent

Also, the gains (or even losses in some cases) are not linear. Multiple factors (memory throughput, chipset limitations, etc.) can affect comparisons like this.

Keep in mind that the more cores the system has to manage, the more overhead is required to synchronize those processes. It is not a simple progression to measure without looking at all the factors contributing to a system's performance.
November 15, 2012 5:00:45 PM

First, let's go over the expectations.
If you only doubled the processors, why would you think performance would increase by MORE than 50%?

In the best case, with perfect efficiency everywhere, you get a 50% increase.

There is overhead in getting 4 workers to cooperate on a single job, compared to 2 workers, compared to 1 worker. Or, as the poster above said, maybe you are maxed out elsewhere and have a bottleneck.

Worst case is 0% gain. It is actually possible to get negative "improvement" if the overhead of coordinating the work takes more time than the work itself.

But you got 30%. That's a pretty good boost.

Just as a first tip, and an example of complexity and overhead: I bet CPU affinity is not set for the threads. So the 4 threads doing the parallel work are getting swapped around across all your cores. I bet this was also true in your dual-core test... but shuffling 4 threads around is more complex than shuffling 2.

If you really want to get into it, I suggest finding the support forum for your particular app (HandBrake) and asking about multiprocessing performance expectations there.
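
If you want to experiment with affinity yourself, something like this pins a running process to specific cores (a rough sketch using the third-party psutil Python package, which has nothing to do with HandBrake itself; the process name below is just a guess):

    # Pin every matching process to a fixed set of cores so its threads stop
    # migrating between them. Requires the psutil package (pip install psutil).
    import psutil

    def pin_process(name, cores):
        for proc in psutil.process_iter(["name"]):
            if proc.info["name"] and name.lower() in proc.info["name"].lower():
                proc.cpu_affinity(cores)  # e.g. [0, 1] = first two cores only
                print(f"Pinned PID {proc.pid} to cores {cores}")

    # Hypothetical usage: keep the encoder on all four cores of the Q6600.
    pin_process("HandBrake", [0, 1, 2, 3])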
November 15, 2012 5:03:58 PM

Two key functions of any operating system are Memory Management, and Scheduling. When you go from two cores to four, obviously the operating system's overhead at least doubles because now it has to schedule four cores instead of two, and assign threads to twice as many cores. It manages the same memory, but it now has four cores accessing programs in memory instead of just two. And of course the storage subsystem now has four cores that can simultaneously attempt to access the file system on the hard drive, instead of just two. And no matter how many parallel threads Handbrake may spawn, they will have to wait their turn to read and write the hard drive, which is by far the slowest part of any PC.
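
To put rough numbers on that: if some fraction of the job is serial (scheduling, disk waits, memory contention) and only the rest runs in parallel, the achievable speedup is capped no matter how many cores you add. A back-of-the-envelope sketch using Amdahl's law (an illustrative model, not something measured on the OP's system):

    # Amdahl's law: with a parallel fraction p, the speedup over one core
    # using n cores is 1 / ((1 - p) + p / n).
    def speedup(p, n):
        return 1.0 / ((1.0 - p) + p / n)

    for p in (0.60, 0.75, 0.90, 1.00):
        dual, quad = speedup(p, 2), speedup(p, 4)
        reduction = 1.0 - dual / quad  # time saved going from 2 to 4 cores
        print(f"p={p:.2f}: quad is {quad/dual:.2f}x the dual, "
              f"{reduction:.0%} less wall-clock time")
    # Only p=1.00 gives the ideal 2.00x (50% less time); around p=0.75 the
    # model lands near the ~30% time reduction the OP measured.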
November 15, 2012 5:07:37 PM

Yeah, what raytseng said about threads being swapped around instead of staying with one core, that's a great point.
November 15, 2012 8:47:16 PM

Thanks to everyone who gave their input here. I have a much better understanding of the gains from dual to quad core now, and I wonder why I didn't consider this stuff myself; it makes a lot of sense. Really appreciate all the useful input :wahoo: 
November 15, 2012 9:45:32 PM

Try encoding from a RAM drive, or at least use 2 hard drives: 1 for input, 1 for output.

Or use a network share for input and a local HDD for output, or the reverse.
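
For example, with HandBrake's command-line build you could point the input and output at different physical drives (the drive letters and file names below are just placeholders):

    # Read the source from one disk and write the encode to another so a
    # single drive isn't handling both the reads and the writes.
    import subprocess

    subprocess.run([
        "HandBrakeCLI",
        "-i", r"D:\source\input.mkv",    # input on one physical drive
        "-o", r"E:\encodes\output.mp4",  # output on a different drive
    ], check=True)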
November 15, 2012 10:14:11 PM

raytseng said:
First, let's go over the expectations.
If you only doubled the processors, why would you think performance would increase by MORE than 50%?

In the best case, with perfect efficiency everywhere, you get a 50% increase.


Wait, what kind of maths is this? A perfect doubling of the processors would result in a doubling of the throughput, a 100% increase. You'd have a total of 200% of the original.
As for the OP: everyone has really summed it up; your test assumes the processor is the bottleneck and that the system has no overhead. Neither of those conditions is likely to be true.
November 16, 2012 7:22:57 PM

I was trying to speak the OP's language, which I agree is a skewed usage of the words. But I made my text match the OP's calculations so it would be easiest for him to understand.

His metric of performance improvement is a % decrease in time compared to the dual core. This is similar to a %-off sale in retail.
It is not a metric of increased throughput; you need to take the inverse to get the throughput increase (throughput goes as 1/time).

So, using the OP's definition of "performance":
Doubling the processors = an expected "performance" gain of 50% off.
Quadrupling the processors = an expected "performance" gain of 75% off.

It is impossible to get a 100% "performance" gain using the OP's metric; that would mean the job finished in 0 seconds (like a 100%-off sale).

Don't get me started on why miles/gallon doesn't work for comparisons, versus gallons/100 miles.
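
For anyone following along, the conversion between the two metrics looks like this (a quick sketch, not tied to any particular tool):

    # A fractional time reduction r corresponds to a throughput factor of
    # 1 / (1 - r), and a throughput factor s to a time reduction of 1 - 1/s.
    def reduction_to_speedup(r):
        return 1.0 / (1.0 - r)

    def speedup_to_reduction(s):
        return 1.0 - 1.0 / s

    print(reduction_to_speedup(0.50))  # 2.0  -> "50% off" == doubled throughput
    print(reduction_to_speedup(0.30))  # ~1.43x, roughly what the OP measured
    print(speedup_to_reduction(4.0))   # 0.75 -> quadrupled throughput == "75% off"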



November 16, 2012 7:57:16 PM

Your test benchmark was not fully multithreaded. If it were, you'd get roughly double the performance.

Anandtech doesn't have benches for the E6600, but they have the E6550 and Q6600 (so a 2.33 GHz dual vs a 2.4 GHz quad), and you can see the heavily threaded applications get about double the score. The reason it isn't perfectly double is that the OS takes a portion of the first core and, as mentioned above, there is scheduling overhead.

http://www.anandtech.com/bench/Product/61?vs=53

Look at Cinebench multithreaded: 5200 vs 9700.
POV-Ray: 975 vs 1996.
The x264 scores are roughly double.
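
The ratios behind those scores work out like this (just arithmetic on the numbers quoted above):

    cinebench = 9700 / 5200   # ~1.87x
    povray    = 1996 / 975    # ~2.05x
    clock     = 2.40 / 2.33   # the Q6600 also clocks ~3% higher than the E6550
    print(f"Cinebench {cinebench:.2f}x, POV-Ray {povray:.2f}x, clock {clock:.2f}x")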

November 17, 2012 6:03:43 AM

Even when software runs the CPU at close to 100% efficiency, you might see something like a 90% improvement rather than a full doubling.

But as has been noted, most software doesn't stress a CPU to 100%, let alone multiple cores. So you typically won't see most software get that much performance gain.

There's also overhead to consider: HDDs, memory access times, OS thread locks, and so on, so you'll NEVER get perfect 100% scaling.