In your update to "Dueling Multicores: Intel and AMD Fight For the Future." you ponder:
"how the operating system would pair off threads for the two-threaded tasks"
The thing is, this isn't a new problem. I'm working on a dual Xeon box right now that has hyperthreading enabled (so the OS sees 4 CPUs), which is basically the same situation you're talking about: does the scheduler split 2 threads across the 2 physical CPUs, or does it try to mash both of them onto one (which would be slower)?
I happen to be using Linux, so I can talk about what Linux does in this situation. I'm pretty sure Windows does something similar. When hyperthreading first came out and people tried to use it on multiprocessor machines (at least 2 CPUs), everyone noted that it was REALLY slow. The reason was that Linux saw each hyperthreading "partition" as a physically separate CPU, so it would just happily schedule threads on the first two execution units it could find (cpu0 and cpu1). That meant that if you were running two processor-intensive tasks, they might both end up on the same physical CPU and run at roughly half speed. Very quickly a patch came out to make the scheduler a little smarter about hyperthreading, and now threads get assigned to cpu0 and cpu3 first and then to cpu1 and cpu2 as necessary. This splits the tasks more evenly across the physical CPUs, but it can still lead to decreases in performance in certain instances.
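To make the difference concrete, here is a toy sketch in Python. It is not the kernel's actual algorithm, just a model of the two policies described above. The numbering (cpu0/cpu1 as hyperthreads of one physical CPU, cpu2/cpu3 of the other) and every name in it are assumptions for illustration; which logical CPU gets picked on the second package doesn't matter, only that it is on a different one.

```python
# Toy model, NOT the real Linux scheduler.
# Assumed layout: 2 physical packages with 2 hyperthreads each.
# cpu0/cpu1 share package 0, cpu2/cpu3 share package 1 (hypothetical numbering).
PACKAGE_OF = {0: 0, 1: 0, 2: 1, 3: 1}


def naive_placement(n_tasks):
    """Fill logical CPUs in numeric order, ignoring physical packages."""
    return sorted(PACKAGE_OF)[:n_tasks]


def ht_aware_placement(n_tasks):
    """Always pick a free logical CPU on the least-loaded physical package."""
    load = {pkg: 0 for pkg in set(PACKAGE_OF.values())}
    free = sorted(PACKAGE_OF)
    chosen = []
    for _ in range(min(n_tasks, len(free))):
        # Lowest package load wins; ties go to the lowest CPU number.
        cpu = min(free, key=lambda c: (load[PACKAGE_OF[c]], c))
        free.remove(cpu)
        load[PACKAGE_OF[cpu]] += 1
        chosen.append(cpu)
    return chosen


def packages_used(cpus):
    """Count how many distinct physical packages a placement touches."""
    return len({PACKAGE_OF[c] for c in cpus})


if __name__ == "__main__":
    for name, policy in (("naive", naive_placement), ("HT-aware", ht_aware_placement)):
        cpus = policy(2)
        print(name, "-> logical CPUs", cpus, "on", packages_used(cpus), "physical CPU(s)")
```

With two busy tasks, the naive policy lands both on one physical CPU, while the HT-aware one uses both and only doubles up once there are more than two tasks.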
I am personally a developer, and I run lots of lightweight apps simultaneously, so for me hyperthreading with 2 CPUs works out well: I can have 4 apps running at once that are all fairly responsive. A lot of the analysts around here, though, choose to turn off hyperthreading so they can run 2 "solves" concurrently, keep the CPUs from doing anything else, and make sure the tasks get split across the CPUs properly.
At any rate, I understand that Windows does something similar: it tries to separate the threads onto different physical CPUs first and then fills up the hyperthreading partitions as a last-ditch effort to stay responsive. I assume it will continue to do the same thing for dual cores.
Just thought I would give a little insight. I actually wouldn't mind Tom's doing some benchmarking in these scenarios. You don't even need dual cores to do the benchmarking: just find a couple of hyperthreaded Xeons, turn HT on and off, and show us the numbers.