I'm not disagreeing; it comes down to managing resources and how the OS prioritizes threads. I just find all the talk of "it only uses x CPUs" silly, because using more CPUs does not always lead to more performance.
I know from working with multi-CPU hardware that every OS on the market (even highly optimized embedded ones) sees decreasing performance once you go beyond 32 cores, and I'd imagine that, with the way Windows is coded, Windows would hit the same brick wall at a much lower core count. I'd wager 12 cores, but with so little software optimized to take advantage of cores dynamically, plus the difficulty of offloading work to different CPUs in Windows, period, it's hard to really tell...
To put things in perspective: for my seminar project in college, I made a game in OpenGL/C++ (think the first Legend of Zelda). On a Pentium 4 with no hyperthreading, the code had support for (and often used) up to 40 threads at once. Threading is NOT hard, and you don't need multiple cores to thread.
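Since people sometimes picture threading as exotic, here's a minimal sketch of the idea. I'm using modern std::thread for brevity (the actual project predates C++11, so this is illustrative, not the original code). The point is that the thread count has nothing to do with the core count; the OS just time-slices them:

```cpp
#include <atomic>
#include <chrono>
#include <iostream>
#include <thread>
#include <vector>

int main() {
    std::atomic<int> work_done{0};

    // Spawn 40 threads -- works fine on a single-core machine,
    // because the scheduler time-slices them all.
    std::vector<std::thread> threads;
    for (int i = 0; i < 40; ++i) {
        threads.emplace_back([&work_done] {
            // Stand-in for per-entity game logic (movement, collision, etc.)
            for (int step = 0; step < 1000; ++step) {
                ++work_done;
                std::this_thread::sleep_for(std::chrono::microseconds(100));
            }
        });
    }
    for (auto& t : threads) t.join();

    std::cout << "40 threads completed " << work_done << " units of work\n";
}
```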
For those interested (since I feel like bragging right now):
All player characters, enemies, and projectiles came from the same base class; the only real differences between them were how they moved (user input vs. random vs. a set path) and collision detection (which characters take damage upon a collision).
1 player character + a maximum of 9 enemies + up to 3 projectiles on screen per character, each getting an independent thread upon creation = 10 + (3 * 10) = 40 threads at one time (not bad for a 1.6 GHz Pentium 4, huh?).
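If it helps to see the shape of that design, here's a rough modern sketch (names and threading API are mine for illustration, not the original code): one base class that owns a thread per entity, and subclasses that only override movement and collision behavior.

```cpp
#include <atomic>
#include <thread>

// Each game object runs its own update loop on its own thread.
class Entity {
public:
    virtual ~Entity() = default;

    // Spawn the thread after construction (not in the constructor,
    // so the virtual move() call never races derived-class setup).
    void spawn() { worker_ = std::thread([this] { run(); }); }

    // Must be called before destruction so the thread never
    // touches a half-destroyed object.
    void stop() {
        alive_ = false;
        if (worker_.joinable()) worker_.join();
    }

protected:
    // The only real differences between entity types:
    virtual void move() = 0;        // user input vs. random vs. set path
    virtual void onCollision() = 0; // which entities take damage

private:
    void run() {
        while (alive_) {
            move();
            // collision checks against shared world state would go here
        }
    }
    std::atomic<bool> alive_{true};
    std::thread worker_;
};

class Player : public Entity {
    void move() override { /* poll user input, update position */ }
    void onCollision() override { /* take damage */ }
};

class Enemy : public Entity {
    void move() override { /* random walk or follow a set path */ }
    void onCollision() override { /* die, award points */ }
};

int main() {
    Player p;
    Enemy e;
    p.spawn();
    e.spawn();
    // ... game would run here ...
    p.stop();
    e.stop();
}
```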
Just for the sake of running a few tests, I did a REALLY basic recode not long ago where each thread would be put on a different CPU. I simply used a counter that reset after the last core was used, so threads were assigned 1-2-3-4-1-2-3-4, etc. It's not perfect, since thread destruction leads to an imbalance across the cores, but good enough for 5 minutes of recoding. Despite using 4x the cores, I saw no tangible performance benefit (as expected).
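For the curious, the counter trick looks roughly like this. This is a sketch assuming Win32 and MSVC, where std::thread::native_handle() yields the HANDLE that SetThreadAffinityMask expects; pinRoundRobin and kCoreCount are names I made up for the example, not the original code:

```cpp
#include <windows.h>
#include <atomic>
#include <thread>

constexpr unsigned kCoreCount = 4;
std::atomic<unsigned> next_core{0};

// Pin a thread to the next core in the cycle 0-1-2-3-0-1-2-3...
// Nothing rebalances when a thread dies, hence the imbalance I mentioned.
void pinRoundRobin(std::thread& t) {
    unsigned core = next_core++ % kCoreCount;
    SetThreadAffinityMask(t.native_handle(), DWORD_PTR(1) << core);
}

int main() {
    std::thread worker([] { /* per-entity game logic */ });
    pinRoundRobin(worker);
    worker.join();
}
```

Five minutes of recoding, more or less: one counter, one affinity call per thread, and no attempt to track which cores actually have live threads on them.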