To be fair, a few games do scale really well (Bad Company 2 is a perfect example).
I blame a lot of the lack of decent threading across multiple CPUs on a few factors, all CPU-independent:
1: Creating a thread is CPU heavy
2: Few programming languages are designed with multiple-CPU interaction in mind, so they aren't optimized for it
3: Windows' scheduler, for memory/performance reasons, typically likes to allocate different threads from the same process to the same CPU/core (and since EVERYTHING in Windows inherits from a few .sys and DLL files that are active at Windows startup, the scheduler thus schedules almost every thread for the first CPU/core available)
So the OS, memory management model, programming languages, and finally, the programmer are the biggest reasons massively parallel CPUs won't take off.
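
For anyone who wants to check point 1 on their own machine, here is a rough sketch in Python that compares starting a fresh thread for every tiny job against reusing a small pool of threads. The function names (tiny_task, run_with_new_threads, run_with_pool, timed) and the job counts are made up for illustration, and the timings will vary by OS and hardware. Note that in CPython the GIL keeps these threads from running Python code in parallel, so this only shows the start-up overhead, not multi-core scaling.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def tiny_task(n):
    # Deliberately small unit of work so thread start-up cost dominates.
    return n * n


def run_with_new_threads(count):
    """Start one fresh thread per task and wait for all of them."""
    results = [None] * count

    def worker(i):
        results[i] = tiny_task(i)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def run_with_pool(count, workers=4):
    """Reuse a fixed set of threads for all tasks (worker count is an assumption)."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(tiny_task, range(count)))


def timed(fn, *args):
    """Return (result, elapsed_seconds) for a single call."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    n = 2000  # arbitrary job count for the comparison
    _, per_thread = timed(run_with_new_threads, n)
    _, pooled = timed(run_with_pool, n)
    print(f"new thread per task: {per_thread:.4f} s")
    print(f"reused thread pool:  {pooled:.4f} s")
```

Both functions return the same results, so any difference in the two printed times comes from how the threads are created and reused rather than from the work itself.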