Don't worry about it!
What you see in Task Manager is not the whole story. It shows the processes that are alive in the system, not the threads. Each process can have multiple threads. You can have it show the number of threads per process by selecting View->Select Columns and enabling the thread count. I don't think it is possible to see what each thread is doing unless (like me) you are a programmer using development tools to do so.
Windows may be putting most processes on one CPU because it is a lot more efficient to do so. Windows programs wait for some kind of message from Windows that something occurred, like a mouse click. Windows then passes those messages to the application's main thread. There are very few application main threads alive in your system. The vast majority of processes you see are services and host processes for the DLLs, COM objects, etc. that the applications use to perform various tasks. You want them to be on one CPU to make it a lot more efficient for the main threads to call them.
Each main thread (within a process) can also spawn other threads to do various things. For example, a game can spawn another thread to load resources from the hard drive or play the background music. Word uses another thread to do its spell-checking while you type. Photoshop may launch several threads to apply various filters to an image. One of the tests that THG shows is a 3DS MAX operation that uses multiple threads effectively, where the number of CPUs matters more than the clock speed of the CPU (the Q6600 is faster than the E6850).
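To make the "game loads resources on another thread" idea concrete, here is a minimal sketch in Python. The function names, the file names and the sleep that stands in for a disk read are all made up for illustration; a real game would do this through its own engine and OS calls.

```python
import threading
import time


def load_resources(names, loaded):
    """Pretend to read each resource from disk and record it (hypothetical)."""
    for name in names:
        time.sleep(0.01)  # stand-in for a slow disk read
        loaded.append(name)


def run_with_background_loader(names, frames=5):
    """Main thread keeps 'drawing' while a worker thread loads resources."""
    loaded = []
    loader = threading.Thread(target=load_resources, args=(names, loaded))
    loader.start()  # the main thread does not wait here

    frames_drawn = 0
    for _ in range(frames):
        frames_drawn += 1  # stand-in for handling input and drawing a frame

    loader.join()  # wait for the loader before using the data
    return frames_drawn, loaded
```

Calling run_with_background_loader(["music.ogg", "level1.dat"]) returns the frame count together with the list of loaded names. The point is only that the main thread stays free to do its own work until it actually needs the results.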
So, having a thread running on another CPU is really only valuable when an application process uses multiple threads to do several things at once. The system can then schedule those threads on the other CPUs so they all run concurrently. The longer the operation each thread performs, the bigger the performance gain.
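As a rough sketch of "several threads doing several things at once", the snippet below splits an image into tiles and hands each tile to a worker thread. The brighten filter and the tile layout are invented for the example. One caveat: in standard CPython, pure-Python number crunching like this does not really run on several cores at the same time, so treat it as a picture of the structure rather than a benchmark.

```python
from concurrent.futures import ThreadPoolExecutor


def brighten(tile, amount=10):
    """Hypothetical filter: add `amount` to every pixel, capped at 255."""
    return [min(255, pixel + amount) for pixel in tile]


def filter_image(tiles, workers=4):
    """Apply the filter to every tile using a small pool of worker threads."""
    # One task per tile; map() hands results back in the original tile order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(brighten, tiles))
```

With two tiles, filter_image([[0, 250], [100, 200]]) gives [[10, 255], [110, 210]]. In a native application each worker could sit on its own CPU, which is where a quad core earns its keep.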
The benefit to games is actually limited. Only one thread can receive user input. The reason DirectX 10 is not doing really well is that it assumes you will use multiple threads to access the various buffers, so it performs system locks on those buffers whenever they are accessed.
With DX9, creating the DX device for use by more than one thread was optional. When it was created for one thread (the default), no locking was necessary to update the various DirectX-managed buffers.
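The same trade-off can be sketched outside DirectX. In Direct3D 9 the opt-in is the D3DCREATE_MULTITHREADED device creation flag; the Python class below is only an analogy I made up, not how the runtime is implemented. It takes a real lock on every update only when asked to be thread-safe, and otherwise skips locking completely.

```python
import threading
from contextlib import nullcontext


class ManagedBuffer:
    """Hypothetical stand-in for a buffer owned by a graphics device."""

    def __init__(self, size, multithreaded=False):
        self.data = [0] * size
        # Single-threaded mode (the default) uses a do-nothing guard.
        self._guard = threading.Lock() if multithreaded else nullcontext()

    def update(self, index, value):
        # A real lock is acquired and released only in multithreaded mode.
        with self._guard:
            self.data[index] = value
```

Both modes end up with the same data. The difference is that the multithreaded one pays for an acquire and a release on every single update, and that adds up when you touch buffers thousands of times per frame.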
When you want the game to perform at 40+ FPS, you gotta weigh the cost of managing multiple threads versus the benefit of doing several operations concurrently. It is quite a challenge. This is why applications that perform long operations benefit the most from multiple CPUs. When you have only 25 milliseconds to render each frame, the cost of managing multiple threads becomes significant. This is why DX10 games are having issues with performance.
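The arithmetic behind that 25 millisecond figure, plus a very crude way to weigh thread overhead against it, can be written down as follows. The even split of work and the fixed overhead per frame are simplifying assumptions of mine, and the numbers in the usage note are invented, not measurements.

```python
def frame_budget_ms(fps):
    """Time available per frame, in milliseconds, at a target frame rate."""
    return 1000.0 / fps


def threading_pays_off(work_ms, threads, overhead_ms):
    """True if splitting the work evenly across `threads` beats one thread.

    Assumes a perfectly even split plus a fixed cost (overhead_ms) for
    creating, synchronizing and locking between the threads.
    """
    threaded_ms = work_ms / threads + overhead_ms
    return threaded_ms < work_ms
```

frame_budget_ms(40) comes out to 25.0. Under this model, a 20 ms job split over 2 threads with 2 ms of overhead is a win (12 ms), while a 2 ms job with 1.5 ms of overhead is a loss (2.5 ms), which is the short-operation problem games run into.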
I know that Intel is working to improve the performance of their architecture and microcode for multi-threading. Eventually the cost of creating and managing multiple threads will come down. It may require changes to both the chipset and CPU architectures in addition to the microcode. If so, it may also require MS to redo their compilers to use any new features.
For now, most people will benefit more from an E6850 than a Q6600.