The short answer is it depends...
If the app is correctly written to use multiple threads, the kernel can schedule those threads across the available cores; if it's been written as a single thread (yuck!), the kernel has nothing to spread around.
Traditionally, Linux has been far less thread-oriented than Windows, so for a single application I doubt you'll find the performance gains any better on Linux than on Windows. If anything, I'd expect the opposite.
I only have a quad-core processor, but as far as I'm concerned the benefit isn't letting a single program run more efficiently; it's letting a number of programs run at the same time, each getting a decent share of processor time. In that respect you'll get better performance from Linux, because its design philosophy favours many small programs communicating with each other over one monolithic program.
I've been fiddling a bit with transcoding DVDs and was surprised to find that this workload loads both cores on a Core 2 Duo and an Athlon X2. I don't know how far it scales past two cores. You could look into the apps you have in mind to get more specific answers.
Apart from transcoding, the only other workload I'm familiar with that uses multiple cores is big software builds, like kernel compiles. I haven't done one in a long time, but back when I ran a system with dual Pentium Pros, the right command-line options when building a kernel would load up both cores.
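The usual way to do that is make's `-j` flag, which runs that many compile jobs in parallel. A quick sketch (the actual kernel build is commented out, since it assumes a configured kernel source tree; `nproc` is from GNU coreutils):

```shell
# nproc reports how many cores the kernel will let this process use
nproc

# In a configured kernel source tree, run one compile job per core:
# make -j"$(nproc)"
```

Back in the dual Pentium Pro days that would have been `make -j2`; the principle hasn't changed.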
3D rendering also scales very well, ray tracing in particular: every ray is completely independent of the others, so no thread ever has to wait for another thread to finish before continuing.