I've made my stance on this very clear in the PD forum. For a single application, you typically won't see scaling beyond more than a handful of cores, due to a variety of factors.
We don't know what stance you took there. But you can use as many threads as there are processes within the program that need to be computed at once. That doesn't mean it will be faster, though. At least that's what I've read.
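To put rough numbers on why more threads does not automatically mean faster, here is a minimal sketch using Amdahl's law, which is one common way to model this and not necessarily one of the factors the first post had in mind. The function name `amdahl_speedup` and the 75% parallel fraction are assumptions picked purely for illustration.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Theoretical speedup when only part of a program can run in parallel.

    parallel_fraction is the share of the runtime that can be spread
    across cores (hypothetical value, not measured from any real program).
    """
    serial_fraction = 1.0 - parallel_fraction
    # The serial part always takes the same time; only the parallel part shrinks.
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Assumed example: 75% of the work can run in parallel.
    for cores in (1, 2, 4, 8, 16, 64):
        print(f"{cores:>2} cores: {amdahl_speedup(0.75, cores):.2f}x")
```

Under that assumption the speedup can never reach 4x no matter how many cores you add, because the remaining 25% still runs on one core. Real programs also pay for synchronization and memory contention, which this simple model ignores.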