I think it's going to be quite workload-dependent. For instance, DSP code that already has wide instruction-level parallelism and is tuned to make effective use of the native hardware might even be a bit slower, due to increased cache misses from their scheduler chopping up threads and migrating them around.
That said, I think it's clever to try to intelligently pair complementary threads on SMT cores. I think Intel, AMD, ARM, etc. should try to add some analysis capabilities to their hardware, to enable OS thread schedulers to do something similar. Of course, I don't expect to see them chop threads into threadlets, but compilers could certainly do that.
In other words, they have some neat ideas, but they're all implementable without the need for a "Virtual Core" abstraction layer. In fact, the biggest benefit of the "Virtual Core" construct is that it lets them do these things on existing CPUs. So, I was a bit surprised to see them building custom silicon. Perhaps they're going to do something radical, like a transport-triggered architecture: something really wide, simple, and highly dependent on good profile-driven optimization.
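To make the thread-pairing idea a bit more concrete, here's a rough sketch of what an OS scheduler could do if the hardware exposed per-thread utilization of shared execution resources. Everything in it is an assumption for illustration: the thread names, the resource names, the utilization numbers, and the greedy heuristic are all made up, and this is not how their scheduler (or any real one) is known to work.

```python
from itertools import combinations

# Hypothetical per-thread profiles: fraction of cycles each thread keeps a
# shared execution resource busy (0.0 to 1.0). All values are invented.
PROFILES = {
    "dsp_filter": {"int_alu": 0.1, "fp_simd": 0.9, "load_store": 0.3},
    "parser":     {"int_alu": 0.8, "fp_simd": 0.0, "load_store": 0.4},
    "fft":        {"int_alu": 0.3, "fp_simd": 0.8, "load_store": 0.5},
    "hash_join":  {"int_alu": 0.5, "fp_simd": 0.0, "load_store": 0.9},
}


def contention(a, b):
    """Total demand beyond 100% across the resources two SMT siblings share."""
    return sum(max(0.0, a[r] + b[r] - 1.0) for r in a)


def pair_threads(profiles):
    """Greedily co-schedule the least-contending pair until fewer than two remain.

    Greedy pairing is simple but not guaranteed to minimize total contention.
    """
    unpaired = set(profiles)
    pairs = []
    while len(unpaired) >= 2:
        best = min(
            combinations(sorted(unpaired), 2),
            key=lambda p: contention(profiles[p[0]], profiles[p[1]]),
        )
        pairs.append(best)
        unpaired -= set(best)
    return pairs


if __name__ == "__main__":
    for a, b in pair_threads(PROFILES):
        print(f"{a} + {b}: contention {contention(PROFILES[a], PROFILES[b]):.2f}")
```

With these made-up numbers, the FP-heavy filter ends up sharing a core with the integer-heavy parser, which is the "complementary" pairing I mean. Note that the already well-tuned DSP thread gains nothing itself here; it just gets a sibling that stays out of its way.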