The FLOPS listed are DP, and that's up 8x, plus 2.13x more shaders, I think, plus all the error correction and the overall approach, and the much improved, larger cache.
All that, and it still cannot run a large portion of the C++, Fortran and C libraries.
Intel took a slightly different approach with Larrabee, going MIMD rather than SIMD. That is to say, Larrabee contains full x86 cores, each capable of handling Multiple Instruction streams and Multiple Data streams. Each core can therefore do multiple things at once (vs. SIMD, which can only handle a Single Instruction stream per core).
And if you want to push it further, RV870 is a superscalar SIMD design (capable of multiple threads per instruction).
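To make the SIMD vs. MIMD distinction a bit more concrete, here is a toy Python sketch. It is only a conceptual model of the two terms, not a description of how Larrabee or RV870 actually schedule work; the names `simd_step` and `mimd_step` are made up for illustration.

```python
# Toy model only: real hardware runs the lanes in parallel, not in a Python loop.

def simd_step(instruction, lanes):
    """SIMD: a single instruction stream applied to every data lane."""
    return [instruction(x) for x in lanes]


def mimd_step(instructions, lanes):
    """MIMD: each lane has its own instruction stream and its own data."""
    return [instruction(x) for instruction, x in zip(instructions, lanes)]


def double(x):
    return x * 2


def negate(x):
    return -x


def square(x):
    return x * x


data = [1, 2, 3]

simd_result = simd_step(double, data)                    # same op on every lane
mimd_result = mimd_step([double, negate, square], data)  # different op per lane

print(simd_result)
print(mimd_result)
```

In the SIMD case, every lane has to run the same operation; in the MIMD case, each lane is free to run something different.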
Fermi is nothing fancy at all (to be quite honest). It will output around 1.25-1.5 TFLOPS of performance in SP and around 600-677 GFLOPS in DP (RV870 can handle 2.72 TFLOPS in SP and 544 GFLOPS in DP).
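For anyone who wants to check the arithmetic, a quick Python sketch of the ratios implied by those figures. The numbers are the ones quoted above (the Fermi ones are estimates, not measurements), and pairing the low SP estimate with the low DP estimate (and high with high) is my own assumption.

```python
# Peak throughput figures as quoted in the post, in GFLOPS.
# The Fermi values are estimated ranges, not measured results.
fermi_sp = (1250.0, 1500.0)   # low and high SP estimates
fermi_dp = (600.0, 677.0)     # low and high DP estimates
rv870_sp = 2720.0
rv870_dp = 544.0


def ratio(a, b):
    """Return a / b, used here to compare two peak throughput figures."""
    return a / b


# Fraction of SP peak that each chip keeps in DP.
rv870_dp_fraction = ratio(rv870_dp, rv870_sp)
fermi_dp_fraction_low = ratio(fermi_dp[0], fermi_sp[0])    # assumes low pairs with low
fermi_dp_fraction_high = ratio(fermi_dp[1], fermi_sp[1])   # assumes high pairs with high

# Head-to-head comparisons.
fermi_dp_lead = (ratio(fermi_dp[0], rv870_dp), ratio(fermi_dp[1], rv870_dp))
rv870_sp_lead = (ratio(rv870_sp, fermi_sp[1]), ratio(rv870_sp, fermi_sp[0]))

print(f"RV870 DP/SP: {rv870_dp_fraction:.2f}")
print(f"Fermi DP/SP: {fermi_dp_fraction_high:.2f} to {fermi_dp_fraction_low:.2f}")
print(f"Fermi DP lead over RV870: {fermi_dp_lead[0]:.2f}x to {fermi_dp_lead[1]:.2f}x")
print(f"RV870 SP lead over Fermi: {rv870_sp_lead[0]:.2f}x to {rv870_sp_lead[1]:.2f}x")
```

On these figures, RV870 keeps about a fifth of its SP rate in DP while Fermi keeps a bit under half, so Fermi's DP lead works out to roughly 10-25% while RV870 has around twice the SP peak.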
Fermi isn't as giant of a leap forward as some make it out to be.