You may remember my article from last year's Microprocessor Forum, 'AMD moves onto the overtaking lane'. In that article I already postulated that K7, or Athlon, would have a significantly faster FPU than any Intel CPU. Of course I was criticized for that, since criticizing me seems to be one of the rather popular things to do nowadays. Nevertheless, Athlon has shown that its FPU is one of its strongest parts. Let's check out why.
The number one reason why Athlon can play in the same ballpark as the Intel CPUs is that Athlon's FPU is now fully pipelined, unlike the unpipelined FPU of K6, K6-2 and K6-3. That's not all, however. Athlon has three parallel FP execution units and, as we know from above, all three can be fed at the same time, since each of them has its own port. Pentium III also has three FP execution units, but unfortunately they all sit behind one port. What is so great about the Athlon FPU is that it can execute two 80-bit extended-precision operations per clock to Intel's one.
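To picture why the ports matter, here is a toy model. It assumes that every port can start one independent FP operation per clock and that the code always has enough independent work to keep each port busy. The function name `issue_cycles` and the operation count are made up for illustration; this is a rough sketch of the idea and not a description of how either CPU's scheduler really works.

```python
import math

def issue_cycles(num_ops, ports):
    """Clocks needed just to start num_ops independent FP operations
    when at most `ports` operations can begin per clock (toy model)."""
    if ports < 1:
        raise ValueError("need at least one issue port")
    return math.ceil(num_ops / ports)

# Hypothetical workload of 300 independent FP operations.
three_ports = issue_cycles(300, ports=3)  # three units, one port each
one_port = issue_cycles(300, ports=1)     # three units behind a shared port

print(three_ports, one_port)
```

Real code rarely offers a perfect mix of operations for every unit, so treat the model as an upper bound on what separate ports can buy you.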
I can still remember the old discussion when K6 came out. People claimed that K6's FPU wouldn't be bad at all, since it had a lower latency than PII's. This was right and wrong at the same time. The latency of many K6 FPU instructions is indeed lower than that of Pentium Pro and PII, but that is not good enough without pipelining, especially in software written for Intel CPUs. Athlon's FPU also has an average latency that's lower than PIII's. The result is that with the lower latency, the pipelined FPU and the three ports, Athlon can score significantly higher with its FPU than PIII can, which you will see particularly nicely in the 3D Studio Max render time results. This benchmark used to be Intel's domain, but that time is over now. Anyway, the FPU earns Athlon performance advantage point No. 4 for its speed and No. 5 for beating Intel on its favorite battleground.
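The latency versus pipelining argument is easy to check with some back-of-the-envelope arithmetic. The sketch below assumes a stream of independent operations and uses invented latencies (2 clocks for the unpipelined unit, 3 clocks for the pipelined one); these are not measured figures for any real CPU, and the function names are hypothetical.

```python
def unpipelined_cycles(num_ops, latency):
    """Unpipelined unit: the next operation can only start once the
    previous one has completely finished."""
    return num_ops * latency

def pipelined_cycles(num_ops, latency):
    """Fully pipelined unit: a new independent operation starts every
    clock, and the last one finishes `latency` clocks after it starts."""
    if num_ops == 0:
        return 0
    return latency + (num_ops - 1)

# Invented example: 100 independent operations.
low_latency_no_pipeline = unpipelined_cycles(100, latency=2)
higher_latency_pipelined = pipelined_cycles(100, latency=3)

print(low_latency_no_pipeline, higher_latency_pipelined)
```

Under these assumptions the pipelined unit finishes in roughly half the time despite its higher per-instruction latency. Code that interleaves independent operations to suit a pipelined FPU hands exactly this kind of stream to the hardware, which is where an unpipelined unit falls behind.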