Final Recount: Pentium 4 vs. Athlon

Benchmark Results

FlasK MPEG Once More

I'd like to start by picking up where the last P4-Update article left off. You will remember that Intel supplied us with a special version of FlasK MPEG, the free video compression software by Alberto Vigatá, that included optimizations for Pentium 4. Intel had made those optimizations in a single night, mainly by recompiling the code. With the optimized code, Pentium 4 was able to leave Athlon behind once the software used Pentium 4's new SSE2 instructions. However, even the newly compiled x87 IDCT performed better on Pentium 4 than on Athlon, which was very surprising. Usually nothing can touch Athlon's floating-point performance, especially when the common x87 FPU is used.

Of course, a comparison between Pentium 4 and Athlon that uses software optimized only by Intel cannot provide a fair picture of the two processors. That's why I had asked AMD to provide me with a FlasK version optimized for Athlon. Alexander Goodrich and Sean Stanek, two software engineers involved with AMD optimizations, had already started working on this shortly after the publication of my last P4 article. The code that I finally ended up using comes from the two of them, but it has been looked over by AMD. I was told that AMD would try to provide me with an even more optimized version this week, but that Alex's and Sean's FlasK optimizations were already too good for AMD's engineers to improve on in a short time. I will publish results with AMD's final FlasK version once it becomes available. For the time being we should thank Alex and Sean for their great achievement: they quickly delivered an Athlon-optimized version of FlasK that AMD was not able to enhance any further within the last week.

The new results show that the Pentium 4 at 1.5 GHz is still in the lead over the Athlon 1200/133 with DDR memory. While the performance of Athlon 1200 and P4 1500 is close to identical with the optimized x87 executable, Pentium 4 using its new SSE2 instructions is way faster than Athlon using 3DNow!. This is rather surprising, since most of us would probably have expected Athlon to pull away from Pentium 4.

Comparing Pentium 4 and Athlon on a clock-for-clock basis (Pentium 4 at 1.5 GHz vs. Athlon at 1.466 GHz) shows, however, that Athlon still runs x87-optimized code faster than Pentium 4. What this comparison also shows is that in FlasK, SSE2 is clearly superior to 3DNow!. Please keep in mind that this example covers FlasK only. There may well be other software that would run a lot better on Athlon than on Pentium 4 once the code is optimized for each.

The bottom line is that Pentium 4 is indeed able to deliver excellent performance once software has been optimized for it. FlasK is an example showing that Athlon cannot always beat Pentium 4, even when the software has special Athlon optimizations. My guess as to why Athlon cannot beat Pentium 4 in this test is that Athlon runs into bandwidth limitations with this streaming benchmark. In the case of MPEG-4 encoding, Pentium 4's fast quad-pumped 100 MHz bus plus the dual-channel Rambus memory access of i850 seems superior to Athlon's dual-pumped 133 MHz bus and the 133 MHz DDR-SDRAM memory solution of AMD760. That would explain why even the best optimizations can't give Athlon enough of a boost to overtake Pentium 4.
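To put rough numbers on that bandwidth argument, here is a minimal sketch that works out the theoretical peak transfer rates of the two platforms. The clock speeds and pumping factors are the ones quoted above. The 64-bit (8-byte) bus width and the PC800 RDRAM figures (400 MHz, two transfers per clock, 16 bits per channel) are my assumptions and are not stated in this article. The function name is purely illustrative, and these are peak figures on paper, not measured throughput.

```python
def peak_bandwidth_mb_s(clock_mhz, transfers_per_clock, width_bytes, channels=1):
    """Theoretical peak bandwidth in MB/s: clock x transfers/clock x width x channels."""
    return clock_mhz * transfers_per_clock * width_bytes * channels

# Pentium 4 / i850: quad-pumped 100 MHz bus, assumed 64 bits (8 bytes) wide.
p4_bus = peak_bandwidth_mb_s(100, 4, 8)

# Dual-channel Rambus, assuming PC800 RDRAM: 400 MHz, 2 transfers per clock,
# 16 bits (2 bytes) per channel, two channels.
p4_memory = peak_bandwidth_mb_s(400, 2, 2, channels=2)

# Athlon / AMD760: dual-pumped 133 MHz bus, assumed 64 bits (8 bytes) wide.
athlon_bus = peak_bandwidth_mb_s(133, 2, 8)

# 133 MHz DDR-SDRAM: 2 transfers per clock, assumed 64 bits (8 bytes) wide.
athlon_memory = peak_bandwidth_mb_s(133, 2, 8)

# How much more peak bus bandwidth Pentium 4 has on paper.
bus_ratio = p4_bus / athlon_bus

print(f"Pentium 4 bus:    {p4_bus} MB/s")
print(f"Pentium 4 memory: {p4_memory} MB/s")
print(f"Athlon bus:       {athlon_bus} MB/s")
print(f"Athlon memory:    {athlon_memory} MB/s")
print(f"Bus ratio P4/Athlon: {bus_ratio:.2f}")
```

Under these assumptions Pentium 4 has roughly 50 percent more peak bus and memory bandwidth than Athlon, which fits the picture of a streaming workload favoring the i850 platform. It does not prove that bandwidth is the actual bottleneck in FlasK.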

BAPCo Sysmark 2000 Under Windows 2000 Professional

Sysmark 2000 from BAPCo is beginning to show its age, but it is still a good benchmark for current office application performance. We know that Athlon leaves Pentium 4 far behind in Sysmark 2000 under Windows 98, so it is not very likely that things will look much different under Windows 2000.

Indeed, Athlon beats Pentium 4, but the difference is not quite as huge as it was under Windows 98. The overclocked Athlons at 1400/133 and 1466/133 show how high Sysmark 2000 scores can actually go. They leave everything else in the dust. Pentium 4 can't beat Athlon 1200/133 with DDR memory even when it is overclocked to 1728 MHz. It is obvious that office applications are not able to take any advantage of Pentium 4's architecture right now. I find it questionable that those applications could benefit from Pentium 4 optimization at all, which is why I doubt that the above picture will change much in the future.