The First Level Cache
Well, what can I say, Athlon really kicks butt with its L1 caches. At 64 kB each, the instruction cache and the data cache are no less than four times as big as Pentium III's. This will make sure that Athlon scales beautifully with clock speed, so that its pipeline and all the execution ports can be fed just as nicely at 1 GHz as at 600 MHz. I can't help it, but the L1 cache earns Athlon at least one more performance point, which is No. 6 if I'm not mistaken.
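To get a feel for why a four times larger L1 matters, here is a rough sketch of a loop that sweeps repeatedly over a working set too big for a 16 kB cache but comfortable in a 64 kB one. Everything about the model is an assumption made for illustration: a fully associative LRU cache, 64-byte lines, a 32 kB working set and the helper name `hit_rate`. It does not reproduce the real cache organization of either CPU.

```python
from collections import OrderedDict

KB = 1024


def hit_rate(cache_bytes, working_set_bytes, line_bytes=64, passes=4):
    """Hit rate of a toy fully associative LRU cache (an assumed model)
    when a loop sweeps the same working set `passes` times."""
    capacity = cache_bytes // line_bytes      # number of lines the cache holds
    cache = OrderedDict()                     # line number -> True, oldest first
    hits = accesses = 0
    for _ in range(passes):
        for addr in range(0, working_set_bytes, line_bytes):
            line = addr // line_bytes
            accesses += 1
            if line in cache:
                hits += 1
                cache.move_to_end(line)       # mark as most recently used
            else:
                cache[line] = True
                if len(cache) > capacity:
                    cache.popitem(last=False)  # evict least recently used line
    return hits / accesses


small = hit_rate(16 * KB, 32 * KB)   # 16 kB cache: the sweep thrashes it
large = hit_rate(64 * KB, 32 * KB)   # 64 kB cache: only the first pass misses
print(f"16 kB: {small:.0%} hits, 64 kB: {large:.0%} hits")
```

In this toy model the small cache never hits, because each line is evicted before the loop comes back to it, while the large cache misses only on the first pass. Real code is rarely this extreme, but the faster the core runs, the more each of those misses costs in wasted clock cycles.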
Memory Streaming And Write Combining
You may remember Intel's introduction of Pentium III. One of the new features of PIII was the 'streaming instructions'. Those instructions are implemented in Athlon as well. Athlon's 64-byte deep write buffer plus the five streaming instructions make sure that Athlon is also able to pre-fetch data in a defined way from, into or around the two data caches (L1 and L2), and that it can write directly to memory, bypassing the caches as well.
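Why would you want to write around the caches? The sketch below is one way to picture it, under loudly stated assumptions: a toy LRU cache of 1024 lines, a 'hot' working set of 512 lines, and a 2048-line output buffer that is written once and never read back. The class `Cache` and the function `reread_misses` are hypothetical names, and the model ignores everything else a real CPU does.

```python
from collections import OrderedDict


class Cache:
    """Toy LRU cache that only tracks which lines are resident (assumed model)."""

    def __init__(self, lines):
        self.lines = lines
        self.data = OrderedDict()
        self.misses = 0

    def access(self, line):
        if line in self.data:
            self.data.move_to_end(line)        # hit: refresh LRU position
        else:
            self.misses += 1
            self.data[line] = True             # miss: allocate the line
            if len(self.data) > self.lines:
                self.data.popitem(last=False)  # evict least recently used


def reread_misses(streaming):
    """Misses on the hot data after a large output buffer has been written."""
    cache = Cache(lines=1024)
    hot = range(0, 512)                  # data the program keeps reusing
    output = range(10_000, 12_048)       # 2048 lines written exactly once
    for line in hot:
        cache.access(line)               # warm the cache with the hot data
    for line in output:
        if not streaming:
            cache.access(line)           # ordinary store: line lands in cache
        # streaming store: goes straight to memory, cache is left untouched
    cache.misses = 0
    for line in hot:
        cache.access(line)               # come back to the hot data
    return cache.misses


print("ordinary stores:", reread_misses(streaming=False), "misses")
print("streaming stores:", reread_misses(streaming=True), "misses")
```

With ordinary stores the output buffer pushes the whole hot set out of the toy cache, so every line has to be fetched again. With streaming stores the hot set stays put, which is the point of writing around the caches.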
Another nice addition is Athlon's Memory Type Range Registers, which are now compatible with Intel's P6 architecture as well. This enables write combining, which is particularly useful for writes to the graphics card's frame buffer, as those of you who used 'fastvid' in the earlier days of Pentium Pro may remember. K6-2 CXT and K6-3 were also able to do write combining, but it took special programming of the video card driver to make use of it. Those days are over with Athlon. Still, NT 4 needs a special DLL file to enable write combining, and this DLL is supposed to be included in Service Pack 6 very soon.
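For those who never played with 'fastvid', here is a minimal sketch of what write combining buys you when filling a frame buffer. It assumes a single combining buffer of 64 bytes (borrowing the write buffer size quoted above) that is flushed as one burst whenever a write falls outside the block currently being collected. The function name `bus_transactions` and the 640-pixel, 32-bit scanline are made up for the example, and real hardware has more buffers and more flush conditions than this.

```python
def bus_transactions(addresses, combine, buffer_bytes=64):
    """Count bus writes needed for a sequence of store addresses.

    Without combining, every store is its own bus transaction. With
    combining, stores to the same aligned block are merged and sent as
    one burst (simplified, assumed model with a single buffer).
    """
    if not combine:
        return len(addresses)
    transactions = 0
    open_block = None                      # block currently being collected
    for addr in addresses:
        block = addr // buffer_bytes
        if block != open_block:
            if open_block is not None:
                transactions += 1          # flush the previous block as a burst
            open_block = block
    if open_block is not None:
        transactions += 1                  # flush whatever is left at the end
    return transactions


# One 640-pixel scanline at 4 bytes per pixel, written as 32-bit stores.
scanline = list(range(0, 640 * 4, 4))
uncombined = bus_transactions(scanline, combine=False)
combined = bus_transactions(scanline, combine=True)
print(uncombined, "single writes versus", combined, "bursts")
```

In this model the 640 individual stores collapse into 40 bursts of 64 bytes each, which is why sequential frame buffer writes benefit so much once the memory range is marked as write-combining.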