Intel's New Weapon: Pentium 4 Prescott

More Cache: 1 MB L2 And 16 kB L1/Data

Thanks to the smaller 90 nm manufacturing process, Intel was easily able to increase the L2 cache size. Instead of Northwood's 512 kB, Prescott now has 1 MB at its disposal. Despite the higher transistor count, the die size dropped from 146 mm² to 112 mm². At 3.4 GHz (the 3.4E model), Prescott has a maximum cache bandwidth of 108 GB/s.
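As a rough plausibility check on the bandwidth figure, peak cache bandwidth can be estimated as clock rate times bytes transferred per clock. The sketch below assumes a 256-bit (32-byte) L2 interface that delivers one transfer per core clock; that width is our assumption and is not stated above, and the function name is made up for illustration.

```python
def cache_bandwidth_gb_s(clock_ghz: float, bytes_per_clock: int = 32) -> float:
    """Estimate peak cache bandwidth in GB/s.

    bytes_per_clock=32 is an assumed 256-bit interface, one transfer per clock.
    GHz times bytes per clock gives GB/s directly (1 GB = 10^9 bytes here).
    """
    return clock_ghz * bytes_per_clock


# Prescott at 3.4 GHz under the assumed interface width
prescott_bw = cache_bandwidth_gb_s(3.4)
print(f"{prescott_bw:.1f} GB/s")
```

Under these assumptions the estimate comes out at roughly 108.8 GB/s, in line with the 108 GB/s quoted.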

Additionally, Intel doubled the L1 data cache from 8 kB to 16 kB. Recall that when Intel launched the Pentium 4 Willamette in 2000, it cut the L1 data cache to 8 kB in order to keep the latency at two clock cycles. Slower cache access would have widened the performance gap with the Pentium III even further. Fast caches remain very important today, since both AGUs (address generation units) need to access the L1 data cache frequently.

More Instructions: SSE3

After Intel's success with the Pentium 4's SSE2 instruction set (Streaming SIMD Extensions, 144 instructions), SSE3 is meant as a response to requests from the big software companies. This time, there are only 13 new instructions to make the programmer's life easier:

  • fisttp: fp to int conversion
  • addsubps, addsubpd, movsldup, movshdup, movddup: complex arithmetic
  • lddqu: video encoding
  • haddps, hsubps, haddpd, hsubpd: graphics (SIMD FP / AOS)
  • monitor, mwait: thread synchronization
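To give a feel for what the arithmetic instructions in this list do, here is a minimal scalar sketch that models movsldup, movshdup, addsubps and haddps on four-element lists, then combines them into a complex multiply and a dot product. It is an emulation for illustration only, not real SIMD code: the helper names mul, complex_mul and dot4 are our own, the lanes are ordinary Python floats (double precision), and the re/im swap would be a separate shuffle instruction in real code.

```python
# Scalar models of a few SSE3 instructions. Each "register" is a list of
# four lanes, lane 0 first. Illustrative only; real code would use
# assembly or compiler intrinsics on single-precision data.

def movsldup(a):
    """Duplicate the even lanes: [a0, a0, a2, a2]."""
    return [a[0], a[0], a[2], a[2]]

def movshdup(a):
    """Duplicate the odd lanes: [a1, a1, a3, a3]."""
    return [a[1], a[1], a[3], a[3]]

def addsubps(a, b):
    """Subtract in even lanes, add in odd lanes."""
    return [a[0] - b[0], a[1] + b[1], a[2] - b[2], a[3] + b[3]]

def haddps(a, b):
    """Horizontal add: sums of adjacent pairs of a, then of b."""
    return [a[0] + a[1], a[2] + a[3], b[0] + b[1], b[2] + b[3]]

def mul(a, b):
    """Ordinary lane-wise multiply (hypothetical helper, not an SSE3 op)."""
    return [x * y for x, y in zip(a, b)]

def complex_mul(x, y):
    """Multiply two pairs of complex numbers stored as [re0, im0, re1, im1]."""
    y_swapped = [y[1], y[0], y[3], y[2]]  # swap re/im within each pair
    # even lanes: xr*yr - xi*yi (real part); odd lanes: xr*yi + xi*yr (imaginary)
    return addsubps(mul(movsldup(x), y), mul(movshdup(x), y_swapped))

def dot4(a, b):
    """Four-element dot product using two horizontal adds."""
    p = mul(a, b)
    s = haddps(p, p)        # [p0+p1, p2+p3, p0+p1, p2+p3]
    return haddps(s, s)[0]  # every lane now holds the full sum

# (1+2i)*(3+4i) = -5+10i and i*i = -1
product = complex_mul([1.0, 2.0, 0.0, 1.0], [3.0, 4.0, 0.0, 1.0])
dot = dot4([1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0])
print(product, dot)
```

The point of the sketch is that the mixed add/subtract and the pairwise sums each map to a single instruction, where earlier code needed extra shuffles and sign flips to get the same result.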

It remains to be seen how much of a difference SSE3 will make in practice. We have already come across one application that has been tuned for SSE3: MainConcept MPEG Encoder 1.4.1 (see benchmark section).