Duel of the Titans: Opteron vs. Xeon

Details On The Opteron Core: An Enhanced Athlon, Continued

The really cool stuff is in the details of the chip. At the heart of the CPU is a crossbar switch (XBAR), which manages the data streams between the memory controller, the CPU core and the three HyperTransport ports. Unlike the Athlon 64, which is only meant for single-processor operation, the Opteron has controller logic that allows multi-processor operation. Thus, when used in a server, up to eight Opterons can work together without a Northbridge.

The fundamental structure of the Opteron is not much different from the Athlon: the 3 ALUs and 3 AGUs are now capable of 64-bit-wide arithmetic. The caches now have ECC circuitry. A lot more changes can be found in the details.

| CPU core | Hammer | Barton | Thoroughbred "B" |
| --- | --- | --- | --- |
| Wafer surface (200 mm diameter) | 31416 mm² | 31416 mm² | 31416 mm² |
| Die surface | 193 mm² | 101 mm² | 84 mm² |
| Process technology | 0.13 µm | 0.13 µm | 0.13 µm |
| Waste (approx) | 18% | 18% | 18% |
| Yield (theoretical) | 148 units/wafer | 255 units/wafer | 306 units/wafer |
| Yield (at 60% yield rate) | 89 units/wafer | 153 units/wafer | 183 units/wafer |

| CPU core | Thoroughbred "A" | Palomino | Thunderbird |
| --- | --- | --- | --- |
| Wafer surface (200 mm diameter) | 31416 mm² | 31416 mm² | 31416 mm² |
| Die surface | 80 mm² | 128 mm² | 128 mm² |
| Process technology | 0.13 µm | 0.18 µm | 0.18 µm |
| Waste (approx) | 18% | 18% | 18% |
| Yield (theoretical) | 322 units/wafer | 201 units/wafer | 201 units/wafer |
| Yield (at 60% yield rate) | 193 units/wafer | 120 units/wafer | 120 units/wafer |
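
The yield rows appear to follow from simple arithmetic: wafer area, minus the wasted share, divided by die area, then scaled by the yield rate. The sketch below is one way to reproduce that calculation. The function name and parameters are assumptions for illustration, as is rounding down at each step. It matches the Barton, Thoroughbred and Palomino columns. For the 193 mm² Hammer die it gives 133 rather than the 148 listed, so that column may rest on a different assumption.

```python
import math


def dies_per_wafer(die_area_mm2, wafer_diameter_mm=200.0,
                   waste=0.18, yield_percent=100):
    """Estimate usable dies per wafer (hypothetical helper, not from the article).

    waste is the fraction of wafer area assumed lost (edges, scribe lines).
    yield_percent is the share of dies assumed to work, as an integer percent.
    """
    wafer_area = math.pi * (wafer_diameter_mm / 2) ** 2   # about 31416 mm² at 200 mm
    usable_area = wafer_area * (1 - waste)
    theoretical = math.floor(usable_area / die_area_mm2)  # round down to whole dies
    # Integer arithmetic avoids floating-point surprises when scaling by the yield.
    return theoretical * yield_percent // 100


for name, area in [("Barton", 101), ("Thoroughbred B", 84),
                   ("Thoroughbred A", 80), ("Palomino", 128)]:
    print(name, dies_per_wafer(area), dies_per_wafer(area, yield_percent=60))
```

The practical point is that die area dominates: at the same yield rate, a die twice as large produces roughly half as many chips per wafer.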

Compared to the Thoroughbred and Barton cores, the TLBs have lower latency, which in turn increases speed. The branch prediction was also reworked: the History Counter now stores up to 16,000 entries (Athlon XP: 4,000).
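
To see why a larger history table can help, here is a minimal sketch of a generic table of 2-bit saturating counters. It is not AMD's actual predictor design. The class name, the modulo indexing, the power-of-two sizes (4,096 and 16,384 standing in for the 4,000 and 16,000 quoted above) and the two branch addresses are all assumptions chosen for illustration.

```python
class BranchHistoryTable:
    """Generic table of 2-bit saturating counters (illustrative, not AMD's design)."""

    def __init__(self, entries):
        self.entries = entries
        self.counters = [1] * entries  # every counter starts at "weakly not taken"

    def _index(self, address):
        # Assumed indexing scheme: branch address modulo table size.
        return address % self.entries

    def predict(self, address):
        # Counter values 2 and 3 mean "predict taken".
        return self.counters[self._index(address)] >= 2

    def update(self, address, taken):
        i = self._index(address)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


def mispredictions(table, trace):
    """Count wrong predictions over a list of (address, taken) pairs."""
    misses = 0
    for address, taken in trace:
        if table.predict(address) != taken:
            misses += 1
        table.update(address, taken)
    return misses


# Two hypothetical branches: one always taken, one never taken, alternating.
# Their addresses share a slot in a 4,096-entry table but not in a 16,384-entry one.
trace = [(0x1000, True), (0x2000, False)] * 100

print(mispredictions(BranchHistoryTable(4096), trace))   # 200: the branches keep overwriting each other
print(mispredictions(BranchHistoryTable(16384), trace))  # 1: each branch gets its own counter
```

This is a deliberately worst-case trace. In real code the benefit of a bigger table depends on how many branches are active at once and how often they collide.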

To accommodate the 64-bit instruction set, AMD extended the Opteron's pipeline to 12 stages. The Athlon XP has only 10 stages, while the high-speed architecture of the Intel P4 (and Xeon) uses 20 stages.