Duel of the Titans: Opteron vs. Xeon
Details On The Opteron Core: An Enhanced Athlon, Continued
The really cool stuff is in the details of the chip. At the heart of the CPU is a crossbar switch (XBAR), which routes the data streams between the memory controller, the CPU core and the three HyperTransport ports. Unlike the Athlon 64, which is meant only for single-processor operation, the Opteron includes control logic for multi-processor operation. In a server, up to eight Opterons can therefore work together without a Northbridge.
Furthermore, an SSE2-compatible unit has been added, with twice as many registers (16) as the Intel P4. Fundamental changes are to be found at the instruction processing level: the Translation Look-aside Buffers (TLBs) have been reworked for larger workloads (1000 entries max.). Basically, the more entries in the TLB, the less frequently the translation tables in system memory have to be accessed when a virtual address is translated into a physical one.
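To make the TLB argument concrete, here is a minimal sketch that counts how often a translation has to fall back to the page table for different TLB sizes. Everything in it is an assumption for illustration: the 4 KiB page size, the fully associative LRU organization, the random access pattern and the names `TLB` and `count_walks` are not taken from AMD's design.

```python
import random
from collections import OrderedDict

PAGE_SIZE = 4096  # assumed 4 KiB pages


class TLB:
    """Toy fully associative TLB with LRU replacement (illustrative only)."""

    def __init__(self, entries):
        self.entries = entries
        self.cache = OrderedDict()  # virtual page number -> physical frame number
        self.walks = 0              # number of page-table lookups in memory

    def translate(self, vaddr, page_table):
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        if vpn in self.cache:
            self.cache.move_to_end(vpn)          # hit: mark as most recently used
        else:
            self.walks += 1                      # miss: read the translation table
            self.cache[vpn] = page_table[vpn]
            if len(self.cache) > self.entries:
                self.cache.popitem(last=False)   # evict the least recently used entry
        return self.cache[vpn] * PAGE_SIZE + offset


def count_walks(tlb_entries, working_set_pages=512, accesses=20000, seed=0):
    """Count page-table lookups for random accesses over a fixed working set."""
    rng = random.Random(seed)
    page_table = {vpn: vpn + 1000 for vpn in range(working_set_pages)}
    tlb = TLB(tlb_entries)
    for _ in range(accesses):
        tlb.translate(rng.randrange(working_set_pages * PAGE_SIZE), page_table)
    return tlb.walks


if __name__ == "__main__":
    for size in (64, 256, 1024):
        print(size, "entries:", count_walks(size), "page-table lookups")
```

Once the TLB is large enough to hold the whole working set, each page costs at most one lookup; a smaller TLB keeps evicting entries it will need again.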
The fundamental structure of the Opteron is not much different from the Athlon's: the 3 ALUs and 3 AGUs are now capable of 64-bit arithmetic, and the caches now have ECC circuitry. Many more changes can be found in the details.
CPU core | Hammer | Barton | Thoroughbred "B" |
---|---|---|---|
Wafer surface (200 mm diameter) | 31416 mm² | 31416 mm² | 31416 mm² |
Die surface | 193 mm² | 101 mm² | 84 mm² |
Process technology | 0.13 µm | 0.13 µm | 0.13 µm |
Waste (approx) | 18% | 18% | 18% |
Yield (theoretical) | 148 units/wafer | 255 units/wafer | 306 units/wafer |
Yield (at 60% yield rate) | 89 units/wafer | 153 units/wafer | 183 units/wafer |

CPU core | Thoroughbred "A" | Palomino | Thunderbird |
---|---|---|---|
Wafer surface (200 mm diameter) | 31416 mm² | 31416 mm² | 31416 mm² |
Die surface | 80 mm² | 128 mm² | 128 mm² |
Process technology | 0.13 µm | 0.18 µm | 0.18 µm |
Waste (approx) | 18% | 18% | 18% |
Yield (theoretical) | 322 units/wafer | 201 units/wafer | 201 units/wafer |
Yield (at 60% yield rate) | 193 units/wafer | 120 units/wafer | 120 units/wafer |
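The per-wafer figures in the tables appear to follow a simple area ratio: subtract the waste from the wafer area, divide by the die area, then apply the 60% yield rate. The sketch below is an assumption about how the numbers were derived, not a documented method; the function names are made up for illustration. It reproduces the Barton, Thoroughbred, Palomino and Thunderbird rows, but for the Hammer's 193 mm² it gives 133 dies rather than the 148 listed, so that row may rest on a different die area or calculation.

```python
WAFER_AREA_MM2 = 31416   # 200 mm wafer, from the table
WASTE_PERCENT = 18       # from the table
YIELD_PERCENT = 60       # from the table

DIE_AREA_MM2 = {         # die areas from the table
    "Hammer": 193,
    "Barton": 101,
    'Thoroughbred "B"': 84,
    'Thoroughbred "A"': 80,
    "Palomino": 128,
    "Thunderbird": 128,
}


def dies_per_wafer(die_area_mm2):
    """Theoretical dies per wafer: usable area divided by die area, rounded down."""
    usable_hundredths = WAFER_AREA_MM2 * (100 - WASTE_PERCENT)  # integer math, no float error
    return usable_hundredths // (die_area_mm2 * 100)


def good_dies_per_wafer(die_area_mm2):
    """Dies per wafer after applying the assumed yield rate, rounded down."""
    return dies_per_wafer(die_area_mm2) * YIELD_PERCENT // 100


if __name__ == "__main__":
    for core, area in DIE_AREA_MM2.items():
        print(f"{core}: {dies_per_wafer(area)} theoretical, "
              f"{good_dies_per_wafer(area)} at {YIELD_PERCENT}% yield")
```

The model ignores edge effects and defect density, which is why it should be read as a rough cost comparison between cores rather than a manufacturing forecast.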
Compared to the Thoroughbred and Barton cores, the TLBs work with lower latency, which, in turn, leads to an increase in speed. The branch prediction was also reworked: the History Counter now stores up to 16,000 entries (Athlon XP: 4,000).
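Why more history entries help can be shown with a classic table of 2-bit saturating counters. This is a textbook scheme, not AMD's actual predictor, and the table sizes of 4096 and 16384 (standing in for "4,000" and "16,000"), the modulo indexing and the deliberately contrived branch pattern are all assumptions. When two branches with opposite behavior share one counter, they keep overwriting each other's history.

```python
class CounterTable:
    """Table of 2-bit saturating counters indexed by branch address (textbook scheme)."""

    def __init__(self, entries):
        self.entries = entries
        self.counters = [1] * entries  # 0-1 predict not taken, 2-3 predict taken

    def _index(self, branch_addr):
        return branch_addr % self.entries  # assumed indexing; real hardware differs

    def predict(self, branch_addr):
        return self.counters[self._index(branch_addr)] >= 2

    def update(self, branch_addr, taken):
        i = self._index(branch_addr)
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


def mispredictions(entries, branches=8192, rounds=3):
    """Run a contrived workload: the first 4096 branches are always taken, the rest never."""
    table = CounterTable(entries)
    wrong = 0
    for _ in range(rounds):
        for addr in range(branches):
            taken = addr < 4096
            if table.predict(addr) != taken:
                wrong += 1
            table.update(addr, taken)
    return wrong


if __name__ == "__main__":
    for size in (4096, 16384):
        print(size, "entries:", mispredictions(size), "mispredictions")
```

In this worst case the smaller table mispredicts every branch, because each counter is shared by one always-taken and one never-taken branch, while the larger table only misses each taken branch once while warming up. Real code aliases far less neatly, so the gap in practice is much smaller.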
In order to accommodate the 64-bit instruction set, AMD extended the pipeline of the Opteron to 12 stages. The Athlon XP has only 10 stages, while the high-speed architecture of the Intel P4 (and Xeon) uses 20.