What confuses me about the HT thing is the idea of a multiplier.
(Crap, I just read that link on the page above after writing this)
Read on only if you're interested in the history and a description of computer architecture.
Computer Architecture 101:
In the 'old pentium days' the multiplier was the ratio of the CPU's speed to the system bus' speed.
The RAM ran at the same speed as the system bus and was directly accessible from that bus.
Bus Speed:
Just to get the basics out of the way.
A bus is a set of data lines and a clock line.
(Sometimes the data lines are divided between data and address lines)
Every time the clock line turns from 0 to 1 (the rising edge), a set of bits can be read off the bus.
(usually, the data is 'set' to the bus when the clock goes from 1 to 0, the falling edge)
Pentium II's usually ran between 233 and 300 MHz, while their System Bus ran at 66 or 100 MHz.
The old 386'es ran at 33 MHz.
So did their System Bus
And their RAM
This was all right, for a while...
Bus Multipliers:
This all started with the 486 DX2-66
The system could manage 50 MHz (DX-50), but running the system at 66 MHz was simply not possible at the time.
The solution:
The CPU ran at an internal speed of 66 MHz and communicated with the rest of the system at 33 MHz.
The 33 MHz bus speed was internally multiplied to 66 MHz, keeping the CPU 'in sync' with the rest of the system.
If your chip runs at 1 GHz and you have a 200 MHz bus, the chip will read the bus clock and internally multiply it by 5 (this is done with a PLL, or Phase-Locked Loop).
This means that the chip can only communicate with the outside world once every 5 of its own clock cycles.
If it wants to send 5 'blocks' of data, it will take a total of 25 (5x5) clock cycles.
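The multiplier arithmetic above can be sketched in a few lines of Python (the 1 GHz / 200 MHz numbers are just the illustrative figures from the example, not any specific CPU):

```python
# Sketch of the bus-multiplier arithmetic (illustrative numbers only).
cpu_hz = 1_000_000_000   # 1 GHz core clock
bus_hz = 200_000_000     # 200 MHz system bus

multiplier = cpu_hz // bus_hz   # the PLL multiplies the bus clock by this
print(multiplier)               # 5

# One 'block' arrives per bus cycle, i.e. every `multiplier` CPU cycles,
# so 5 blocks cost the CPU 5 x 5 = 25 of its own clock cycles.
blocks = 5
cpu_cycles_waited = blocks * multiplier
print(cpu_cycles_waited)        # 25
```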
Bus Width
In order to lower the performance impact of waiting 5 cycles for a byte to arrive, designers decided to transmit several bytes at the same time.
This was an expensive (lots of copper traces), but effective trick.
(A 64-bit bus means 8 bytes of data per clock cycle.)
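To see what a wider bus buys you, here's the same kind of back-of-the-envelope arithmetic in Python (a 200 MHz, 64-bit bus is just an example configuration, not a specific product):

```python
# Rough bandwidth of a wider bus (illustrative sketch).
bus_hz = 200_000_000   # 200 MHz bus clock
bus_bits = 64          # 64 data lines

bytes_per_cycle = bus_bits // 8        # 8 bytes move per clock cycle
bandwidth = bus_hz * bytes_per_cycle   # bytes per second

print(bytes_per_cycle)           # 8
print(bandwidth // 1_000_000)    # 1600 (MB/s) -- 8x what a 1-byte bus would do
```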
Latency
In the old world, the memory controller was on the other side of that System Bus.
Getting a block of data took at least 10 cycles (back and forth).
That delay is called latency.
If the CPU runs out of instructions and needs to fetch new ones from RAM, it has to twiddle its thumbs for at least those 10 cycles.
This latency was what prompted AMD to stick the memory controller (the chip that talks to the RAM and handles all the buffering) directly on the CPU, eliminating that trip across the System Bus.
DDR
DDR (Double Data Rate) was achieved by pushing data on both the rising and falling edge of the clock cycle.
This, effectively, doubles the data rate.
How PC3200 DDR RAM gets its name
So, how does 200 MHz memory become 3200?
For starters, the RAM pushes at both the rising and falling edge, producing 400M 'sets of data' per second.
Because the RAM is 64 bits wide, it can transmit 8 Bytes per 'set of data'.
This translates to 400M * 8 = 3200 MB per second, hence the name PC3200.
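The PC3200 naming arithmetic, spelled out as code:

```python
# How 200 MHz DDR RAM becomes "PC3200" (the numbers from the text above).
clock_hz = 200_000_000          # 200 MHz memory clock

transfers_per_s = clock_hz * 2  # DDR: data on rising AND falling edge -> 400M/s
bytes_per_transfer = 64 // 8    # 64-bit bus = 8 bytes per 'set of data'

mb_per_s = transfers_per_s * bytes_per_transfer // 1_000_000
print(mb_per_s)                 # 3200 -> "PC3200"
```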
HyperTransport?
Now, if you ran your 1 GHz HyperTransport bus through a 5x multiplier, you would only have a 200 MHz bus between your chips (because it gets divided by 5 on the way out).
That would defeat the purpose of having a SuperSpeedy bus between your CPU's if it only runs at an effective speed of 200 MHz.
So, that's why I want to know how the 5x200 MHz thing works.
My guess is that the HyperTransport runs at 1 GHz between the CPU's and gets divided by 5 when it reaches the RAM, which runs at 200 MHz (DDR 'makes it' 400 MHz).
Side Note:
AMD Opteron CPU's actually have 3 HyperTransport buses:
One to communicate with the rest of the system (the chipset)
Two to communicate with each other
Opterons in a multi-CPU machine can literally read each other's memory at full speed.
Because RAM runs at a fraction of the HyperTransport bus speed, a CPU can literally read several CPUs' worth of RAM simultaneously.