Brew yourself a big pot of coffee...
"Prefetching" means getting data from main memory before it's actually needed and storing it in cache memory. Every x86 CPU has had some form of this (even the original 8086 had a small instruction prefetch queue): a bit of circuitry called a "prefetch unit" that runs in the background, scanning both the CPU's internal registers and any cached instructions to determine what the execution unit (the part that actually runs instructions) is likely to need from memory next. It examines the CS and IP registers (Code Segment/Instruction Pointer) to figure out what snippets of code the exec. unit might want to execute next and attempts to get those snippets into fast cache memory ahead of time. It also looks at prefetched snippets of code to see which are going to require data from memory and tries to get that data into cache memory as well.
As for branch prediction...
A CPU tends to execute instructions one after another; it goes through a section of memory in sequence, picking out instructions as it goes along. Every once in a while, though, it has to choose between two paths of execution. In order to go down an alternate path of execution, it has to stop picking out instructions in sequence and start picking them from a completely different location in memory. This is "branching."
This causes a problem for a prefetch unit that's trying to get upcoming instructions into cache memory. If a branch occurs right in the middle of the instructions it currently has cached, then a lot of the instruction caching it's done is wasted effort. Not only that, but the exec. unit is now demanding instructions that aren't in the cache, so it can run no faster than main memory can feed it, at a paltry 400MHz (or less).
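To put some rough numbers on that, here's a toy sketch of a sequential prefetcher. It's not how any real chip is wired up: the function name `count_misses`, the "line" addresses, the unbounded cache, and the assumption that a prefetch completes instantly are all made up for illustration.

```python
def count_misses(addresses, prefetch_depth):
    """Count cache misses for a stream of instruction-line addresses.

    Toy model (all assumptions): the cache never evicts anything, and after
    every fetch the prefetcher instantly pulls in the next `prefetch_depth`
    lines in sequence.
    """
    cache = set()
    misses = 0
    for addr in addresses:
        if addr not in cache:
            misses += 1          # exec. unit has to wait on main memory
            cache.add(addr)
        # Sequential prefetch: assume execution just keeps going straight.
        for ahead in range(1, prefetch_depth + 1):
            cache.add(addr + ahead)
    return misses


# Straight-line code: lines 0..15 in order.
straight = list(range(0, 16))
# Same amount of code, but with a jump to line 100 halfway through.
branchy = list(range(0, 8)) + list(range(100, 108))

print("straight, no prefetch:", count_misses(straight, 0))
print("straight, prefetch 2: ", count_misses(straight, 2))
print("branchy,  prefetch 2: ", count_misses(branchy, 2))
```

With no prefetching every line is a miss (16 of them). With a depth of 2, the straight-line run only misses on the very first line, while the branchy run takes a second miss right at the jump target, and the lines it prefetched past the jump (8 and 9) never get used.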
The way a prefetch unit works around this is by taking an educated guess as to whether an instruction sequence will branch. Based on its guess, it will try to cache the instructions it thinks are coming next. It doesn't always guess right, but it guesses correctly often enough to avoid a lot of performance hits. This is "branch prediction."
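One classic way to make that educated guess is a 2-bit saturating counter per branch. Here's a minimal sketch of the idea; I'm not claiming any particular AMD or Intel chip does exactly this, and the function name and starting state are just assumptions for the example.

```python
def predict_and_train(outcomes, state=2):
    """Simulate one 2-bit saturating counter on a single branch.

    outcomes: iterable of booleans (True = branch was taken).
    state:    0 or 1 means "predict not taken", 2 or 3 means "predict taken".
              Starting at 2 is an arbitrary assumption.
    Returns the number of correct predictions.
    """
    correct = 0
    for taken in outcomes:
        prediction = state >= 2
        if prediction == taken:
            correct += 1
        # Nudge the counter toward what actually happened, saturating at 0 and 3.
        if taken:
            state = min(state + 1, 3)
        else:
            state = max(state - 1, 0)
    return correct


# A typical loop branch: taken 9 times, falls through once on exit, run 3 times.
loop_pattern = ([True] * 9 + [False]) * 3
hits = predict_and_train(loop_pattern)
print(f"{hits} of {len(loop_pattern)} predictions correct")
```

On that loop pattern the counter gets 27 of 30 right; it only misses on each loop exit. Because it takes two wrong guesses in a row to flip the prediction, a single loop exit doesn't throw it off for the next pass through the loop.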
Even in the best of times, a prefetch unit can't always keep up with the exec. unit. When the exec. unit asks for instructions or data that haven't made it into the cache yet, that's a "cache miss," and the exec. unit has to sit and wait until they arrive from main memory.
Basically, the faster a prefetch unit is, and the better its branch prediction is, the better it can keep up with the demands of the exec. unit. Having a good prefetch unit means that slow or high-latency memory doesn't hurt overall performance as much; it also means that the CPU itself is better able to saturate its memory bandwidth. This means that clock-for-clock, a CPU with a better prefetch unit will probably get more benefit out of DDR memory.
As for what the PR bunnies are calling a "hardware prefetch"...all I can make of that comment is that the prefetch unit will actually be fully hard-wired.
Something that a lot of people don't know about x86 processors is that they're actually not fully "hard-wired;" x86 instructions are not executed directly in the core but are instead broken down into smaller instructions called <i>micro-ops.</i> The simpler instructions are translated by hardware decoders, but for the more complex ones there's essentially a little bit of software (or rather firmware, the microcode) inside of the CPU itself that handles this translation. This isn't an ideal situation for performance, of course; the translation from instructions to micro-ops incurs some latency before anything reaches the exec. unit. But with an instruction set as complex as the x86 instruction set, this is the only practical way to handle things.
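If it helps to picture it, here's a very loose sketch of that translation step. The decode table and the micro-op names are entirely made up; real micro-op breakdowns are internal to each CPU design and differ from chip to chip.

```python
# Hypothetical decode table: x86 instruction text -> made-up micro-op names.
DECODE_TABLE = {
    # Register-to-register add: one simple operation is enough.
    "add eax, ebx": ["alu_add eax, eax, ebx"],
    # Add to a memory operand: read it, add to it, write it back.
    "add [mem], eax": [
        "load tmp, [mem]",
        "alu_add tmp, tmp, eax",
        "store [mem], tmp",
    ],
}


def decode(program):
    """Expand a list of instructions into one flat list of micro-ops."""
    micro_ops = []
    for instruction in program:
        micro_ops.extend(DECODE_TABLE[instruction])
    return micro_ops


program = ["add eax, ebx", "add [mem], eax"]
for uop in decode(program):
    print(uop)
```

Two x86 instructions go in and four micro-ops come out, which is the general idea: the core only ever has to deal with the small, simple pieces.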
The prefetch unit's job is relatively simple, though, so I suppose it's possible to make it completely hard-wired. Maybe this is what AMD is doing with the Palomino?
As for whether to wait or not...well, really cool things are always just around the corner. The Palomino sounds like it's going to be a lot more than just a MHz increase; I'd say it ranks right up there with the transition from Slot A's to T-birds, it just won't require you to change your mobo.
I got impatient and ordered an AXIA T-bird with some DDR memory...I figure as cheap as good CPUs are right now, I might as well splurge a bit.
Kelledin
<A HREF="http://kelledin.tripod.com/scovsms.jpg" target="_new">http://kelledin.tripod.com/scovsms.jpg</A>