The idea is that once the pipeline is full, you can clock the chip really fast. The problem is that when "branch prediction" (http://en.wikipedia.org/wiki/Branch_predictor)
fails (basically when the CPU guesses wrong about which way the program is going to go), the CPU has to flush the pipeline and refill it from scratch, and the deeper the pipeline, the more cycles that costs.
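To get a rough feel for why a flush hurts more on a deep pipeline, here is a toy model in Python. It is only a sketch: the function name `effective_cpi` and every number fed into it (base cycles per instruction, share of branches, misprediction rate, flush penalty) are made-up assumptions for illustration, not measurements of any real CPU.

```python
def effective_cpi(base_cpi, branch_fraction, mispredict_rate, flush_penalty_cycles):
    """Average cycles per instruction once misprediction flushes are counted.

    Each mispredicted branch is assumed to cost a fixed number of extra
    cycles while the pipeline is flushed and refilled.
    """
    return base_cpi + branch_fraction * mispredict_rate * flush_penalty_cycles


# Illustrative assumptions only: 20% of instructions are branches,
# 5% of those are mispredicted, and the flush penalty is taken to be
# roughly the pipeline depth in stages.
shallow = effective_cpi(1.0, 0.20, 0.05, flush_penalty_cycles=14)
deep = effective_cpi(1.0, 0.20, 0.05, flush_penalty_cycles=31)

print(f"shallow pipeline: {shallow:.2f} cycles per instruction")
print(f"deep pipeline:    {deep:.2f} cycles per instruction")
```

With these assumed inputs the deep pipeline averages about 1.31 cycles per instruction against about 1.14 for the shallow one, so part of what the higher clock buys is handed back on every wrong guess.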
The big improvement came with the Conroe Core 2s. A Conroe E6600 running at 2.4 GHz is about 1.5 times faster per core than a 3.46 GHz P4.
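Taking those figures at face value, you can back out how much more work the Conroe does per clock cycle. The helper name `per_clock_ratio` is just something made up for this sketch, and the 1.5x figure is the rough estimate from the post, not a benchmark result.

```python
def per_clock_ratio(speedup, fast_chip_ghz, slow_chip_ghz):
    """Work done per clock cycle by the faster chip, relative to the slower one.

    speedup is the overall per-core performance ratio (fast / slow).
    Dividing out the clock ratio leaves the per-cycle advantage.
    """
    return speedup * slow_chip_ghz / fast_chip_ghz


# Figures from the post: E6600 at 2.4 GHz, about 1.5x a 3.46 GHz P4.
ratio = per_clock_ratio(1.5, fast_chip_ghz=2.4, slow_chip_ghz=3.46)
print(f"Conroe does about {ratio:.2f}x the work per clock")
```

That works out to a bit over 2x the work per clock, which is how a chip clocked about 1 GHz lower still comes out ahead.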