What Makes Athlon Clock Higher Than P3?

IIB

Distinguished
Dec 2, 2001
417
0
18,780
Both the Athlon and the P3 have 10-stage pipelines and are built on a 0.18 micron process, so how can
the Athlon clock so much higher than the P3?
What else (other than pipeline depth and process size) can make a CPU clock higher?

Thank you.
Oh, and hello everyone (I'm new here).
I'm IIB (just a nickname) from Israel.
 

lhgpoobaa

Illustrious
Dec 31, 2007
14,462
1
40,780
Mostly core design: the efficiency and arrangement of the transistors.

Such redesign/optimisation breathed new life into the 0.18 micron Athlon line with the XP:
20% more efficient at the same clock speed, 20% less heat, and other tweaks and enhancements such as hardware prefetch.





Excuse me for a moment. I need to drive my ergonomic wheely chair over a sheet of bubble wrap!
 

IIB

Distinguished
Dec 2, 2001
417
0
18,780
Oh... so I guess there is no single answer to this one...
What's hardware prefetch? Something to do with predicting the L2 cache hit rate?
Or with branch prediction?

And why did Intel go ahead with a 20-stage pipeline? You don't
really need 20 stages to get clock rates of 1-2 GHz,
and it also makes IPC go down in flames. Can't they just redesign the
P3, or make a new chip with a pipeline somewhere between 10 and 20 stages? 20 just seems
to me way too many for 1-2 GHz clock speeds. You'll need some kind of
super branch prediction to go with 20 stages, in my opinion.
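To put rough numbers on the worry about IPC, here is a minimal back-of-the-envelope sketch. It assumes a mispredicted branch costs roughly one full pipeline refill, so the penalty grows with pipeline depth. The function names and every input (base CPI of 1.0, 20% branches, 5% mispredictions) are hypothetical values picked for illustration, not measurements of any real P3, Athlon, or P4.

```python
def effective_cpi(base_cpi, branch_fraction, mispredict_rate, flush_penalty_cycles):
    """Average cycles per instruction once branch mispredictions are counted."""
    return base_cpi + branch_fraction * mispredict_rate * flush_penalty_cycles


def compare(depths=(10, 20), base_cpi=1.0, branch_fraction=0.2, mispredict_rate=0.05):
    """Return {depth: effective CPI}, assuming a flush costs about `depth` cycles."""
    return {
        depth: effective_cpi(base_cpi, branch_fraction, mispredict_rate, depth)
        for depth in depths
    }


if __name__ == "__main__":
    # All inputs are illustrative assumptions, not data from a real CPU.
    for depth, cpi in compare().items():
        print(f"{depth}-stage pipeline: CPI ~ {cpi:.2f}, IPC ~ {1 / cpi:.2f}")
```

Under these assumptions the deeper pipeline loses more cycles per instruction at the same misprediction rate, which is why a long pipeline leans so heavily on good branch prediction.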
 

lhgpoobaa

Illustrious
Dec 31, 2007
14,462
1
40,780
Ummm, as far as I can figure out, hardware prefetch fetches data before it is actually needed (layman's explanation).

And the P4 can reach much higher MHz, still on the 0.18 micron process, because of its very long pipeline, even though it actually executes fewer instructions per clock.

I think what Intel was looking for when first designing the P4 was amazingly high MHz,
and a long pipeline would achieve this.
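Why a longer pipeline allows higher MHz can be sketched with a simple model: the clock period has to cover the slowest pipeline stage plus some fixed latch overhead. The names `max_clock_ghz` and `split_evenly`, the 10 ns of total logic delay, and the 0.1 ns overhead are all hypothetical, chosen only to show the trend.

```python
def max_clock_ghz(stage_delays_ns, latch_overhead_ns=0.1):
    """Clock limit in GHz: the slowest stage plus latch overhead sets the cycle time."""
    cycle_ns = max(stage_delays_ns) + latch_overhead_ns
    return 1.0 / cycle_ns


def split_evenly(total_logic_ns, stages):
    """Divide a fixed amount of logic delay into equally long stages."""
    return [total_logic_ns / stages] * stages


if __name__ == "__main__":
    total_logic_ns = 10.0  # hypothetical total logic delay, not a real CPU figure
    for stages in (10, 20):
        ghz = max_clock_ghz(split_evenly(total_logic_ns, stages))
        print(f"{stages} stages: ~{ghz:.2f} GHz")
```

In this model, doubling the stage count raises the clock by less than 2x because the latch overhead does not shrink. It also shows why two 10-stage designs on the same process can clock differently: one slow, poorly balanced stage limits the whole chip.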


 

Era

Distinguished
Apr 27, 2001
505
0
18,980
The prefetch logic works hand in hand with branch prediction: it keeps track of the history of already-executed conditional branches and makes a more or less intelligent guess at the outcome of conditional branch instructions entering the pipeline. If the logic decides the branch will be taken, it fetches the upcoming instructions from RAM and puts them in both the L1 and L2 caches.
If it's right (it is correct over 95% of the time), there will be no empty "bubbles" in the instruction stream.
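The idea of guessing a branch from its history can be sketched with the textbook 2-bit saturating counter. This is a generic teaching example, not the actual predictor in the Athlon, P3, or P4; the class name, the starting counter state, and the address `0x400` are all assumptions made for illustration.

```python
class TwoBitPredictor:
    """One 2-bit saturating counter (states 0..3) per branch address."""

    def __init__(self):
        # Branch address -> counter. Unseen branches start at 1 (weakly not taken).
        self.counters = {}

    def predict(self, address):
        """Return True to predict 'taken' (counter state 2 or 3)."""
        return self.counters.get(address, 1) >= 2

    def update(self, address, taken):
        """Move the counter one step toward the real outcome, saturating at 0 and 3."""
        counter = self.counters.get(address, 1)
        self.counters[address] = min(counter + 1, 3) if taken else max(counter - 1, 0)


def accuracy(outcomes, address=0x400):
    """Fraction of correct predictions for one branch with the given outcome history."""
    predictor = TwoBitPredictor()
    correct = 0
    for taken in outcomes:
        if predictor.predict(address) == taken:
            correct += 1
        predictor.update(address, taken)
    return correct / len(outcomes)


if __name__ == "__main__":
    # Hypothetical loop branch: taken 9 times, then falls through, repeated 10 times.
    loop_branch = ([True] * 9 + [False]) * 10
    print(f"accuracy on loop branch: {accuracy(loop_branch):.2f}")
```

Even this simple scheme gets a regular loop branch right most of the time, since it only misses around the loop exit. Real predictors use much more history to reach the kind of hit rates mentioned above.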

There are many guesses as to why the P4 is so slow per clock:
Too small an L1 instruction cache (it's very efficient and clever, but small)?
Too small and inefficient an L1 data cache?
Too few integer units?
An inefficient FPU (when it can't use SSE/SSE2)?
I myself don't know, but there sure are a lot of people guessing why the P4 doesn't perform so well.

Edited by Era on 12/02/01 01:48 PM.