Why is AMD (Athlon) per clock cycle faster than Intel (P4)?

I would really like to understand how it is
possible that my Pentium-4 1.7Ghz is only 1.12 times
faster than my Athlon 900. Both have more or less
the same amount of memory - both run the same OS
(debian testing - both up to date).

I already posted this in an unrelated thread in the
motherboard section but I guess it really belongs here:

--------------------------------------------------------------------------------
Benchmark:
- Compile time ('time make') of libcwd-0.99.45 after a './configure --enable-maintainer-mode --disable-pch'
compiler: gcc version 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)
(OS: debian 'testing' (Lenny) at Apr 26, 2007).

System 1:
- model name : AMD Athlon(tm) Processor
cpu MHz : 908.119
cache size : 256 KB
bogomips : 1818.08
MemTotal: 906592 kB
Diskspeed:
Timing cached reads: 272 MB in 2.00 seconds = 135.85 MB/sec
Timing buffered disk reads: 148 MB in 3.02 seconds = 49.04 MB/sec

System 2:
- model name : Intel(R) Pentium(R) 4 CPU 1.70GHz
cpu MHz : 1708.705
cache size : 256 KB
bogomips : 3420.70
MemTotal: 1036664 kB
Diskspeed:
Timing cached reads: 628 MB in 2.00 seconds = 313.39 MB/sec
Timing buffered disk reads: 182 MB in 3.00 seconds = 60.61 MB/sec

'vmstat 1' shows that during compilation 100% cpu is being used,
and both, id(le) and wa(it for IO), are constantly 0. Hence, we are
not measuring diskspeed here - but cpu speed.

Results:

The Athlon 900 compiles libcwd in 2 minutes and 5 seconds.
The Pentium-4 1.7 GHz does the same job in 1 minute 53 seconds.

Conclusion: the Pentium is only 1.12 times faster, despite its nearly
doubled clock frequency.
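
For reference, the arithmetic behind that conclusion can be sketched roughly as follows. The times and clock figures are the ones listed above; the variable names are only illustrative, and the "work per clock" ratio assumes both machines do the same amount of work for the build.

```python
# Figures taken from the benchmark above; names are illustrative only.
athlon_mhz, athlon_secs = 908.119, 2 * 60 + 5   # 125 s
p4_mhz, p4_secs = 1708.705, 1 * 60 + 53         # 113 s

# Wall-clock speedup of the P4 over the Athlon.
speedup = athlon_secs / p4_secs

# Ratio of the two clock frequencies.
clock_ratio = p4_mhz / athlon_mhz

# Clock cycles each CPU spent on the same build (frequency x time).
athlon_cycles = athlon_mhz * 1e6 * athlon_secs
p4_cycles = p4_mhz * 1e6 * p4_secs

# How many more cycles the P4 needed for the same job, i.e. how much
# more work the Athlon gets done per clock on this workload.
per_clock_advantage = p4_cycles / athlon_cycles

print(f"speedup {speedup:.2f}x, clock ratio {clock_ratio:.2f}x, "
      f"Athlon per-clock advantage {per_clock_advantage:.2f}x")
```

With whole-second times the speedup comes out at roughly 1.1x against a clock ratio of nearly 1.9x, so on this build the Athlon appears to do about 1.7 times as much work per clock cycle.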

What is causing this?

Edit: tried to change the topic (was: Why is AMD faster than Intel?)
  1. dude who cares?

    why is a pinto faster than a chevette? who cares!

    u need to upgrade!


    the p4 takes 33 steps to make 2 calculations per cycle while the amd takes 24 steps to make 3! ok!

    now upgrade!
  2. I bet for $100 you could buy something that would eat both those times alive ;)
  3. Because clockspeed isn't an indicator of relative performance between different architectures. When I leave work today I'd rather be driving 40 mph instead of 50 kph. Note that 50 is a bigger number than 40.
  4. Quote:
    Catchy topic no? ;)

    Seriously - I would really like to understand how it is
    possible that my Pentium-4 1.7Ghz is only 1.12 times
    faster than my Athlon 900. Both have more or less
    the same amount of memory - both run the same OS
    (debian testing - both up to date).


    First things first... the title of your thread is misleading.

    AMD is not faster than Intel. And Intel is not faster than AMD.

    But to answer your question, K7 is faster per clock (higher IPC) than Netburst. K7 = AMD Athlon, Netburst = Intel P4.

    The main difference between these two architectures is that the Netburst architecture (P4) uses a very long pipeline. The effect of such a long pipeline is that the processor can attain a much higher working frequency; the negative side effect is that it cannot do as much work per clock cycle as a competing processor with a shorter pipeline.
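
    One way to picture the trade-off is a first-order textbook model in which every mispredicted branch costs roughly one pipeline's worth of cycles. This is only a sketch: the function name and all the numbers below are made up for illustration, and it ignores caches, decode limits and everything else that matters on real K7 or Netburst chips.

```python
def effective_ipc(base_ipc, pipeline_stages, branch_fraction, mispredict_rate):
    """First-order estimate of IPC once branch mispredictions are counted.

    Assumes (hypothetically) that each mispredicted branch flushes the
    pipeline and wastes about `pipeline_stages` cycles.
    """
    base_cpi = 1.0 / base_ipc
    # Average extra cycles per instruction lost to pipeline flushes.
    flush_cpi = branch_fraction * mispredict_rate * pipeline_stages
    return 1.0 / (base_cpi + flush_cpi)

# Illustrative inputs only: same ideal IPC, same branch behaviour,
# only the pipeline depth differs.
shallow = effective_ipc(1.0, 10, branch_fraction=0.2, mispredict_rate=0.05)
deep = effective_ipc(1.0, 20, branch_fraction=0.2, mispredict_rate=0.05)

print(f"shallow pipeline: {shallow:.3f} IPC, deep pipeline: {deep:.3f} IPC")
```

    Under these assumed numbers the deeper pipeline ends up with the lower IPC, which is the direction of the effect described above. The deeper design only wins if its clock is high enough to make up the difference.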

    In the end, Netburst defeated K7, albeit doing so with a much higher working frequency. Their last battle was the Athlon XP 3200+ vs. the Pentium 4C 3.2GHz HT. The latter won.
  5. Quote:


    http://www.cpu-world.com/CPUs/Pentium_4/index.html

    How huge is that die lol!
    Get 4 cores in that space these days ;)
  6. Quote:
    you hadnt posted that when i was replying yet 8O damn ,beat to the draw :wink: your explanation is clearer.


    LMAO, had Jack explained it, it would have been even clearer yet.. :P
  7. Quote:
    you hadnt posted that when i was replying yet 8O damn ,beat to the draw :wink: your explanation is clearer.
    Better dig out that old copy of "Mavis Beacon...." Vern...slow typing will bury you in these bloodthirsty Forumz. :tongue: :D
  8. Quote:
    you hadnt posted that when i was replying yet 8O damn ,beat to the draw :wink: your explanation is clearer.
    Better dig out that old copy of "Mavis Beacon...." Vern...slow typing will bury you in these bloodthirsty Forumz. :tongue: :D

    I thought the Hundt and pehck method was king here. :Pworks for me. :oops: , but sometimes it's gets my head spinning. :D
  9. Quote:
    AMD is not faster than Intel. And Intel is not faster than AMD.

    But to answer your question, K7 is faster per clock (higher IPC) than Netburst. K7 = AMD Athlon, Netburst = Intel P4.

    The main difference between these two architectures is that the Netburst architecture (P4) uses a very long pipeline. The effect of such a long pipeline is that the processor can attain a much higher working frequency; the negative side effect is that it cannot do as much work per clock cycle as a competing processor with a shorter pipeline.

    In the end, Netburst defeated K7, albeit doing so with a much higher working frequency. Their last battle was the Athlon XP 3200+ vs. the Pentium 4C 3.2GHz HT. The latter won.


    Thank you for the excellent answer :D

    Of course I agree with most of the others that these cpu's aren't
    very interesting anymore-- but the reason I asked IS because I
    want to buy a new PC and have to decide between AMD and intel.

    It's a fact that both give cpu frequencies (apart from the number of
    cores) somewhere between 2 and 3 GHz. If Intel still has these long
    pipelines, then should I conclude that a single core at 2.8GHz from Intel
    is a lot slower (for compilation of C++ programs thus) than a
    single core at 2.8GHz from AMD? Probably not or Intel would be out
    of business ;), but then I'd really like to know who changed their
    strategy: If both are now equally fast with the same clock frequency,
    then is that because Intel shortened their pipelines? Or has AMD enlarged
    them?
  10. Quote:
    the p4 takes 33 steps to make 2 calculations per cycle while the amd takes 24 steps to make 3! ok!

    now upgrade!

    8O I can forgive you for the 12-step pipeline of a K7 Athlon, but an Intel fan like you that does not know the famous Prescott/Cedar Mill pipeline has 31 steps?!
  11. Each processor family is different.
    For example:
    Athlon / Athlon XP / Athlon 64 / AM2
    Each is progressively faster and, I imagine, more efficient, although originally speed/performance was mostly expressed in megahertz/gigahertz (MHz/GHz).

    Now look at Intel's Pentiums:
    Pentium 1, 2, 3, 4, then a lot of slightly different 4s.
    Then Pentium D and now Core 2, with Penryn next. On the AMD side, next will be Barcelona/Agena/Phenom (name TBD).
    With the Pentium 4s, they would re-release them under different chipsets and the clock speed would fall. Say their max was at 3.2 GHz:
    they would release a new model at 2.6 GHz which would beat the 3.2.
    How did this work? Efficiency and, as mentioned earlier, instructions per cycle (IPC). Imagine wheels and gears. You turn a wheel 500 times: if you turn a really small gear you get less accomplished, but if you turn a large gear you get more done per turn. The same applies to processors and Hz (MHz/GHz).

    Say the Pentium 4 did 10 instructions per clock and the Core 2 does 20, for example (not actual numbers).

    Pentium 4: 10 x 3200 = 32,000 instructions

    Core 2: 20 x 2000 = 40,000 instructions

    See, the Core 2 is faster with a lower clock.
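
    That arithmetic can be written out as a tiny sketch. The IPC values are the made-up ones from the example above, not real figures for either chip, and the function name is just for illustration.

```python
def throughput(ipc, mhz):
    """Instructions per microsecond: instructions per clock x clocks per microsecond."""
    return ipc * mhz

# Made-up IPC values from the example above (not measured numbers).
pentium4 = throughput(ipc=10, mhz=3200)
core2 = throughput(ipc=20, mhz=2000)

print(pentium4, core2, core2 > pentium4)
```

    The point is only that the product of IPC and clock speed decides the outcome, so a lower-clocked chip can come out ahead.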

    Now, for IPC, the current ranking looks something like

    Pentium 4 < AM2 < Core 2 < Barcelona (K10) ??? Penryn

    We don't know about K10 or Penryn yet though.
    It seems safe to assume Barcelona will beat Core 2, but we haven't a clue how it will do versus Penryn.

    Things such as the path/steps can determine IPC, but I wouldn't worry about them now. In the end, benchmarks, price and personal preference should be your guide.

    Hope this helps.

    ~Will
  12. AMD/ati systems cost less than intel based especially if you buy the components and assemble them your self
    the performance issues between processors could be made up by your graphics choice, memory, and hard-drive speed.

    Didn't amd help push intel to build a better product?
  13. Quote:

    8O I can forgive you for the 12-step pipeline of a K7 Athlon, but an Intel fan like you that does not know the famous Prescott/Cedar Mill pipeline has 31 steps?!


    The P4 1.7 that the OP has is a Willamette, as there were no 1.7 GHz P4s that weren't Willies. There were 1.6A and 1.8A Northwoods, but no 1.7s. The Willy has 20 pipeline stages :D
  14. Yes, and the 20-stage-pipeline P4s have a better IPC than the 31-stage ones. Altogether, the P4 wouldn't have been that much of a fiasco had they stuck to the 20-stage pipeline; they wanted hyperpipelining and it went right into their a** :D
  15. Quote:
    It's a fact that both give cpu frequencies (apart from the number of
    cores) somewhere between 2 and 3 GHz. If Intel still has these long
    pipelines, then should I conclude that a single core at 2.8GHz from Intel
    is a lot slower (for compilation of C++ programs thus) than a
    single core at 2.8GHz from AMD? Probably not or Intel would be out
    of business ;), but then I'd really like to know who changed their
    strategy: If both are now equally fast with the same clock frequency,
    then is that because Intel shortened their pipelines? Or has AMD enlarged
    them?

    Comparing P4 versus K7 is not relevant to modern processors. Go look at THG benchmarks of the processors currently out there to get an idea of how AMD compares to Intel.
  16. Quote:
    This is the reason the Athlon performs better than expected relative to the Pentium (P4) in your benchmarks. The Athlon simply completes or executes more instructions in each tick of the clock compared to the P4.


    So, you are sure it is because the Athlon does more instructions
    in parallel (higher IPC)? Before, I thought it was caused by cache
    misses: because the pipeline is longer, a cache miss is more relevant
    because more 'work' has to be thrown away. So, I thought, the pentium4
    would be better at applications with a "burst" like nature, like video
    processing, and worse when it has to make a lot of fast decisions
    that can't be known before hand (lots of branches).

    If that is the case, that the performance is application dependent,
    then my main problem is (not was) that none of the used benchmarks
    on THG are about compilation. It's about 3D (graphics), which is pure
    bursting data to the video card imho (and a lot of floating point
    calculations of course). The same for mp3 encoding: a lot of calculations,
    but not about (integer type) branches. Then there are several windows
    applications that I have no clue of what they do (I never used windows),
    let alone that I can guess what type of load they are for the processor.

    If THG had benchmarks "compiling blahblah on linux (64 bit); faster is
    better", then I could actually use the benchmarks to make my decision.

    Now, I think: well - those numbers are great. But if the performance
    of the Intel chips fall back a factor of two when I use them to compile
    something, then I don't want to use these results!

    This is the main reason that I asked my question: I hope(d) to understand,
    on the most detailed, technical level what is causing the difference in
    IPC while compiling -- and then I hope to be able to use that knowledge
    to understand how to interpret the current benchmarks.

    Alternatives are:
    1) Someone tells me that 'this or that' benchmark gives
    the same ratios (between each cpu) as compilation does (ie, it is the
    same type of application, and leads to the same IPC).
    2) Someone tells me that the IPC is not (or HARDLY) a function of
    the benchmark/application (this can't be true however, because then
    each benchmark should show the same winner(?)) and the IPC is
    constant.
    3) Someone tells me that whatever the difference was that caused
    the difference that I am observing between the Athlon 900 and the
    P4 does no longer exist, because both architectures now use the same
    approach.

    Finally, contrary to what some people tell me in this thread, it *is*
    important to me to compare the Athlon 900 with the Intel core 2 Extreme
    QX6700 (for example), because I currently own the Athlon 900 and I
    am thinking about buying a new PC: I want to know how much faster
    my new PC will finish the programs that I run. Now I know that it takes
    2 minutes and 5 seconds to run a full 'make' on libcwd-0.99.45 after
    configuration with --enable-maintainer-mode --disable-pch. I won't
    buy a new PC (not worth the money) if the new PC won't do exactly
    that in under 25 seconds. I will definitely not wait longer,
    and will order it this week, if I know it will compile it in 12.5 seconds or less.
    At the moment I have NO clue how fast it will be :(
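
    Put as numbers, those two thresholds translate into the following required speedups over the Athlon 900. This is just the arithmetic from the times above; the helper name is hypothetical.

```python
# Baseline from the post: the Athlon 900 needs 2 min 5 s for the build.
baseline_secs = 2 * 60 + 5  # 125 s

def required_speedup(target_secs, baseline=baseline_secs):
    """Speedup over the baseline needed to finish in target_secs."""
    return baseline / target_secs

worth_considering = required_speedup(25)    # threshold for buying at all
buy_this_week = required_speedup(12.5)      # threshold for buying right away

print(worth_considering, buy_this_week)
```

    So the new machine has to be at least 5 times faster on this build to be worth buying, and 10 times faster to be an immediate purchase.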
  17. Quote:
    This is the reason the Athlon performs better than expected relative to the Pentium (P4) in your benchmarks. The Athlon simply completes or executes more instructions in each tick of the clock compared to the P4.


    So, you are sure it is because the Athlon does more instructions
    in parallel (higher IPC)? Before, I thought it was caused by cache
    misses: because the pipeline is longer, a cache miss is more relevant
    because more 'work' has to be thrown away. So, I thought, the pentium4
    would be better at applications with a "burst" like nature, like video
    processing, and worse when it has to make a lot of fast decisions
    that can't be known before hand (lots of branches).


    For simple integer instructions, Pentium 4 can do 2 per cycle while Athlon can do 3 per cycle. For complex integer / floating point x87 instructions, Pentium 4 can do 1 per cycle while Athlon can still do 3 per cycle.
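
    Taking those per-cycle figures at face value, a rough sketch of the peak simple-integer rate of the two machines in this thread looks like this. The clock speeds are the ones from the first post; the peak numbers are upper bounds that real code will not reach, and the variable names are only illustrative.

```python
# Clock speeds from the first post; per-cycle figures as claimed above.
athlon_mhz, athlon_per_cycle = 908.119, 3
p4_mhz, p4_per_cycle = 1708.705, 2

# Peak simple-integer instructions per microsecond (an upper bound only).
athlon_peak = athlon_mhz * athlon_per_cycle
p4_peak = p4_mhz * p4_per_cycle

peak_ratio = p4_peak / athlon_peak
print(f"P4 peak is {peak_ratio:.2f}x the Athlon peak")
```

    On these assumptions the P4's peak advantage shrinks to about 1.25x despite the nearly doubled clock, which is at least in the same ballpark as the roughly 1.1x measured for the compile.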
  18. Don't yell at me if I'm wrong but if your Athlon 900 does it in 2 mins 5 seconds.
    The Q6700 will probably do it in well under 25 seconds.
  19. Quote:
    If THG had benchmarks "compiling blahblah on linux (64 bit); faster is
    better", then I could actually use the benchmarks to make my decision.



    that's a good point, aleric. i have wondered about compiling benchmarks, too, although i don't have much need for them as i don't program much. so now you know the diff between your Athlon and your P4, and that is interesting knowledge to possess, but the thing is, comparing those processors to current processors is kind of apples-to-oranges.

    there have been quite a few changes in processor tech since the days of those processors, including multiple cores and beefed-up FSBs. due to this fact, you can't really assume that a current AMD equivalent to the Athlon will beat a current intel equivalent to the P4. although the P4 may do 1 less IPC, that fact isn't very helpful when looking to buy a new computer.

    if you are looking to buy right now, i would recommend (without knowing your budget) a dual-core proc from either AMD or intel. your pricerange will depend upon your cash at hand; AMD owns the low end and intel offers some really tempting mids- and- highs.

    i don't have much basis for this statement, but i'm positive that a current proc will own those comp times.
  20. Quote:
    This is a funny comparison.... as of now, the C2D simply outclasses the K8 ...


    I mean that if C2D would take more than 25 seconds, then I'll wait
    longer before buying anything. If it will do it in 12.5 seconds then
    I don't have to think longer. If it will do it in 18 seconds, then I'll
    still have to think about spending the $3000 that is my budget for
    (every) new PC, or wait another year to get more for the same money.

    The battle for K8 was already lost before I posted this (as I said in my
    first post) because it is still 90 nm and therefore uses too much power.
    Hmm, or maybe I had said that in another thread... this thread was
    a 'spin off' of that thread :oops:

    I am very happy to know now, thanks to you, that the 65nm Intel chips
    are indeed the fastest chips-- then my decision to buy an intel cpu this
    time is fully justified. It still remains to be seen however if the new
    machine will be 10 times as fast... but I think I'll go for the QX6700
    and just see.

    As an (open source) developer, I should have had a 64 bit OS years
    ago already :/ It's really time to finally get one. Having four cores will
    be fun to play with (developing multi-threaded applications).

    Thank you for all the time you took to answer me in detail!
    Aleric
  21. Does the AMD Athlon CPU have difficulty with high resolution graphics for photoshop or games?

    Can the AMD chip do calculation for excel spreadsheet or Oracle computation?

    How well does AMD handle Office 2003/2007 tasks?
  22. Quote:
    you hadnt posted that when i was replying yet 8O damn ,beat to the draw :wink: your explanation is clearer.
    Better dig out that old copy of "Mavis Beacon...." Vern...slow typing will bury you in these bloodthirsty Forumz. :tongue: :D

    w00t! Mavis Beacon! They forced that crap upon us in like 5th grade. I thought it was crap at the time, and continued pounding away with two fingers. Then, about two years later, I decided to give two-handed typing another shot. The rest is history.