HAMMER scaling ability is scary

http://www.3dcenter.org/artikel/2002/10-18.php
*corrected link*

As seen here, Hammer murders all the other processors in scaling: add more MHz and the scores rise much faster.

Opteron gains about 17% in SPECfp per 200 MHz boost
P4/Xeon gains about 5% in SPECfp per 200 MHz boost

This means that if AMD can ramp the clock speeds, they will be very potent against the next level of P4s.

Edited by popegoldx on 10/19/02 05:54 PM.
  1. That's not a correct link.

    "We are Microsoft, resistance is futile." - Bill Gates, 2015.
  2. Well, since I can't see anything other than a case in the right corner, I'll just assume you're crazy and go about my day.

    -Jeremy

    Just some advice from your friendly neighborhood blue man :smile:
  3. I think you mean left corner, Spud =)

    Yeah, what's with this link? At least make it <A HREF="http://www.warp2search.net/adserver/adframe.php?n=ad8d2ec1&clientid=70" target="_new">CLICKABLE</A> for everyone to see either the case or the pretty colours!

    <A HREF="http://www.anandtech.com/mysystemrig.html?id=13597" target="_new">-MeTaL RoCkEr</A>
  4. :eek: My bad, sorry. I'm sleepy today and the store is so very cold.

    -Jeremy

    Just some advice from your friendly neighborhood blue man :smile:
  5. Is that you, poopy? Maybe you should post a link that makes sense instead of spreading AMD FUD.
  6. http://www.3dcenter.org/artikel/2002/10-18.php

    There you go... I posted the wrong link. That's the correct one. No AMD FUD,

    just NUMBERS
  7. Oh Oh!!! <A HREF="http://www.3dcenter.org/artikel/2002/10-18.php" target="_new">CLICKY CLICKY ONE DOLLA!!</A> =)

    EDIT: I don't speak the language displayed on that page, nor do I know what it is. Could someone please translate it for us?

    <A HREF="http://www.anandtech.com/mysystemrig.html?id=13597" target="_new">-MeTaL RoCkEr</A>
    Edited by MeTaLrOcKeR on 10/19/02 05:33 PM.
  8. <A HREF="http://translate.google.com/translate?u=http://www.3dcenter.org/artikel/2002/10-18.php&langpair=de|en&hl=en&ie=UTF-8&oe=UTF-8&prev=/language_tools" target="_new">Oh Oh Oh Clickie Translate $1.50</A>

    Complicated proofs are proofs of confusion.
  9. Didn't know graphs and numbers needed translating.
  10. Looking at the SPEC databases they linked to, I don't see Opteron scores. I wonder where they got them from.

    "We are Microsoft, resistance is futile." - Bill Gates, 2015.
  11. You alright, dude?

    The scores are right on the main page =)

    <A HREF="http://www.anandtech.com/mysystemrig.html?id=13597" target="_new">-MeTaL RoCkEr</A>
  12. Opteron gets beaten in SPECfp 2000? Is that significant? I have no clue what these tests are.

    ...And all the King's horses and all the King's men couldn't put my computer back together again...
  13. I dunno what the tests are, but the Opteron gets beaten in FP. It was designed for ALU work, though, where it more than out-muscles everything else =)

    <A HREF="http://www.anandtech.com/mysystemrig.html?id=13597" target="_new">-MeTaL RoCkEr</A>
  14. Hmm, the performance is definitely not scaling linearly. Going from 1.6 GHz to 1.8 GHz is powerful, more than 200 points in Int, but going from 1.8 GHz to 2 GHz only gives ~117. Once again the scaling has problems; I am surprised AMD didn't even look at that!
    Had it been consistent with the 1.6 to 1.8 step, it would indeed be scary, as each extra 1 GHz would bring a direct 1000 points, completely busting any P4 of any type, including Prescott!
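
The "1000 points per GHz" figure is a straight linear extrapolation of one 200 MHz step. A minimal sketch of that arithmetic, assuming the rounded gains quoted in the post ("more than 200" and "~117") as inputs; `points_per_ghz` is a hypothetical helper name, not something from the article:

```python
# Int-score gains per 200 MHz step, as quoted in the post above.
# Both figures are rounded, so treat them as rough inputs.
STEP_MHZ = 200
GAIN_16_TO_18 = 200   # points gained going 1.6 -> 1.8 GHz
GAIN_18_TO_20 = 117   # points gained going 1.8 -> 2.0 GHz


def points_per_ghz(gain_points, step_mhz=STEP_MHZ):
    """Linear extrapolation: scale a per-step gain up to a full 1000 MHz."""
    return gain_points * 1000 / step_mhz


print(points_per_ghz(GAIN_16_TO_18))  # 1000.0
print(points_per_ghz(GAIN_18_TO_20))  # 585.0
```

The second step extrapolates to a bit over half of the first, which is the non-linearity the post is pointing at.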

    --
    "Let Go." -Avril Lavigne
  15. Strange, since the K8 core is based on the K7, which was the king of x86 FP. It wasn't beaten by the 2.8 GHz though, so it still maintains the FPU lead in the x86 world. What I'd like is an SSE2 fight.

    --
    "Let Go." -Avril Lavigne
  16. Eden, you've gotta remember this is Opteron, not ClawHammer. If AMD can get Opteron chips down to $500USD, then it'll compete with the Prescott, otherwise, that's just a moot point.

    ...And all the King's horses and all the King's men couldn't put my computer back together again...
  17. Opteron is just the MP name; ClawHammer and SledgeHammer can both be Opterons!
    Athlon DT can also be Sledge, but it'll be for small 1-2 way servers.

    And yes, it shouldn't go against Prescott, but I said that for the sake of the CPUs in the graphs' comparison chart. What I find odd is why the 2.8 GHz Xeon, using a Northwood core, is weaker than the P4?! I mean, I thought it had a smarter cache design, SMP capabilities (which the P4 doesn't have anyway), Hyper-Threading enab... Ooohhh, maybe that's why...
    --
    "Let Go." -Avril Lavigne
    Edited by Eden on 10/19/02 08:09 PM.
  18. lol... seems like HT isn't all that great for the XEONs huh?

    ...And all the King's horses and all the King's men couldn't put my computer back together again...
  19. Quote:
    Strange, since the K8 core is based on the K7, which was the king of x86 FP. It wasn't beaten by the 2.8 GHz though, so it still maintains the FPU lead in the x86 world.

    Looks like you answered your own musings there. :wink: AMD is the king of <i>x87</i> FP, and x87 FP just isn't that great.

    <i>I can love my fellow man...but I'm damned if I'll love yours.</i>
  20. Heheh, I guess SPECfp must use some FP ops that aren't x86-limited then? (Considering IA64 FPUs seem to dominate, and only at 1 GHz, so you can imagine at 2 GHz!)

    BTW, why does the FPU use an x86+1 number, x87?

    --
    "Let Go." -Avril Lavigne
    Edited by Eden on 10/19/02 08:58 PM.
  21. Quote:
    BTW, why does the FPU use an x86+1 number, x87?

    Back in the old days of the 386 and before, Intel CPUs didn't have FPUs built-in, they were sold separately. Motherboards had a second socket for the FPU chip, which was called an x87 (387 for the 386, 287 for the 286, etc). Both chips would then run together simultaneously.

    The 486DX was the first Intel CPU to integrate the FPU into the core. Intel also sold a budget 486SX, which was the 486 core with the FPU removed. You could then buy the 487 unit separately if you decided later that you wanted the FPU. Curiously, the 487 was actually a full-fledged 486 and when you put it on the motherboard it would simply de-activate the 486SX completely. So in this case the two chips would not execute together, the 486SX was actually doing nothing!
  22. But it doesn't explain why they name it with a 7!

    Also, back then, with no FPU, how the heck did the CPU possibly live through this? How did it calculate anything decimal?
    It scares me to think of an FPU-less workaround!

    --
    "Let Go." -Avril Lavigne
  23. Quote:
    Also, back then, with no FPU, how the heck did the CPU possibly live through this? How did it calculate anything decimal?

    You are young yet, Jedi apprentice.

    Quote:
    It scares me to think of an FPU-less workaround!

    Fear leads to anger. Anger leads to stress. Stress leads to doobies... :wink:

    1) Apps could do IEEE-compliant floating-point operations using generic bit-manipulation techniques and produce the same end results as a numeric coprocessor. Very, very slow.

    2) The O/S could trap the "numeric coprocessor not present" exception and do the work of (1). Again, very, very slow. Advantageous in that apps usually didn't have to account for the possibility of a missing FPU.

    3) Apps could store and manipulate numbers in BCD (Binary Coded Decimal) format. Many developers were doing this anyways, simply because it allowed greater range/precision than most FPUs. Still rather slow.
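
Option (3) can be sketched as follows. This is only an illustration of packed BCD (one decimal digit per 4 bits) with digit-by-digit addition and decimal carry, not code from any particular application of the era; the function names are made up for the example, and it handles non-negative whole numbers only (a fractional amount would be stored scaled, e.g. as cents).

```python
def to_packed_bcd(n):
    """Encode a non-negative integer as packed BCD: one decimal digit per nibble."""
    if n < 0:
        raise ValueError("this sketch handles non-negative values only")
    bcd, shift = 0, 0
    while True:
        n, digit = divmod(n, 10)
        bcd |= digit << shift
        shift += 4
        if n == 0:
            return bcd


def from_packed_bcd(bcd):
    """Decode packed BCD back into an ordinary integer."""
    value, scale = 0, 1
    while bcd:
        value += (bcd & 0xF) * scale
        bcd >>= 4
        scale *= 10
    return value


def bcd_add(a, b):
    """Add two packed-BCD numbers one decimal digit at a time, carrying in base 10."""
    result, shift, carry = 0, 0, 0
    while a or b or carry:
        carry, digit = divmod((a & 0xF) + (b & 0xF) + carry, 10)
        result |= digit << shift
        a >>= 4
        b >>= 4
        shift += 4
    return result


total = bcd_add(to_packed_bcd(1995), to_packed_bcd(7))
print(hex(total))              # 0x2002
print(from_packed_bcd(total))  # 2002
```

Because every digit is handled separately, the precision is limited only by how many digits you are willing to store, which is also why it is slow.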

    <i>I can love my fellow man...but I'm damned if I'll love yours.</i>
  24. Oh great Jedi Master, it pains me to mention that you forgot one of the great tools of the early 90s: the fixed-point technique. It certainly had its limitations, but at one point I had a whole 3D transformation pipeline coded in fixed point.

    Complicated proofs are proofs of confusion.
  25. As others have said, calculations were made using alternative techniques. But if you are interested in the "power" of old CPUs, just select one in SiSoft Sandra and see the numbers. The progress made is just astonishing!


    DIY: read, buy, test, learn, reward yourself!
  26. Just some trivia.
    The 387 FPU was much more expensive than the 386 CPU, IIRC.
  27. Quote:
    Also, if at their time, with no FPU, how the heck does the CPU possibly live through this? How does it calculate anything decimal?

    Well practically all games until Quake didn't use the FPU at all and they ran fine! Not sure whether System Shock or Ultima Underworld used the FPU, but they were both fully 3D games that came out before Quake. Clearly, games with full 3D graphics can be coded to be tremendously faster if they use the FPU, so they all do today.

    Not sure about modern 2D games though (which today are becoming fewer and farther between). Anyone know if games like Age of Wonders 2, Red Alert 2, or HOMM4 use the FPU? I know they often use MMX or SSE.

    Ritesh
  28. Yes, but they're incredibly inaccurate. That is, pictures don't look as smooth, shapes don't look as round, and models aren't positioned as accurately. That's really the point of FP in games. If all you wanted was speed, integer would be much better.

    "We are Microsoft, resistance is futile." - Bill Gates, 2015.
  29. Ironically, integer arithmetic is still slower on modern processors. If you were to do a 3D pipeline in integer, you would have to shift-adjust each calculation, which creates a serial dependency. Thus, each calculation would cost you at least 3 ticks, whereas floating-point calculations can enter the sub-tick range.

    Complicated proofs are proofs of confusion.
  30. That would be true, but in modern MPUs, integer calculations can take as little as 1 clock, while almost all FP operations are pipelined and take up to 45 clocks to complete. Plus, modern MPUs usually have greater integer resources (by which I refer to the P4).

    "We are Microsoft, resistance is futile." - Bill Gates, 2015.
  31. Oh the argument with god. Please believe me when I say floating-point beats integer in every case.

    The 20+ tick floating-point operations you speak of are the <A HREF="http://www.dictionary.com/search?q=transcendental" target="_new">transcendental</A> functions and square root (e.g. sin (96-192), cos (97-196), sqrt (19-35), etc.), which can't be done with integer anyways. Division is also costly but can be avoided with reciprocal multiplication. Even a Pentium-class processor can achieve floating-point multiply and addition operations at 1 per tick when proper pipelining is used. Pentium-class processors cannot do this with integer operations. With modern Athlon-class processors, you can get close to 1 integer multiply per tick. However, the latency for an integer operation to complete is greater than for a floating-point operation (integer 4-9 vs. floating-point 4). Athlon-class processors can achieve close to 2 FP mults/adds per tick due to their 3 floating-point units, and the Pentium 4 can do even better with SSE2.
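
The "reciprocal multiplication" trick mentioned above is just a rewrite of `v / w` as `v * (1 / w)`, so that many divisions by the same value cost one divide plus cheap multiplies. A minimal sketch, with hypothetical function names; Python will not show the speed difference, so this only illustrates the transformation, and in general the two forms can differ in the last bit of the result:

```python
def scale_by_division(values, w):
    """Straightforward version: one (slow) divide per element."""
    return [v / w for v in values]


def scale_by_reciprocal(values, w):
    """Same result up to rounding: one divide in total, then one multiply per element."""
    inv_w = 1.0 / w          # the only division
    return [v * inv_w for v in values]


coords = [2.0, 4.0, 6.0]
print(scale_by_division(coords, 4.0))    # [0.5, 1.0, 1.5]
print(scale_by_reciprocal(coords, 4.0))  # [0.5, 1.0, 1.5]
```

A typical place for this in a 3D pipeline is the perspective divide, where x, y and z are all divided by the same w.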

    Some comparisons of Athlon integer vs floating point latencies.

    imul: signed 5-9, unsigned 4-8 (the lower numbers are reg to reg, the higher numbers are mem to mem)
    fmul: 4, single/double precision

    idiv: word 26-27, dword 43-44, depending on addressing mode
    fdiv: 16 single, 20 double precision

    add (register to register): 1
    add (mem to mem): 4
    fadd: 4, single/double precision

    So on an Athlon processor (and any other superscalar processor) floating point beats integer in every case.

    Add to this the fact that if you are using integer arithmetic, you have to shift-adjust your product after multiplication. This creates a dependent operation that prevents pipelining.

    Complicated proofs are proofs of confusion.
  32. Not exactly sure on the Athlon but the Pentium 3 optimization guide states that average imul latency is 4 cycles vs average fmul latency of 5, with a throughput of 1/2 in the case of imul and 1/9 in the case of fmul:

    http://gcc.gnu.org/ml/gcc/2001-11/msg00205.html

    I'll look up some numbers on the Athlon as well.

    "We are Microsoft, resistance is futile." - Bill Gates, 2015.
  33. Please read
    <A HREF="http://developer.intel.com/design/pentium4/manuals/248966.htm" target="_new">Intel® Pentium® 4 Processor Optimization Reference Manual</A>
    and
    <A HREF="http://developer.intel.com/design/pentiumii/manuals/245127.htm" target="_new">Intel® Architecture Optimization Reference Manual</A>

    Sadly, Intel shows no timings for the PIII other than MMX, but here are some comparisons for the P4.

    add latency 0.5 throughput 0.5
    fadd latency 5 throughput 1

    imul latency 14-18 throughput 3-5
    fmul latency 7 throughput 2

    idiv latency 56-70 throughput 23
    fdiv latency 23-58 throughput 23-58

    So the P4 kicks ass in adding but loses in multiplying, which is much more important to graphics. Floating point still wins hands down.
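
To see what the latency and throughput columns mean for a run of multiplies, here is a rough cost model. It assumes the "throughput" figure is the number of cycles between issuing independent instructions of the same kind, and it uses the lower-bound P4 numbers from the list above; it ignores everything else a real core does (other instructions, memory, scheduling), so treat the results as a back-of-envelope estimate only.

```python
def independent_cycles(n, latency, issue_interval):
    """n operations with no data dependencies: the pipeline overlaps them."""
    return latency + (n - 1) * issue_interval


def dependent_cycles(n, latency):
    """n operations where each needs the previous result: no overlap is possible."""
    return n * latency


# (latency, issue interval) taken from the P4 figures quoted above, lower bounds.
P4_IMUL = (14, 3)
P4_FMUL = (7, 2)

N = 100
print(independent_cycles(N, *P4_IMUL))   # 311
print(independent_cycles(N, *P4_FMUL))   # 205
print(dependent_cycles(N, P4_IMUL[0]))   # 1400
print(dependent_cycles(N, P4_FMUL[0]))   # 700
```

Under this model, the gap between the pipelined and the dependent case is the cost of the serial dependency the posts above are arguing about.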

    I know for a fact that the PIII is no magic monster. I've seen the docs in the past, before they moved the timings to the P4, and they were equal to or worse than the Athlon's.

    Just so you understand fixed point arithmetic.

    0x00000fff x 0x00000fff = 0x00ffe001
    0x00ffe001 >> 8 = 0x0000ffe0

    You multiply then you shift.
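
The same multiply-then-shift step as a small sketch, assuming a format with 8 fractional bits (which is what the `>> 8` above implies). The helper names are made up for the example. Python integers never overflow, so the sketch skips one real-world detail: on a 32-bit machine the intermediate product needs a wider (64-bit) register before the shift.

```python
FRAC_BITS = 8            # number of fractional bits, matching the ">> 8" above
ONE = 1 << FRAC_BITS     # fixed-point representation of 1.0 (0x100)


def to_fixed(x):
    """Convert a float to fixed point (rounded to the nearest step of 1/256)."""
    return int(round(x * ONE))


def to_float(f):
    """Convert a fixed-point value back to a float for display."""
    return f / ONE


def fixed_mul(a, b):
    """Multiply, then shift the product back down to FRAC_BITS fractional bits."""
    return (a * b) >> FRAC_BITS


a = 0x00000FFF
print(hex(a * a))            # 0xffe001  (raw product, 16 fractional bits)
print(hex(fixed_mul(a, a)))  # 0xffe0    (shift-adjusted result)
print(to_float(a), to_float(fixed_mul(a, a)))  # 15.99609375 255.875
```

The shift has to wait for the multiply to finish, which is the dependent operation mentioned earlier in the thread.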

    Complicated proofs are proofs of confusion.