MadCat

Distinguished
Jan 6, 2001
230
0
18,680
Found a few good links about MMX/SSE/SSE2 technology:

http://space.virgilio.it/insignis@tin.it/tommesani/Docs.html
http://www.abo.fi/~mats/codeopt/MMX.pdf
http://www.intel.com/design/Pentium4/manuals/24547004.pdf
http://www.psc.edu/general/software/packages/ieee/ieee.html
Edited by MadCat on 01/17/02 07:19 PM.
 

eden

Champion
All right, I'm no expert on this at all, but skimming through it I read about SSE 2 having floating point support. Now wouldn't that be the reason why SSE 2 is not so powerful on P4s until higher speeds above the AXPs, due to low FPU power?
I know nothing of this but I'd like to be corrected and learn more!

--
The other day I heard an explosion from the other side of town.... It was a 486 booting up...
 

MadCat

"... Now wouldn't that be the reason why SSE 2 is not so powerful on P4s until higher speeds above the AXPs, due to low FPU power? ..."

No, the P4's x87 FPU instructions and registers are separate from the SSE2 ones. SSE2 can provide up to a 2x speedup (two 64-bit results in parallel) or up to a 4x speedup (four 32-bit results in parallel) relative to the x87 FPU (one 80-bit result at a time) when floating point calculations can be vectorized and reduced precision can be justified. The actual speed is implementation dependent.
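
To make "vectorized" concrete, here is a minimal sketch in C using the compiler intrinsics from `<emmintrin.h>`. The function names `add_floats` and `add_doubles` are made up for illustration, and the `__SSE2__` check assumes a GCC or Clang style compiler; without it the code falls back to a plain scalar loop. It shows the shape of the 4-wide and 2-wide operations, not how fast they run.

```c
#include <assert.h>
#include <stddef.h>
#if defined(__SSE2__)
#include <emmintrin.h>  /* SSE2 intrinsics; also pulls in the SSE ones */
#endif

/* Hypothetical helper: out[i] = a[i] + b[i], four 32-bit floats per step. */
void add_floats(const float *a, const float *b, float *out, size_t n)
{
    assert(a != NULL && b != NULL && out != NULL);
    size_t i = 0;
#if defined(__SSE2__)
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);            /* load 4 floats */
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb)); /* 4 adds at once */
    }
#endif
    for (; i < n; i++)                              /* scalar leftovers */
        out[i] = a[i] + b[i];
}

/* Hypothetical helper: same idea, two 64-bit doubles per step. */
void add_doubles(const double *a, const double *b, double *out, size_t n)
{
    assert(a != NULL && b != NULL && out != NULL);
    size_t i = 0;
#if defined(__SSE2__)
    for (; i + 2 <= n; i += 2) {
        __m128d va = _mm_loadu_pd(a + i);           /* load 2 doubles */
        __m128d vb = _mm_loadu_pd(b + i);
        _mm_storeu_pd(out + i, _mm_add_pd(va, vb)); /* 2 adds at once */
    }
#endif
    for (; i < n; i++)                              /* scalar leftovers */
        out[i] = a[i] + b[i];
}
```

Each 128-bit register holds either four floats or two doubles, which is where the 4x and 2x figures come from. Those are upper bounds per instruction, not measured speedups.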

Someone correct me if I'm wrong.
Edited by MadCat on 01/12/02 02:59 PM.
 

LoveGuRu

Distinguished
Sep 21, 2001
612
0
18,980
am i just too tired or is that post in Japanese?
gotta get some sleep.. =/

*******
*K.I.S.S*
*(k)eep (I)t (S)imple (S)tupid*
*******
 

LoveGuRu

oh, nothing to fix, i just said i was tired.. got back from base (army base) and haven't had much sleep since then..
i'll get back to you on that when i do :)

 

MadCat

Something I've noted recently: in C, float is generally 32 bits and double is generally 64 bits. The SSE2 specification supports both 32-bit and 64-bit floating point math. The Athlon XP does support SSE with 32-bit floats, for a potential 4x speedup. I would hope that vectorizing compilers take advantage of 32-bit floating point when SSE is available but SSE2 is not. I've read somewhere that the extended precision of the x87 FPU is usually wasted, because programs (C programs, for example) store their results in 32-bit or 64-bit format.
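
A small C sketch of both points, under some assumptions: the helper names are invented for illustration, and `long double` is only the 80-bit x87 format on some toolchains (GCC on x86 Linux, for example), so the mantissa width it reports will differ elsewhere. The first function checks the "general" sizes, and the last one shows extra precision being discarded when a value is stored as a double.

```c
#include <assert.h>
#include <float.h>
#include <limits.h>

/* Hypothetical helper: aborts unless float is 32 bits and double is 64,
   the "general" case described above. */
void check_common_sizes(void)
{
    assert(sizeof(float) * CHAR_BIT == 32);
    assert(sizeof(double) * CHAR_BIT == 64);
}

/* Mantissa bits of long double. Assumption: this is 64 only where
   long double maps to the x87 extended format. */
int long_double_mantissa_bits(void)
{
    return LDBL_MANT_DIG;
}

/* Returns 1 if storing x as a 64-bit double changes its value,
   i.e. the extra precision was thrown away by the store. */
int lost_on_store(long double x)
{
    volatile double d = (double)x;  /* volatile forces a real 64-bit store */
    return (long double)d != x;
}
```

For example, where `long double` is wider than `double`, `lost_on_store(1.0L + LDBL_EPSILON)` reports a loss, because the smallest step above 1.0 in the wider format does not fit in a double.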
 

lhgpoobaa

Illustrious
Dec 31, 2007
14,462
1
40,780
coming from a scientific background, i prefer the x87 FPU over SSE/SSE2, as it uses 80-bit floating point registers, and that means higher accuracy.
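
One place the wider format can matter is a running sum. The sketch below is only an illustration with made-up function names, and it assumes `long double` is at least as wide as `double` (it is the 80-bit x87 format only on some compilers). It adds the same 32-bit inputs with a narrow and a wide accumulator.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: accumulates in a 32-bit float. */
float sum_in_float(const float *x, size_t n)
{
    assert(x != NULL || n == 0);
    float s = 0.0f;
    for (size_t i = 0; i < n; i++)
        s += x[i];   /* small terms can be rounded away against a big sum */
    return s;
}

/* Hypothetical helper: same inputs, wider accumulator. */
long double sum_in_long_double(const float *x, size_t n)
{
    assert(x != NULL || n == 0);
    long double s = 0.0L;
    for (size_t i = 0; i < n; i++)
        s += x[i];   /* more mantissa bits keep the small terms */
    return s;
}
```

With 100000000.0f followed by a hundred 1.0f values, the exact total is 100000100. A 24-bit float mantissa cannot represent that number at all, while the wider accumulator holds it exactly.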



The lack of thermal protection on Athlon's is cunning way to stop morons from using AMD. :)