Athlon XP 1800 maths problems

Anonymous
January 15, 2002 8:28:07 PM

I recently ran a program called lucifer (version 1.0) on an Athlon XP 1800 system with an EPOX 8KHA+ board and have been getting strange results. (The operating system is Linux with a 2.4 kernel, Suse 7.3.)

Occasionally I get errors in the multiplication check (I need to let it run for a few hours). I have run the same program on Suse 7.3 (same kernel, etc.) on a P3 system, and I do not get these problems...

Any ideas? Processor problem?

Below is an extract of the code...
As I said, it runs flawlessly on a P3 system.

Andrew

---



double a, b;
long double ans;
// get some random floating point numbers
a = (double)rand(); /// (double)(rand() + 1) + (double)rand();
b = (double)rand(); /// (double)(rand() + 1); //+ (double)rand();
// get some new, smaller, random floating point numbers
a = (double)((float)a);
b = (double)((float)b);

// multiply
ans = (long double)(a * b);

if ((ans / b) != a) {
    error++;
    printf("Error! Multiplied %f with %f and got %Lg\n", a, b, ans);
}




Edited by andrew001 on 01/15/02 05:31 PM.
January 15, 2002 9:38:38 PM

Could you give us some sample output from your application on the Athlon processor? I would like to see what numbers it is coming up with that are incorrect.

-Raystonn


= The views stated herein are my personal views, and not necessarily the views of my employer. =
January 15, 2002 10:39:00 PM

You know Raystonn, I'm beginning to worry about you. You skip several threads discussing various subjects, and post in the one where it looks like there might be a problem with an Athlon.

Quarter Pounder Inside
January 15, 2002 10:43:35 PM

Quote:
You know Raystonn, I'm beginning to worry about you. You skip several threads discussing various subjects, and post in the one where it looks like there might be a problem with an Athlon.

I am curious here what the issue could be. I do not read every thread on this board. In particular, I usually skip those without good subject-lines and with flamebait subject-lines. To which threads were you referring?

-Raystonn


= The views stated herein are my personal views, and not necessarily the views of my employer. =
January 15, 2002 10:43:50 PM

::nods:: burger's right.



"The Cash Left In My Pocket,The BEST Benchmark"
No Overclock+stock hsf=GOOD!
January 15, 2002 10:48:34 PM

Do you have anything to actually add to this discussion or are you simply replying to every post that tries to put me in a bad light with a "I agree!" comment? Is there something wrong with the original poster's question that I should not have replied? Seriously, if you want to flame me, go right ahead. But do it in a dedicated thread so I do not have to waste time reading it.

-Raystonn


= The views stated herein are my personal views, and not necessarily the views of my employer. =
January 15, 2002 10:48:36 PM

Probably an overclocked processor, an overheating processor or bad RAM, I'm not sure which. The same thing happens in Prime95 when my processor or RAM is too highly overclocked. This will happen on any system, Intel or AMD.

AMD technology + Intel technology = Intel/AMD Pentathlon IV; the ULTIMATE PC processor
January 15, 2002 10:56:23 PM

Honestly, you can't expect someone to view all threads impartially. Personally I now skip most of the 'my athlon is dead' threads because I have seen it too much on this board and am tired of giving the same advice over and over...

The devil's advocate
January 15, 2002 11:02:01 PM

I just think it's interesting that you've left my thread about Intel's datasheet alone for so long. I was expecting some insight from you.
And some other threads, but I can't remember the exact subjects, so I'll point them out later.

I don't know, it just strikes me as odd that you've left certain threads alone. Could just be me.

Quarter Pounder Inside
January 15, 2002 11:08:31 PM

hehe, advice....

well, if it'll make you feel any better, my next upgrade will be a Northwood, but I'm waiting for the 3GHz version, maybe later this year.

AMD technology + Intel technology = Intel/AMD Pentathlon IV; the ULTIMATE PC processor
January 15, 2002 11:16:28 PM

Quote:
I just think it's interesting that you've left my thread about Intel's datasheet alone for so long. I was expecting some insight from you.

I do not see any threads with a subject-line mentioning an Intel datasheet. If I had, I would have read it.


Quote:
I don't know, it just strikes me as odd that you've left certain threads alone. Could just be me.

Well, as I said, I always leave alone any thread that has a flame-type subject or a non-descriptive subject. For example, there was one named "OMFG!!! LOL - Read This!", another named "$250 question.......", and I seem to recall one named either "Wtf" or "What the" or something. I usually skip those as I have limited time and cannot read through everything. I am partial to those with good descriptions.

Of course the flame posts are definitely skipped, as I do not participate in those. Some examples of flame-subjects would be "Intel a fragile POS?", "AMD is for POOR people", etc. I simply do not have the time to read through threads that have a greater-than-average chance of being entirely useless.

-Raystonn




= The views stated herein are my personal views, and not necessarily the views of my employer. =
January 15, 2002 11:19:56 PM

Quote:

Well, as I said, I always leave alone any thread that has a flame-type subject or a non-descriptive subject. For example, there was one named "OMFG!!! LOL - Read This!", another named "$250 question.......", and I seem to recall one named either "Wtf" or "What the" or something. I usually skip those as I have limited time and cannot read through everything. I am partial to those with good descriptions.

Of course the flame posts are definitely skipped, as I do not participate in those. Some examples of flame-subjects would be "Intel a fragile POS?", "AMD is for POOR people", etc. I simply do not have the time to read through threads that have a greater-than-average chance of being entirely useless.

Amen!

I salute you for your maturity!

AMD technology + Intel technology = Intel/AMD Pentathlon IV; the ULTIMATE PC processor
January 16, 2002 2:12:56 AM

I don't have an answer. Been looking for a pair of numbers that would cause your test to fail but I can't find any. I don't have Linux either.

1) Do you use the same starting seed for rand() when running the test case on both platforms?

2) Do you run the test case on the P3 for a long enough period of time? The Athlon 1800+ executes code faster than the Pentium 3.

3) Is it always the case that the relation ((a * b) / a) == b holds, where (a * b) is a long double result and both a and b are doubles?

When you find a pair of random numbers that causes the example to fail on the Athlon 1800+ platform, print them out and substitute those numbers into the calculation on the P3. (Don't start from the beginning on the P3; that takes too much time.)
January 16, 2002 2:25:45 AM

heh, well I'd get an amd setup if there were any that were proven and stable, and if they had any boards that were as solid as the intel 850 based ones. I care more about the board than the processor (of course thermal protection is a must also)

The devil's advocate
January 16, 2002 3:19:45 AM

Hmmm ok, I just cut-and-pasted the snippet code, put it into a big working program, and tested it here. I've tested it on a T-bird 1.33, a dual PII-450, a P3-750, and a dual PentiumPRO-200. None of them produced any errors at all.

What it looks like this code does is it tests the accuracy of a floating-point unit when converting types and reversing operations. Below you will find the code I constructed from this bit.

<pre>#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <time.h>   /* for time(), used to seed rand() */

int main(int argc, char **argv)
{
    double a;        /* The first term, a 64-bit FP. */
    double b;        /* The second term, a 64-bit FP. */
    long double ans; /* The result, an 80-bit FP. */
    int errors=0;
    int i;
    int iterations;
    char *ctmp;

    /* Seed the random number generator. */
    srand((unsigned int)(time(NULL)));

    /* Get iterations first. */
    switch(argc) {
    case 1:
        iterations=500;
        break;

    case 2:
        iterations=strtol(argv[1], &ctmp, 0);

        if (*(argv[1])=='\0' || *ctmp!='\0') {
            fprintf(stderr, "FATAL: invalid parameter '%s'\n", argv[1]);
            fprintf(stderr, "USAGE: %s [iterations]\n", *argv);
            return 1;
        }

        break;

    default:
        fprintf(stderr, "FATAL: too many parameters.\n");
        fprintf(stderr, "USAGE: %s [iterations]\n", *argv);
        return 1;
    }

    for (i=0; i<iterations; i++) {
        /* get some random floating point numbers */
        a=(double)rand(); /* /// (double)(rand()+1) + (double)rand(); */
        b=(double)rand(); /* /// (double)(rand()+1); //+ (double)rand(); */

        /* get some new, smaller, random floating point numbers */
        a=(double)((float)a);
        b=(double)((float)b);

        /* multiply */
        ans = (long double)(a * b);

        if ((ans/b)!=a) {
            errors++;
            fprintf(stderr, "Error! Multiplied %f with %f and got %Lg\n", a, b, ans);
            /* (ans/b)-a is a long double, so it needs %Lf rather than %f */
            fprintf(stderr, "Error margin is %Lf.\n", (ans/b)-a);
            return 0;
        }
    }

    printf("Performed %d iterations, got %d errors.\n", iterations, errors);
    return 0;
}
</pre><p>I compiled and ran it with the following commands on my god box (the T-bird 1.33GHz):

<pre>[ kelledin@Valhalla ~ ] # gcc -O3 -mcpu=i686 -march=i686 -o junk junk.c
[ kelledin@Valhalla ~ ] # ./junk 50000000
Performed 50000000 iterations, got 0 errors.
</pre><p>I got identical results on the other boxen. Interestingly enough, if I add "-ffast-math" and "-ffloat-store" to the compiler optimizations, it still doesn't even generate one error ("-ffast-math" causes GCC to slightly violate ANSI rules for the sake of speed, and "-ffloat-store" causes GCC to keep fp variables in memory and possibly sacrifice enhanced accuracy offered by storing in the FP registers).

I'm using gcc 2.95.3, slightly patched to fix a weak symbol exporting bug. Anything older is generally more buggy, gcc-2.96 is not an official release of gcc, and gcc 3.x is b0rken. Thus, gcc 2.95.3 is the compiler of choice. What compiler are you using?

If a server crashes in a server farm and no one pings it, does it still cost four figures to fix?
January 16, 2002 3:26:45 AM

Hmph, and I just spotted an error in my own code. It would have died noisily if it had spotted even one FP inaccuracy, but that's hopefully the only side-effect of the error.

I made a few changes to reflect what I <i>think</i> the code originally looked like (before you commented out rand() functions?)

The new code is below.

<pre>#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <time.h>   /* for time(), used to seed rand() */

int main(int argc, char **argv)
{
    double a;        /* The first term, a 64-bit FP. */
    double b;        /* The second term, a 64-bit FP. */
    long double ans; /* The result, an 80-bit FP. */
    int errors=0;
    int i;
    int iterations;
    char *ctmp;

    /* Seed the random number generator. */
    srand((unsigned int)(time(NULL)));

    /* Get iterations first. */
    switch(argc) {
    case 1:
        iterations=500;
        break;

    case 2:
        iterations=strtol(argv[1], &ctmp, 0);

        if (*(argv[1])=='\0' || *ctmp!='\0') {
            fprintf(stderr, "FATAL: invalid parameter '%s'\n", argv[1]);
            fprintf(stderr, "USAGE: %s [iterations]\n", *argv);
            return 1;
        }

        break;

    default:
        fprintf(stderr, "FATAL: too many parameters.\n");
        fprintf(stderr, "USAGE: %s [iterations]\n", *argv);
        return 1;
    }

    for (i=0; i<iterations; i++) {
        /* get some random floating point numbers */
        a=(double)(rand()+1)+(double)rand();
        b=(double)(rand()+1)+(double)rand();

        /* get some new, smaller, random floating point numbers */
        a=(double)((float)a);
        b=(double)((float)b);

        /* multiply */
        ans = (long double)(a * b);

        if ((ans/b)!=a) {
            errors++;
            fprintf(stderr, "Error! Multiplied %f with %f and got %Lg\n", a, b, ans);
            /* (ans/b)-a is a long double, so it needs %Lf rather than %f */
            fprintf(stderr, "Error margin is %Lf.\n", (ans/b)-a);
        }
    }

    printf("Performed %d iterations, got %d errors.\n", iterations, errors);
    return 0;
}
</pre><p>Same results on all systems.

If a server crashes in a server farm and no one pings it, does it still cost four figures to fix?
January 16, 2002 3:49:00 AM

AMD CPUs are notorious for being fragile and overly hot, but aside from that you might be on to something! Send your findings to AMD (don't accept any hush-hush money!) so they can verify it (don't give them your CPU), like Tom did with the Pentium 1.13.

keep us updated, this is very interesting.

"<b>AMD/VIA!</b>...you are <i>still</i> the weakest link, good bye!"
January 16, 2002 4:47:09 AM

ROFL, you Intel fanatics never cease to amuse me.
Quote:
don't accept any hush-hush money

Haha. I sometimes wonder what in your life's experiences has caused you to devote loads of time bashing not only AMD processors but AMD in general while at the same time going out of your way to praise everything under the sun that is Intel. Just a serious question that I know many people ask and never get a forthright response (not that I expect one here anyway).

Hard work often pays off in time, but laziness always pays off now.
Anonymous
January 16, 2002 7:13:07 AM

Hi all, sorry about the stress; the problem seems to be solved...
The processor isn't overclocked, and this time it's NOT a memory error!

The P3 shows the same effect; I just didn't wait long enough for it to appear (it's a lot slower than the AMD and can't do as many tests per hour). Looks like a rounding error. (I.e., I should either ALWAYS get the error message or never, but I get it only every now and then for the SAME numbers... weird!)


Sample Output
P3
Error! Multiplied 159020160.000000 with 0.000000 and got 0
Error! Multiplied 159020160.000000 with 0.000000 and got 0
Error! Multiplied 159020160.000000 with 0.000000 and got 0
Error! Multiplied 159020160.000000 with 0.000000 and got 0
Error! Multiplied 1559592960.000000 with 0.000000 and got 0
Error! Multiplied 1559592960.000000 with 0.000000 and got 0
Error! Multiplied 1559592960.000000 with 0.000000 and got 0
Error! Multiplied 1559592960.000000 with 0.000000 and got 0

AMD XP 1800
Error! Multiplied 159020160.000000 with 0.000000 and got 0
Error! Multiplied 159020160.000000 with 0.000000 and got 0
Error! Multiplied 159020160.000000 with 0.000000 and got 0
Error! Multiplied 159020160.000000 with 0.000000 and got 0
Error! Multiplied 159020160.000000 with 0.000000 and got 0
Error! Multiplied 159020160.000000 with 0.000000 and got 0
Error! Multiplied 159020160.000000 with 0.000000 and got 0
Error! Multiplied 1559592960.000000 with 0.000000 and got 0
Error! Multiplied 1559592960.000000 with 0.000000 and got 0
Error! Multiplied 1559592960.000000 with 0.000000 and got 0
Error! Multiplied 1559592960.000000 with 0.000000 and got 0
Error! Multiplied 1559592960.000000 with 0.000000 and got 0
Error! Multiplied 1559592960.000000 with 0.000000 and got 0
Error! Multiplied 1559592960.000000 with 0.000000 and got 0
Anonymous
January 16, 2002 7:28:39 AM

Problem duplicated on the P3... it seems to be an issue with rounding on very small numbers (the P3 does it as well).
I am actually running the original code; I just cut out some lines for the posting that referred to the problem (pretty dumb if I missed a few).

So OBVIOUSLY this is NOT an AMD issue.

I was using this program to test the box a little, as we had been having some problems with kernel crashes, etc. I originally also discovered that there were memory problems, and after replacing the memory with GOOD working memory, discovered that lucifer was STILL reporting errors.
-> That's when I started getting REALLY worried.

But thankfully, it just looks like the implementation of the 'Multiplication TEST' is flawed.



Just for the Record...

Unfortunately, I don't have access to the SUSE system with the AMD at the moment, but an AMD XP 1800 RH6.2 system showed exactly the same results. Here is the output from gcc -v:

I was using the supplied make files to compile the program.

AMD XP 1800 - Redhat 6.2 (no patches)
> gcc -v
Reading specs from /usr/lib/gcc-lib/i386-redhat-linux/egcs-2.91.66/specs
gcc version egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)


P3-500 Suse 7.3 (No patches)
> gcc -v
Reading specs from /usr/lib/gcc-lib/i486-suse-linux/2.95.3/specs
gcc version 2.95.3 20010315 (SuSE)

Thanks for your help...

Andrew
January 16, 2002 10:00:28 AM

amdmeltdown and intelinside... you see how shaky your brains are... you always blame ***AMD*** because of the unstable logical units in your heads.
Your attitude was another proof of being useless people over here.
All users are really getting bored of your posts and no one is taking your silly comments into consideration.

The guy found out: "The P3 shows the same effect; I just didn't wait long enough for it to appear (it's a lot slower than the AMD and can't do as many tests per hour). Looks like a rounding error. (I.e., I should either ALWAYS get the error message or never, but I get it only every now and then for the SAME numbers... weird!)"

wish if there was UnDo in the life
January 16, 2002 3:34:34 PM

Could you change the %f in the format statements to %e so we can see the actual normalized numbers? When I see 0.000000 it means a very small number, but I don't know how small.

I.e. change printf("output %f", var); to printf("output %e", var);
January 16, 2002 4:13:53 PM

Now that I look back at the code it is definitely a mathematical pitfall.

The code

if ((ans / b) != a){

is flawed because ans is an 80-bit long double, so the comparison mixes long double and double precision.

The compiler loads ans, which has approximately 19 decimal digits of precision (a 64-bit mantissa). It then loads b, which has approximately 15 decimal digits of precision (a 53-bit mantissa, of which 52 bits are stored). All numbers are held in the FP unit in 80-bit form. The division is performed and the result is kept as an 80-bit long double. The variable a is then loaded and compared against this 80-bit number, so there is a roundoff of about 11 bits, or roughly 3 to 4 decimal places. Any processor and compiler combination that evaluates this expression in 80-bit extended precision, such as the x87 FPU in both the Athlon and the P3, can show this behavior.

This should avoid the error:

if ((((double)ans) / b) != a) {
January 16, 2002 5:20:35 PM

I agree with burger, and that's on topic (a reply is on topic if it applies to the post replied to), so don't whine about my opinion, ray.

"The Cash Left In My Pocket,The BEST Benchmark"
No Overclock+stock hsf=GOOD!