Closed

Tyan S4985 G3NR-SI + M4985-SI

Last response: in Motherboards
May 31, 2012 12:47:34 PM

Hello,

We have a workstation with a Tyan S4985 G3NR-SI. It runs 48 Opteron cores with 96 GB of RAM. It reboots randomly under intensive workloads. I've isolated the problem to the M4985-SI add-on board, since the system runs fine without it. I've also tested the memory with memtest86+ 4.00 and it comes out fine.

What other tests can I run to be sure where the problem is? The M4985-SI is not cheap and I would like to be certain before buying a replacement board. How can I test for CPU problems and identify which CPU is failing?


The system is running openSUSE 11.4.

Thanks for the help!


a c 1039 V Motherboard
May 31, 2012 10:51:48 PM

So you are talking about the CPU add-on board for the 8-socket solution? Have you considered that it might be a power issue when the board is installed?
June 1, 2012 12:42:09 PM

Yes, it's the 8-socket solution. The workstation has four redundant 1400 W power supplies.
a c 1039 V Motherboard
June 1, 2012 12:50:52 PM

So we can rule out a power issue, which means the card looks to be the problem.
June 1, 2012 1:42:23 PM

I'm trying to be certain if possible, since it's a pricey part. I wish it were the memory, but yesterday I swapped memory between the main board and the add-on board. I tested with the main board only and memtest86 ran for 24 hours without errors. So more and more it seems to be an add-on board issue.

Should I swap and test the CPUs? I believe CPU problems would be more obvious and detectable.

Best solution

a c 1039 V Motherboard
June 1, 2012 4:07:31 PM

It is the only test you have not tried, so I would say so.
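
If swapping CPUs is awkward, a rough software check on Linux is to pin the same deterministic workload to one core at a time and compare the results. The sketch below is only an illustration under assumptions: the function names (`burn`, `test_cores`, `suspects`) are made up for this post, the workload is a plain SHA-256 loop rather than a proper stress test, and it relies on `os.sched_setaffinity`, which is Linux-only. If the box reboots mid-run, the last "testing core N" line printed is the core that was busy at the time.

```python
import hashlib
import os
from collections import Counter


def burn(rounds: int) -> str:
    """Deterministic CPU-bound work: hash a buffer over and over."""
    data = b"tyan-s4985"  # arbitrary seed, any fixed bytes would do
    for _ in range(rounds):
        data = hashlib.sha256(data).digest()
    return data.hex()


def test_cores(rounds: int = 200_000) -> dict:
    """Run burn() pinned to each allowed core; return {core: digest}."""
    original = os.sched_getaffinity(0)  # cores this process may use
    results = {}
    try:
        for cpu in sorted(original):
            # Print before starting so a crash leaves a trace of the core.
            print(f"testing core {cpu}", flush=True)
            os.sched_setaffinity(0, {cpu})
            results[cpu] = burn(rounds)
    finally:
        os.sched_setaffinity(0, original)  # restore the original mask
    return results


def suspects(results: dict) -> list:
    """Cores whose digest differs from the majority digest."""
    majority, _ = Counter(results.values()).most_common(1)[0]
    return sorted(cpu for cpu, digest in results.items() if digest != majority)


if __name__ == "__main__":
    print("suspect cores:", suspects(test_cores()))
```

An empty suspect list only means this light workload found no miscomputation; it does not rule out a CPU that fails under heavier heat or power load. Core numbers can be mapped to sockets with `lscpu` or `/proc/cpuinfo` (the `physical id` field).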
June 11, 2012 1:04:57 PM

I downloaded a stress-test program called y-cruncher. It can run memory stress routines by computing pi to a very large number of digits. I ran y-cruncher in stress-test mode and was able to reproduce the problem: the system crashed and rebooted. This seems to confirm that the problem is faulty memory on the M4985-SI add-on board.

A few days ago I entered the BIOS setup, lowered the memory speed from 667 MHz to 533 MHz, and disabled the memory cache buffers. This lowered system performance somewhat, but made the system more stable. I was able to run the y-cruncher stress test and the system held up.

So I will be replacing the faulty memory soon.
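
For anyone trying to work out which bank is at fault before buying parts: on Linux, the kernel's EDAC subsystem exposes ECC error counters under `/sys/devices/system/edac/mc`, assuming an EDAC driver for the chipset is loaded (on Opterons that would normally be `amd64_edac`, but I have not verified it on this board). A minimal sketch that reads the corrected (`ce_count`) and uncorrected (`ue_count`) counters per memory controller; the function name `read_edac_counts` is just something made up for this example.

```python
from pathlib import Path

# Standard EDAC sysfs location; empty or missing if no EDAC driver is loaded.
EDAC_ROOT = Path("/sys/devices/system/edac/mc")


def read_edac_counts(root=EDAC_ROOT) -> dict:
    """Return {"mc0": {"ce_count": int|None, "ue_count": int|None}, ...}."""
    root = Path(root)
    counts = {}
    if not root.is_dir():
        return counts  # EDAC not available on this system
    for mc in sorted(root.glob("mc[0-9]*")):
        entry = {}
        for name in ("ce_count", "ue_count"):
            try:
                entry[name] = int((mc / name).read_text().strip())
            except (OSError, ValueError):
                entry[name] = None  # counter missing or unreadable
        counts[mc.name] = entry
    return counts


if __name__ == "__main__":
    for mc, entry in read_edac_counts().items():
        print(mc, entry)
```

Running it before and after a stress run and comparing the numbers shows which controller is accumulating errors. On these systems each CPU socket has its own memory controller, so a rising count on one `mcN` would typically point at the DIMMs attached to that socket.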

June 18, 2012 12:16:37 PM

Best answer selected by harry_pr.
a c 328 V Motherboard
June 18, 2012 7:40:59 PM

This topic has been closed by Nikorr