Tyan S2882 lockups

Guest
Archived from groups: alt.comp.periphs.mainboard.tyan

I upgraded one of my main servers to the Tyan S2882 and Opteron 246s
a few months back. Since then, I have noticed semi-random system
lockups running Red Hat Enterprise Linux WS release 3 (Taroon Update
3) with the 2.4.21-20.ELsmp kernel. The lockups seem to occur around
times of high file system I/O, i.e. backups, tars, etc. The
symptoms are: I can ping the machine, and I can even portscan the
machine and see the active TCP ports, but I can't ssh (or anything
else) into the machine, and the console is completely locked. The
machine was running an Adaptec SCSI PCI-X card out the back until I
suspected that was the problem. Since then I have removed the card
and replaced the RAID it ran with two internal 400 GB SATA
drives (sata_sil). The lockups still happen with high I/O on the
filesystem. Strange thing is, I get NOTHING in syslog!!! Nothing at
all. In fact, when I hard-power-cycle the machine while it is hung,
then run `last` once it is back up, I see the reboot I did. It's like
the machine is in some funky pseudo-up state... I've tried new memory
with no success either.
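
For what it's worth, "pingable, ports open, but no ssh" is consistent with the kernel still answering on the network while userspace is stuck: the kernel completes TCP handshakes for listening sockets by itself, but sshd has to be scheduled to send its banner. A rough Python sketch of a probe that tells those two cases apart from another box follows. The function name `probe` and the result strings are made up for illustration, not part of any existing tool.

```python
import socket
import sys


def probe(host, port=22, timeout=5.0):
    """Classify how far a TCP service gets (hypothetical helper).

    Returns one of:
      "refused"            - nothing is listening on the port
      "unreachable"        - connect timed out or failed some other way
      "accepted-no-banner" - handshake completed, but no data arrived
      "closed"             - peer closed the connection without sending data
      "banner"             - the service sent data, so userspace is alive
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except ConnectionRefusedError:
        return "refused"
    except OSError:  # includes connect timeouts
        return "unreachable"
    with sock:
        sock.settimeout(timeout)
        try:
            data = sock.recv(64)
        except socket.timeout:
            # The kernel accepted the connection, but the daemon never spoke.
            return "accepted-no-banner"
        except OSError:
            return "unreachable"
    return "banner" if data else "closed"


if __name__ == "__main__" and len(sys.argv) > 1:
    print(probe(sys.argv[1]))
```

If this reports "accepted-no-banner" during a hang, that would point at userspace (or the I/O it is waiting on) being blocked rather than the network stack, though it does not say why.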

I'm pulling my hair out at this point trying to figure out my next
step. Any help or ideas would be great.

diatonic_sequences@hotmail.com
 
Guest

Previously Diatonic <diatonic_sequences@hotmail.com> wrote:
> I'm pulling my hair out at this point trying to figure out my next
> step. Any help or ideas would be great.

Try a current kernel. SATA support in particular is pretty sparse
in its error reporting in older kernels. 2.4.21 is almost historic
by now.

If you are lucky, the problem goes away. If you are less lucky, you
at least get an error message.
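
In case nothing reaches syslog even then, a crude heartbeat can at least bracket the time of the hang. Below is a minimal Python sketch, assuming a Unix system where `os.getloadavg()` is available. The log path and the 10-second interval are arbitrary placeholders, and `write_heartbeat` is a name made up for this example.

```python
import os
import time


def write_heartbeat(path, now=None):
    """Append one timestamped line with the load averages and force it to disk."""
    now = time.time() if now is None else now
    load1, load5, load15 = os.getloadavg()
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))
    line = "%s load=%.2f/%.2f/%.2f\n" % (stamp, load1, load5, load15)
    with open(path, "a") as f:
        f.write(line)
        f.flush()
        os.fsync(f.fileno())  # make sure the line survives a hard reset
    return line


def run(path="/var/tmp/heartbeat.log", interval=10):
    """Write a heartbeat every `interval` seconds (placeholder defaults)."""
    while True:
        write_heartbeat(path)
        time.sleep(interval)
```

After a hard power cycle, the last line in the file gives the latest time the box could still write to disk. If the hang is in the I/O path, the `fsync` itself will block, which is still useful: the gap between the last line and the reset is when things went wrong.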

Arno
--
For email address: lastname AT tik DOT ee DOT ethz DOT ch
GnuPG: ID:1E25338F FP:0C30 5782 9D93 F785 E79C 0296 797F 6B50 1E25 338F
"The more corrupt the state, the more numerous the laws" - Tacitus
 
Guest

At one point, when the SCSI card was still in the machine, I did try
a 2.6-series kernel and got the same lockups. I will try
again with SATA. Any version recommendations?

Thanks


Arno Wagner <me@privacy.net> wrote in message news:<30res9F3458l7U1@uni-berlin.de>...
> Try a current kernel. SATA support in particular is pretty sparse
> in its error reporting in older kernels. 2.4.21 is almost historic
> by now.
>
> If you are lucky, the problem goes away. If you are less lucky, you
> at least get an error message.
>
> Arno