Archived from groups: comp.sys.hp.hardware,linux.redhat
Hi all
I have a stability problem with one of the servers at work, and I really
hope some of you have an idea of what may be wrong.
The symptom: in about one minute, the load goes from less than 1 to more
than 50, and the system becomes unavailable.
I noticed this because I had top running on the terminal, and it was still
running when I got to the server room, showing a load of 140 and 300
processes running (normal is about 120). It did not look like any one
process was looping; the processes at the top of the list were "normal"
processes with low %CPU.
The machine responds to ping requests, but does not display welcome messages
on other ports when I telnet in. On the terminal it responds to 'date'
but hangs forever on commands such as 'tail /var/log/messages' or
'shutdown -r now'.
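
A minimal sketch of how the load and the per-state process counts could be
logged while this is happening, assuming a reasonably recent Python 3 is
available on the machine and /proc is readable. The function names are made
up for illustration. On Linux the load average counts processes in
uninterruptible sleep (state D) as well as runnable ones, so a high load with
low %CPU may show up here as a large D count.

```python
import glob
from collections import Counter


def parse_loadavg(text):
    """Return the 1-, 5- and 15-minute load averages from /proc/loadavg text."""
    one, five, fifteen = text.split()[:3]
    return float(one), float(five), float(fifteen)


def state_of(stat_text):
    """Return the one-letter process state from /proc/<pid>/stat text."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so take the first field after the last ')'.
    return stat_text[stat_text.rfind(")") + 1:].split()[0]


def count_states(stat_texts):
    """Count processes per state letter (R, S, D, Z, ...)."""
    return Counter(state_of(t) for t in stat_texts)


def snapshot():
    """Read the current load averages and process state counts from /proc."""
    with open("/proc/loadavg") as f:
        load = parse_loadavg(f.read())
    stats = []
    for path in glob.glob("/proc/[0-9]*/stat"):
        try:
            with open(path) as f:
                stats.append(f.read())
        except OSError:
            continue  # the process exited between listing and reading
    return load, count_states(stats)


if __name__ == "__main__":
    load, states = snapshot()
    print(load, dict(states))
```

Run from cron or in a loop with the output appended to a file, this would
leave a record of the minutes before the machine stops responding.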
The machine is an HP DL380 G3 with 2 x 3 GHz CPUs, 2 GB RAM, and 6 x 36 GB
disks on a Smart Array 641 controller in a RAID 5 configuration with a hot
spare. We are running qmail with qmail-scanner, spamassassin etc.
We are running RHEL 3 ES (Red Hat Enterprise Linux 3 ES). We had serious HD
performance problems with stock kernels as well as 2.4.21-9 and 2.4.21-15.
Therefore we compiled a new 2.6.7 kernel from kernel.org. The kernel has
built-in support for Smart Array controllers and booted fine, with 4x the HD
speed. We are using the tg3 network driver.