Northbridge/Capacitor bad?

salpta

Oct 3, 2009
I've got a real stumper here that I've spent two days trying to troubleshoot. Please help confirm my diagnosis before I have to tell the bloke he needs a new mainboard.

System:
eMachines T5230 (http://www.emachines.com/products/products.html?prod=T5230) with an nVidia GeForce 9600 GT in it, otherwise stock.

Symptoms:
The system periodically (every 15 sec to 2 min) experiences a 5-60 second "hang". Some hangs are severe enough to cause a reboot.

What I've done:
So far I've run a contact thermometer around the inside to check various temperatures, and nothing is seriously out of whack (northbridge heatsink, side of the CPU heatsink, back of the graphics card, HDD case, a couple of the RAM modules). The northbridge was ~50°C, but the problem still occurred when I put an old P2 fan on it.

I wanted to see whether this was a hardware issue or a software (Windows) issue, so yesterday I installed Ubuntu 9.04 on the machine. The install took *3 HOURS* to complete. The problem persisted.

The system had failed about a year ago, so the owner's "friend" reinstalled Windows and did it poorly (read: he used what I believe is a bootleg Vista CD, since the owner never could validate it). So today I reinstalled Windows from the OEM CD to see how a fresh install would behave, and the problem persists through a fresh install of Vista. It took *6 HOURS*.

All power rails are within spec, and I have reset the BIOS to the manufacturer's defaults, so I assume the memory timings are correct.

My hypothesis:
I assume the CPU is good since the system can boot, and watching the Linux system monitor I see activity on both cores. (During a "hang" one or the other of the cores will max out.) The HDD's SMART data doesn't report errors, so I've ruled the drive out. Memtest86+ did not find any errors. This just leaves me with a ghost in the machine somewhere on the mainboard itself. I've visually inspected the mainboard and can see no busted caps or other obvious signs of damage.

Am I on the right track? Please, I've had to walk away from this one with my tail between my legs for two days in a row now. =(
 
I would go into BIOS setup and snoop around anyway. See what the memory timings are, how the drives are set up, etc. This machine comes with 667 memory, and the default may be 800 (yeah, I know it shouldn't be). I also agree that PIO instead of DMA is at work here, given the HD loading times. eMachines are also notorious for using poor-quality power supplies.
 
Sorry for the delayed response, had other work taking priority over this box.

To answer the questions:
I re-checked the power rails and they're all putting out the appropriate voltages. I can't find my resistor bundle, but when I do I'll check amperage. I really don't expect to find anything here, as it seems this isn't the manufacturer's PSU, unless eMachines has taken to putting 600W Xilence PSUs in their boxes. I missed this the first time through.

Here are some HD diagnostics from Linux (the Windows side became totally unusable).
matt@matt-desktop:~$ mount
/dev/sda5 on / type ext4 (rw,relatime,errors=remount-ro)
tmpfs on /lib/init/rw type tmpfs (rw,nosuid,mode=0755)
proc on /proc type proc (rw,noexec,nosuid,nodev)
sysfs on /sys type sysfs (rw,noexec,nosuid,nodev)
varrun on /var/run type tmpfs (rw,nosuid,mode=0755)
varlock on /var/lock type tmpfs (rw,noexec,nosuid,nodev,mode=1777)
udev on /dev type tmpfs (rw,mode=0755)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=620)
fusectl on /sys/fs/fuse/connections type fusectl (rw)
lrm on /lib/modules/2.6.28-11-generic/volatile type tmpfs (rw,mode=755)
/dev/sda2 on /windows/c type fuseblk (rw,nosuid,nodev,allow_other,default_permissions,blksize=4096)
/dev/sda1 on /windows/recovery type fuseblk (rw,nosuid,nodev,allow_other,default_permissions,blksize=4096)
securityfs on /sys/kernel/security type securityfs (rw)

matt@matt-desktop:~$ sudo hdparm -Tt /dev/sda
[sudo] password for matt:

/dev/sda:
Timing cached reads: 1040 MB in 2.00 seconds = 519.55 MB/sec
Timing buffered disk reads: 170 MB in 3.02 seconds = 56.23 MB/sec

matt@matt-desktop:~$ sudo hdparm -i /dev/sda

/dev/sda:

Model=WDC WD2500JS-22NCB1 , FwRev=10.02E02,
SerialNo= WD-WCANKJ361518
Config={ HardSect NotMFM HdSw>15uSec SpinMotCtl Fixed DTR>5Mbs FmtGapReq }
RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=50
BuffType=unknown, BuffSize=8192kB, MaxMultSect=16, MultSect=?16?
CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=488397168
IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}
PIO modes: pio0 pio3 pio4
DMA modes: mdma0 mdma1 mdma2
UDMA modes: udma0 udma1 udma2 udma3 udma4 udma5 *udma6
AdvancedPM=no WriteCache=enabled
Drive conforms to: Unspecified: ATA/ATAPI-1,2,3,4,5,6,7

* signifies the current active mode
The reads are really low, easily 1/5th of what they should be, but the box is in UDMA mode (udma6 is the starred entry). The BIOS doesn't allow for hand-setting memory timings. Any other suggestions?
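
For anyone who wants to repeat the PIO-vs-DMA check without eyeballing the output, here is a rough Python sketch that pulls the starred (active) transfer mode and the MB/sec figures out of pasted hdparm text. The function names (active_modes, read_rates, uses_dma) are made up for illustration, and it assumes the output is laid out like the hdparm -i and hdparm -Tt listings in this post; other hdparm versions may format things differently.

```python
import re

# Sample text copied from the hdparm output in this post.
HDPARM_I = """\
PIO modes: pio0 pio3 pio4
DMA modes: mdma0 mdma1 mdma2
UDMA modes: udma0 udma1 udma2 udma3 udma4 udma5 *udma6
* signifies the current active mode
"""

HDPARM_T = """\
Timing cached reads: 1040 MB in 2.00 seconds = 519.55 MB/sec
Timing buffered disk reads: 170 MB in 3.02 seconds = 56.23 MB/sec
"""


def active_modes(hdparm_i_text):
    """Return every mode name that hdparm prefixed with '*'.

    The legend line ("* signifies ...") is skipped because the
    pattern requires a word character directly after the star.
    """
    return re.findall(r"\*(\w+)", hdparm_i_text)


def read_rates(hdparm_t_text):
    """Map each 'Timing <label>:' line to its reported MB/sec value."""
    pattern = re.compile(
        r"^\s*Timing (.+?):.*=\s*([\d.]+)\s*MB/sec", re.MULTILINE
    )
    return {label: float(rate) for label, rate in pattern.findall(hdparm_t_text)}


def uses_dma(modes):
    """True if any active mode is a DMA variant (mdma* or udma*)."""
    return any(m.startswith(("udma", "mdma")) for m in modes)


if __name__ == "__main__":
    modes = active_modes(HDPARM_I)
    print("active modes:", modes)
    print("DMA in use:", uses_dma(modes))
    for label, rate in read_rates(HDPARM_T).items():
        print(f"{label}: {rate} MB/sec")
```

To use it on a live box, the output of sudo hdparm -i /dev/sda and sudo hdparm -Tt /dev/sda could be saved to text files and passed to these functions instead of the sample strings.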