System unstable

igormorgado

Distinguished
Nov 17, 2008
8
0
18,510
Hi Everyone,

I don't know where else to go, and I don't even know if this is the right place to post this kind of question, but here goes.

I have a home-built system (it isn't my first one, btw; all my computers have been home-built except my first ones: TRS, Apple II, MSX and Amiga 500), so no, I'm not new to this world.

I run Debian GNU/Linux most of the time, but Windows XP for gaming.

Today I have a "good" setup. First, my setup:

MoBo: Asus P5N-E SLI
CPU: Intel Core 2 Quad Q6600
RAM: 3x1GB DDR2 UDIMM Kingston PC2-6400
Video: Nvidia 8600 GT and Nvidia 8500 GT (not in SLI mode)
Network: D-Link Wireless G510
Power supply: 550W Extreme
Disks: ST3320620AS and ST3500320AS

My problem:

My system is completely unstable. Sometimes it runs for days (3 or 4) without any problem (sometimes I don't even turn the computer off), but very often it hangs. On Linux I notice SATA errors; on Windows it sometimes reports BSOD KERNEL_IOPAGE_SWITCH errors. Once an error occurs, the errors become frequent and I have to turn the machine off and try again later (the next day). Then the system starts up and works for a while, until it happens again, and again.

What I have tried:

My first idea was a filesystem problem:

Tried chkdsk and fsck, nothing reported.

Then I tried:

Overheat:

Started a CPU/GPU heat monitor; my temp never goes above 50°C in normal usage.

Bad air flow:

I have installed some fans to help remove hot air at the back of my case.

Bad cooling:

I ran stress tests with CPUBURN for 4 hours. With an ambient temp of 27°C my CPU never reached 70°C; my max core temp was 68°C.

Memory:

I ran memtest86+ for 1 day with 0 errors.
I ran burnMMX for 4 hours with 0 errors.

Disks:

smartctl and SpeedFan report that my disks are OK, no errors at all.

SATA controller (I don't know how to test it):

As I said previously, sometimes Linux reports things like this:

kernel: ata6.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
kernel: ata6.00: cmd a0/00:00:00:00:20/00:00:00:00:00/a0 tag 0 cdb 0x0 data 0
kernel: res 40/00:03:00:00:20/00:00:00:00:00/a0 Emask 0x4 (timeout)
kernel: ata6: soft resetting port
kernel: ata6.00: configured for UDMA/33
kernel: ata6: EH complete
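
To see whether the errors always hit the same port, the exception lines can be counted per `ataN` port. The following is only a minimal sketch: the regular expression is written against the log excerpt above, and the function name `count_ata_exceptions` is made up for illustration.

```python
import re
from collections import Counter

# Matches libata exception lines such as:
#   kernel: ata6.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen
# The pattern is an assumption based on the excerpt above, not a full grammar.
EXCEPTION_RE = re.compile(r"(ata\d+)\.\d+: exception Emask (0x[0-9a-fA-F]+)")


def count_ata_exceptions(lines):
    """Return a Counter mapping port name (e.g. 'ata6') to exception count."""
    counts = Counter()
    for line in lines:
        match = EXCEPTION_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# The six lines quoted in the post, used as sample input.
sample = [
    "kernel: ata6.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2 frozen",
    "kernel: ata6.00: cmd a0/00:00:00:00:20/00:00:00:00:00/a0 tag 0 cdb 0x0 data 0",
    "kernel: res 40/00:03:00:00:20/00:00:00:00:00/a0 Emask 0x4 (timeout)",
    "kernel: ata6: soft resetting port",
    "kernel: ata6.00: configured for UDMA/33",
    "kernel: ata6: EH complete",
]

for port, n in count_ata_exceptions(sample).items():
    print(f"{port}: {n} exception(s)")
```

On a real system the same function could be fed the lines of the kernel log file (on Debian that is typically `/var/log/kern.log`, but treat that path as an assumption). If every exception lands on one port, the device and cable on that port are the first things to swap.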

The weird thing is, this is my SECOND board in 6 months with this SAME problem.

My previous one was an MSI P6N SLI V2.


Best Regards,
 

igormorgado

CPU-Z reported 11.97 V on the +12V line. Is that fair enough, or isn't it?

Can you give me some test suggestions to find the problem too?

Hardware monitor
-----------------------------------------------------

ITE IT87 hardware monitor

Voltage sensor 0 1.34 Volts [0x4F] (CPU VCORE)
Voltage sensor 1 3.26 Volts [0xCC] (+3.3V)
Voltage sensor 3 4.87 Volts [0xB5] (+5V)
Voltage sensor 4 11.97 Volts [0xBB] (+12V)
Voltage sensor 7 4.89 Volts [0xB6] (+5V VCCH)
Voltage sensor 8 3.12 Volts [0xC3] (VBAT)
Temperature sensor 0 44°C (111°F) [0x2C] (TMPIN0)
Temperature sensor 1 43°C (109°F) [0x2B] (TMPIN1)
Temperature sensor 2 25°C (76°F) [0x19] (TMPIN2)
Fan sensor 0 2606 RPM [0x103] (FANIN0)
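
As a rough sanity check, each rail reading can be compared with its nominal value. This is only a sketch: the ±5% window is an assumption (the tolerance commonly cited for the positive ATX rails), the helper names are made up for illustration, and a software sensor reading is approximate and says nothing about ripple or droop under load.

```python
# Readings copied from the hardware monitor dump above: (nominal, measured) in volts.
READINGS = {
    "+3.3V": (3.3, 3.26),
    "+5V": (5.0, 4.87),
    "+12V": (12.0, 11.97),
}

# Assumption: +/-5% window commonly cited for the positive ATX rails.
TOLERANCE = 0.05


def deviation(nominal, measured):
    """Relative deviation of a measured voltage from its nominal value."""
    return (measured - nominal) / nominal


def within_tolerance(nominal, measured, tolerance=TOLERANCE):
    """True if the measured value lies inside nominal +/- tolerance."""
    return abs(deviation(nominal, measured)) <= tolerance


for rail, (nominal, measured) in READINGS.items():
    dev = deviation(nominal, measured)
    status = "within" if within_tolerance(nominal, measured) else "outside"
    print(f"{rail}: {measured:.2f} V ({dev:+.2%}), {status} +/-{TOLERANCE:.0%}")
```

A reading inside the window at idle does not prove the PSU is healthy; it only rules out a gross steady-state error on that rail.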
 

igormorgado



AFAIK, it's not dual channel at all. I lost the 4th stick. I bought another one but it doesn't work properly, so I'm staying at 3 GB =D
 
Your hardware monitor is not accurate for voltage testing and does not measure Vdroop or ripple.

About 25% of all computer problems come from failed or poor quality PSUs.
When all tests are good and yet you cannot resolve the problem, that percentage jumps up to around 80%.

When you run memtest you are using a very small amount of power and not stressing the PSU as you do in the OS... hence the stability during that test.
 

igormorgado



That is true; that's why I ran burnMMX (memory tests with CPU stress).

I have also tried a Linux kernel compilation loop, with success.

I will try to buy a new PSU. (My wife will kill me =)
 
Yeah sorry, but it's my best guess and you have tested everything you can.

If it's not that, it's certainly the MB, but given the no-brand nature of your current PSU, it's very likely to be the PSU.

I suggest an Antec Earthwatts if you are on a budget, at least the 500W. The EA650W is good too.

Any Corsair PSU will be a safe bet, as would PC Power and Cooling, and Seasonic.

Explain to your wife that by buying a high quality efficient PSU you will actually save money within a year on the power bill.
 

igormorgado

My bad, it isn't Extreme, it's Extream. It's a Brazilian brand. Btw, it isn't one of those poor-quality Chinese PSUs; one of those costs 15 US$ here, and mine cost something near 150 US$.

Here in Brazil they are known as good quality, and I haven't seen any bad reviews. As I was saying, I'm doing CPU stress tests here and I don't notice any change in the power supply readings. I will send the logs after 1 hour.

 
From what I can tell, Extream uses Super Flower OEM PSUs. These are generally considered to be poor quality.

However, they only seem to be sold in Spanish speaking countries, and my Spanish is pretty bad :)

Really, a manufacturer's claimed specs mean very little. You need to look at quality reviews done by experts with the right equipment.

I know that XION uses Super Flower OEM, and XION is considered a tier 5 unit:
http://www.eggxpert.com/forums/thread/323050.aspx

A shaky connection, I know, but the best I can do.

The units I listed are all very high quality. I sincerely wish I did not have to buy PSUs made in China, but that's where the good ones come from.

Well, maybe not Zippy... do you have Zippy in Brazil?
 
Your core temps are a bit high, though not quite in the dangerous range. If your room is hot, I would say OK; otherwise you might need a better CPU cooler.

You really cannot get accurate voltages from any application. I note that you have some variable voltage on the 12V line, but it's hard to say much from that.