Very, very strange problem

Hi all, I've had a very strange problem with my computer for a few years now (yeah, I know, probably should have put more effort into fixing it by now). I have scoured the internet for solutions, but simply can't find any. I also consider myself somewhat adept at troubleshooting, but I am at a complete loss. Here is my problem:

My computer, randomly, turns itself off. When it does, it tries to restart itself after a few seconds. I use the word "tries" because it will try to reboot itself even if I have unplugged it (it will come on ever so briefly using the electricity stored in the capacitors). This indicates to me that it is probably trying to reboot, as opposed to shut down, but I could be wrong.

Now, sometimes my computer will run fine for weeks on end. Other times, it will not even make it to the Windows startup screen and will continually reboot for hours until I unplug it (and it takes several days of fiddling before it is semi-stable again). Generally, when it is in one of its cycles of doom, the power button on my case is completely unresponsive.

It gets weirder. When it is in the worst shape and I finally bring it back, sometimes the clock multiplier in the BIOS has changed itself to a lower setting (and, surprisingly, it gives me an overclocked warning when it does the memory test at the beginning of the boot process).

I think the culprit is most likely the motherboard or the processor. It may also be the RAM, but I highly doubt it (I've done some not completely comprehensive mem tests with no errors). I know for a fact it isn't the PSU (because I replaced it), and something in the back of my mind tells me the power button on the case might be faulty.

I really can't tell, because I have not noticed any patterns related to when it is stable and when it is not. The only one is that it is generally more stable when it is cooler, but I'm not even sure if that is entirely true. As for overclocking, I've run with aggressive BIOS settings in the past, but nothing that really qualifies as overclocking. I have been running with factory settings for a long time though, and that seems to have improved it a little. What really gets me is why the computer, in general, still works.

Anyway, does anyone have any ideas? This is a real headscratcher if you ask me, but I would be curious if anyone else has ever had this problem. Is there a fix, or a part I should replace?
  1. What are your specs? Did you check your temps?
  2. When I see this at work I check

    1) How dirty CPU heatsink is, then I re-grease the heatsink while I'm there
    2) Temporarily hook up a different PSU
    3) Swap out CPU if I have a spare lying around
    4) Change mobo

    I run sse3dnow to check for CPU errors, memtest86+ for RAM issues, and run cpuburn to heat up chip and check for stability under load.
  3. Specs (probably should have posted these to begin with):

    AMD Athlon XP 2600+ running at 2,075 MHz (166MHz FSB x 12.5)
    ASUS A7N8X
    1GB RAM (higher end stuff with factory 5-2-2-2 timing)
    Enermax 485W PSU
    NVIDIA GeForce 7800 GS (yeah, didn't feel like getting a new processor and mobo, as well as a new card, to play Oblivion)

    I think that's all that is relevant, but feel free to ask for more.

    As for the heatsink being dusty, I have one of the early Koolance cases, so it is water cooled.

    Temps are currently at (viewing them in the bios):

    33*C MoBo
    45*C CPU

    My Koolance case has a probe that goes in a groove on the bottom of the water block (i.e. it is right in between the processor die and the heat transfer plate on the cooler) and it reads 30*C at the same time.

    Voltages are (all set to auto):

    1.66V - VCORE
    3.26V - +3V
    4.94V - +5V
    12.22V - +12V

    Some other BIOS settings:

    FSB Spread Spectrum - 0.5%
    AGP Spread Spectrum - Disabled
    Graphics Aperture Size - 256MB
    AGP Frequency - Auto
    System BIOS Cacheable - Disabled
    Video RAM Cacheable - Disabled
    DDR Reference Voltage - 2.6V
    AGP VDDQ Voltage - 1.5V
    AGP 8x Support - Enabled
    AGP Fast Write Capability - Disabled


    If you need anything else, just ask. I may be able to get a processor from work and see if that helps. If not, do you all think I should just get a new MoBo?
  4. Quote:
    I run sse3dnow to check for CPU errors, memtest86+ for RAM issues, and run cpuburn to heat up chip and check for stability under load.


    I am aware of memtest86+, but where can I get the other two? Did a fellow named Roy Longbottom make sse3dnow? That is the site I found after searching, but I wanted to check. The website for cpuburn that I found also doesn't look that great.
  5. Unless the processor is overheating or has been severely over volted you can put it at the tail end of the list of things to check, IMO.

    I would have guessed PSU but you have a new one already and still you have the problem.

    The switching thing is odd; perhaps the power switch is simply shorting itself? You must at least rule this out. A simple way to do this is to unhook it from the board and manually short the pins on the panel to start her up. Try this.

    If not that, then the motherboard is probably cracked or shorting, or has a dying component somewhere.
  6. Quote:
    I run sse3dnow to check for CPU errors, memtest86+ for RAM issues, and run cpuburn to heat up chip and check for stability under load.


    I am aware of memtest86+, but where can I get the other two? Did a fellow named Roy Longbottom make sse3dnow? That is the site I found after searching, but I wanted to check. The website for cpuburn that I found also doesn't look that great.

    Yes, Longbottom's program. It's great for checking for CPU errors, especially when OC'ing, but even at stock it's useful.

    You can use any program you like to run the CPU at 100%: Prime95, etc. The one I happen to use is cpuburn-in:

    http://users.bigpond.net.au/cpuburn/

    It's old, but it works. For dual-core, you just run two instances of it. All you are doing is trying to put a load on your PSU and see if your cooling is adequate.

    While you run it, run ASUS PC Probe or a similar program to watch your voltages and temps. You could use a voltmeter to check your PSU while under load as well. These methods aren't totally solid, but it's the best you can do if you don't own some very expensive diagnostic equipment.

    EDIT: My money is on the mobo at this point. Is there any airflow across the voltage regulators and chipset? The problem with watercooling is other mainboard components rely on the CPU cooler's fan for some air flow. Have a peek and see if you can see any bad caps as well. Pay special attention if you have gold-striped purple caps near the CPU.
  7. Quote:
    Unless the processor is overheating or has been severely over volted you can put it at the tail end of the list of things to check, IMO.


    I would have thought so too. I am running the sse3dnow tests now and so far no errors.

    PSU was my first guess as well, which is why I replaced it (about two years ago now - like I said, have had this problem for a LONG time).

    Over that time frame, I've given the MoBo several visual inspections and never seen a crack or burst capacitor (see why this is so hard to diagnose, all the obvious tests come up negative).

    The inability to find a solution is also why I think it might be something as stupid as the power button. However, I have been too lazy (too busy at work, really, to have much time to use my computer, so maintenance has not been a high priority) to move it to a new case to test. Should I take a voltmeter to the power button or something, or just unplug the front panel connector and short it manually like you say and see if it is stable? (I am thinking I should disconnect the reset button as well.)
  8. Hey mate

    Right, it sounds to me like you've got a pretty fundamental hardware problem.

    First reset all BIOS settings to factory defaults (but make sure everything is switched on, i.e. USB controllers etc).

    Take down your memory timings to something like 3-3-3-10 or even 4-4-4-12 so we can eliminate your RAM (seeing as how your timings are so tight), though to really do it, put it in another machine and check all is OK. If that machine fails you know it's your RAM.

    Try switching out the power supply first, seeing as this is where your problems start (make sure it's known good).

    If that fails then it's most likely your motherboard or your CPU.

    But anyway let us know how you get on. :D

    Hope that helps, dude.
  9. Quote:
    Unless the processor is overheating or has been severely over volted you can put it at the tail end of the list of things to check, IMO.

    I would have guessed PSU but you have a new one already and still you have the problem.

    The switching thing is odd; perhaps the power switch is simply shorting itself? You must at least rule this out. A simple way to do this is to unhook it from the board and manually short the pins on the panel to start her up. Try this.

    If not that, then the motherboard is probably cracked or shorting, or has a dying component somewhere.



    Actually, notherdude has got it spot on. I should really read an entire forum before posting shizzen.
  10. Am running Longbottom's program now and so far no errors (about half way through). Will try cpuburn next.

    As for cooling, I was aware of those issues when I built the system. I have a big fan in the front, two fans in the back, the two fans in the PSU and the fan on the video card. I think I have enough, but I will run pc-probe when I do the cpuburn test next. I also have the northbridge water cooled, even though it originally came with a tiny heat sink.

    I will have a look for those purple caps next. As I said in another post, I've given the board several visual inspections (including a quick one a couple nights ago) and never noticed anything amiss.
  11. All right, finished with sse3dnow and got no errors. Will do cpuburn next.

    My question is, as these shut downs are not regular at all and testing a solution will therefore take several weeks, what would you like me to try first after the diagnostics are finished?
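On the "several weeks" point, here is a rough back-of-the-envelope sketch of how long a crash-free run has to be before it means anything. It assumes crashes arrive at random at a steady average rate (a Poisson model), which is only an approximation for a machine that alternates between good and bad spells. The crash rates below are hypothetical placeholders, not figures from this thread.

```python
import math


def crash_free_days_needed(crashes_per_day: float, confidence: float) -> float:
    """Days without a crash needed before a fix looks real at the given confidence.

    Assumption: crashes follow a Poisson process, so the chance of seeing
    no crash in t days purely by luck is exp(-crashes_per_day * t).
    """
    if crashes_per_day <= 0:
        raise ValueError("crashes_per_day must be positive")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be between 0 and 1")
    return -math.log(1.0 - confidence) / crashes_per_day


if __name__ == "__main__":
    # Hypothetical rates for illustration only.
    scenarios = [
        ("bad spell, about 1 crash per day", 1.0),
        ("good spell, about 1 crash per week", 1.0 / 7.0),
    ]
    for label, rate in scenarios:
        days = crash_free_days_needed(rate, 0.95)
        print(f"{label}: {days:.1f} crash-free days for 95% confidence")
```

Under that assumption, a machine that averages one crash a week needs about three clean weeks before a fix can be called at 95% confidence, which is why each change has to be tested one at a time and left alone for a while.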
  12. Well I was hoping putting your machine under a load would force the problem to occur.

    If it doesn't, and all your tests pass, then I would say it's the mobo or a switch.

    Find an old ATX case from somewhere. Rip out the power switch and reset, then hook them up to your board and run them out of your case.

    You could just jump the pins with a screwdriver, but if it takes such a long time for the problem to crop up this could end up being a huge pain in the butt.

    You won't be able to tell if a switch is bad from a voltmeter if the problem is intermittent. The switch could be working at the moment you test it :)
  13. I've had this problem on one of our PCs; it drove us nuts for ages. It would sometimes run stable and sometimes not, and it had the BIOS resetting itself, etc.
    Specs are an AMD 2200 or thereabouts and a PC Chips motherboard (cheap and cheerful). We had used memtest etc. and that showed no errors, so we assumed all was fine there. We tested it with different graphics cards and all possible BIOS settings, but it was still temperamental.
    Anyway, we were building a new PC and so were swapping old hardware about between machines. It turns out that board only likes a couple of sticks; with all the other identical ones it carries on being temperamental as before. All the RAM sticks we tried in it were the same speed, make and size, and none showed errors in memtest on any of the PCs. None of the other PCs had problems with the RAM sticks. So my advice is to try other memory sticks if you can, or, if you have two in it, try just one at a time and see if it doesn't like one of the sticks.
  14. Ok, so here are the results of the cpuburn test:

    Idle (win xp desktop with typical background items running) temps according to pc probe:

    CPU - 31*C (my Koolance probe says 28)
    MoBo - 37*C

    Voltages:

    12.224-12.288
    4.919-4.945
    3.264-3.28
    1.664-1.68

    1st test: Lasted a few minutes before the system crashed. In that time, I didn't notice any voltage shifts and temperatures increased a degree centigrade max.

    2nd test: I kept a closer eye on the voltages this time and noticed that some of them dropped slightly below the idle range almost instantly. All returned to normal rather quickly, except +5V which was a little lower than normal, with a range of 4.892-4.919 for the entire test. The +3.3V infrequently dropped to 3.248 (including at the beginning). This time the test lasted about 20 minutes before I posted this and is still going. Temps again barely increased (peaked at 32 cpu and 39 mobo).

    I have zero experience with overclocking: Is it indicative of a major problem when voltages drop slightly like that? Should I try the test again with FSB spread spectrum disabled?

    BTW, CPU Burn-in reported no errors the entire time. Also, windows task manager indicated that my CPU usage was at 100% the entire time as well.
  15. Update: A couple of the voltages briefly slid a little more (I am still running the second test without crash). The lows:

    12.224
    4.865
    3.248
    1.648
  16. Quote:
    You won't be able to tell if a switch is bad from a voltmeter if the problem is intermittent. The switch could be working at the moment you test it :)



    Good point :oops:

    I'll try the power switch thing.
  17. Quote:
    So my advice is to try other memory sticks if you can, or, if you have two in it, try just one at a time and see if it doesn't like one of the sticks.


    Thanks, I will try this too. Although, these shifty voltages have got me worried, so I'm curious what people have to say about that.
  18. Your voltages appear to be within normal variance. It is rare for them to be exact. I'm no expert on PSUs but I see variances like that all the time.
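To put a number on "within normal variance", here is a small sketch that compares the lowest readings reported above against the nominal rail voltages. The ±5% window is the tolerance the ATX specification gives for the positive rails; VCORE is left out because it is set by the motherboard rather than by that tolerance. The names `NOMINAL`, `TOLERANCE` and `LOWS` are just labels chosen for this example, and software sensor readings like PC Probe's are only approximate anyway.

```python
# Nominal rail voltages and the +/-5% window the ATX spec allows on positive rails.
NOMINAL = {"+12V": 12.0, "+5V": 5.0, "+3.3V": 3.3}
TOLERANCE = 0.05

# Lowest readings reported in the thread during the second CPU Burn-in run.
LOWS = {"+12V": 12.224, "+5V": 4.865, "+3.3V": 3.248}


def deviation_pct(rail: str, reading: float) -> float:
    """Signed deviation from nominal, in percent."""
    nominal = NOMINAL[rail]
    return (reading - nominal) / nominal * 100.0


def within_tolerance(rail: str, reading: float) -> bool:
    """True if the reading sits inside the +/-5% window for that rail."""
    nominal = NOMINAL[rail]
    return abs(reading - nominal) <= nominal * TOLERANCE


if __name__ == "__main__":
    for rail, reading in LOWS.items():
        status = "OK" if within_tolerance(rail, reading) else "OUT OF SPEC"
        print(f"{rail}: {reading:.3f} V ({deviation_pct(rail, reading):+.2f}%) {status}")
```

All three rails stay inside the window, with +5V the furthest off at roughly 2.7% low, so the sag on its own does not point at the PSU.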
  19. Alright cool. The 2nd test ended without crashing (I set it to last 1 hour). My MoBo temperature very rarely touched 40*C, but mostly sat at 39*C. My CPU sat at 32*C for most of the test.

    So, conclusions:

    My cooling seems adequate as the temps barely rose at all.
    It doesn't seem like the processor is the problem as none of the tests reported errors (although it did crash once, but again, doesn't seem like the processor caused it).

    The 1st and 2nd test illustrate just how random this is. Initially crashes very quickly, then keeps on chugging like a champion.

    I'm going to leave the PC Probe recorder on in the mean time and play some Oblivion (which I usually play at almost maxed out settings) to see if that stresses it a lot (more RAM, Northbridge and GPU usage than CPU Burn-in).

    So, here are the things I think I need to do next based on everyone's feedback (thanks, by the way!):

    Try a different power switch for a couple weeks and see if that works.
    Try using the pair of RAM sticks individually and see if that is the problem.

    If those don't fix it, is it safe to assume that I need a new MoBo?

    Anyway, I'm going to run memtest86+ again just to be sure, then try the power switch thing first. If you think I should proceed differently, please advise.
  20. It is really hard to diagnose an intermittent problem like that. Before you throw it out the window, try unplugging and replugging any connections to the MB and PSU, reseating the RAM and CPU, and disconnecting and hotwiring the power switch.
  21. Ok, it finally crashed again (was just idle at desktop) and the last recorded temps were:

    CPU - 30*C
    MoBo - 36*C

    Voltages:

    12.288
    4.919
    3.264
    1.664

    So everything seemed normal...
  22. Changing the CMOS battery on the board could be a cheap fix.
  23. Well, that's not right at all. You should be able to run cpuburn-in all night long.

    The easiest swap would be the RAM. Does that 1 GB of RAM consist of more than one stick? If so, try one stick at a time and see if it crashes on a particular stick.

    If you just have a 1GB stick in there, swap it out. Even if it's not 2700 and you have a 2100 stick lying around, try it and see if it crashes.

    What BIOS revision are you running?

    EDIT: Thanks Belinda. I have seen a few machines where memtest would pass for hours but there was still a RAM issue. Some boards just don't like certain brands, single-sided RAM, mixing different brands, etc.
  24. It is two sticks.

    Bios Rev 1001.E

    EDIT: Actually, PC Probe says Bios Rev 1019 Beta 003 T2, release date 11/27/02 (I seem to remember flashing it not too long ago, so that date has to be wrong at least). It says 1001.E at boot.
  25. Quote:
    It is two sticks.

    Bios Rev 1001.E


    Well if your PCB is ver 1.03, 1.04 or 1.06 it is 8 versions out of date.

    BIOS flashes 1002, 1003 and 1007 specifically mention fixing stability using certain types of RAM.

    1009 is the latest. Just make damn sure your PCB has ver. 1.03/1.04/1.06 silkscreened on it if you intend to run this flash.

    A7N8X and previous listed PCBs ONLY!!! Don't use for Deluxe, -X, -VM, -E Deluxe, etc.

    http://dlsvr01.asus.com/pub/ASUS/mb/socka/nforce2/a7n8x/AN8B1009.zip

    You will need this utility

    http://dlsvr01.asus.com/pub/ASUS/mb/socka/nforce2/a7n8x-deluxe/awdflash.zip

    Here's an image to make a boot disk for flashing

    http://www.ts.nu/Files/drdflash.exe

    Extract this image to a floppy, then extract the awdflash utility and the BIOS image to the disk.

    Set your bios to boot to floppy first and you're ready to rock. (Don't reboot, shutdown, unplug while you flash or you'll have a dead mobo...)

    EDIT:

    After flashing, go into the BIOS and load the setup defaults. Save and reboot. Go into the BIOS again, and then make any changes you need to.
  26. Ok, my board is PCB revision 1.04 (I knew that window was stylish AND functional). Here is what I have installed:

    Version 1001E 2003/01/13 update

    Description A7N8X BIOS 1001E for PCB revision 1.03, 1.04, and 1.06 only.
    Improve memory stability

    Here are my options for more recent BIOSes:

    Beta Version 1009 2005/09/27 update

    Description A7N8X BIOS 1009 for PCB revision 1.03, 1.04, and 1.06 only.
    Support AMD Sempron CPU
    Patch 3D Labs AGP card compatibility issue

    or:

    Version 1007 2003/10/09 update

    Description A7N8X BIOS 1007 for PCB revision 1.03, 1.04, and 1.06 only.
    Improve system stability with Hynix and PSC memory modules.

    I am thinking I go with 1007 since I don't have a Sempron or 3D Labs card and that way I avoid the Beta. Thoughts?

    EDIT: Thanks VIC20, looks like you did the research as well. Should I go with the most recent, Beta Bios or the one before it?
  27. Try 1007 first. That could solve your issue right there. If it doesn't, then try the beta.
  28. Quote:
    Changing the CMOS battery on the board could be a cheap fix.


    I was thinking exactly that myself. If your BIOS settings get reset after a crash, then maybe the CMOS battery is borderline.

    My 2cp
  29. Could be.
  30. In my opinion, the only way to fix an intermittent problem like this is to systematically swap out/interchange parts of the system to establish which parts of your system are *good*. Instead of poking around looking for a needle in a haystack, you have to first establish roughly which part of the haystack the problem resides in. You are going to need a friend's computer to do this.

    One thing no one has mentioned yet that you should test as well is the power in your house. Voltage spikes caused by faulty appliances/wiring in your house could affect your system. Bring it over to a friend's house and try it there.
  31. Okay guys! Is it normal for the CPU to be running at 100%? It doesn't seem normal to me. What would produce this situation? Or was it just because of the test being run?
  32. Quote:
    Try 1007 first. That could solve your issue right there. If it doesn't, then try the beta.


    Alright, I am going to try this course of action (stopping when the problem does) and will report back on my results:

    Flash to 1007 (and install the latest drivers for good measure).
    Flash to 1009.
    Try a new power switch.
    Try one stick of ram at a time.

    Here's hoping it doesn't crash during a flash!

    If none of that works, I'll swap out the CMOS battery (this isn't high on my list as the clock multiplier getting changed - and it is only that one setting which changes - happens very rarely and only after many, many consecutive crashes).


    Quote:
    One thing no one has mentioned yet that you should test as well is the power in your house. Voltage spikes caused by faulty appliances/wiring in your house could affect your system. Bring it over to a friend's house and try it there.


    I have a surge protector which should cover me on the high side at least. I don't think this is it though, as I have moved a total of 6 times since this started (and it happened in all 6 places I have lived).

    As for isolating what the problem part is, I think I've pretty much got it down to the MoBo or RAM (or a combination of the two) or potentially the power switch for now. After seeing all those BIOS revisions talking specifically about stability issues with RAM, I am starting to lean in that direction.


    Quote:
    Okay guys! Is it normal for the CPU to be running at 100%? It doesn't seem normal to me. What would produce this situation? Or was it just because of the test being run?


    Yeah, it was the test. CPU Burn-in puts max load on your processor. I think that test helped me establish that the processor and cooling set up are fine.
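As a way of keeping track of where things stand at this point, here is a minimal sketch of the elimination list as it reads from the thread so far. The `Suspect` structure and the `cleared` flags are one reading of the posts above, not a verdict; anything marked cleared is only provisionally so, since the fault is intermittent.

```python
from dataclasses import dataclass


@dataclass
class Suspect:
    name: str
    cleared: bool   # provisional: based only on what has been reported so far
    evidence: str


# One reading of the thread up to this post (an interpretation, not a diagnosis).
SUSPECTS = [
    Suspect("PSU", True, "replaced about two years ago; crashes continued"),
    Suspect("CPU", True, "sse3dnow and CPU Burn-in reported no errors"),
    Suspect("cooling", True, "temps rose only a degree or two under full load"),
    Suspect("house power", True, "problem followed the PC through six moves"),
    Suspect("motherboard", False, "no visible damage, but nothing rules it out"),
    Suspect("RAM", False, "memtest passes, but some boards reject sticks that test clean"),
    Suspect("power/reset switch", False, "not yet run with the front panel unplugged"),
    Suspect("surge protector", False, "never tested on its own"),
    Suspect("CMOS battery", False, "suggested as a cheap fix; not yet swapped"),
]


def remaining(suspects: list[Suspect]) -> list[str]:
    """Names of the parts that have not been provisionally cleared."""
    return [s.name for s in suspects if not s.cleared]


if __name__ == "__main__":
    for s in SUSPECTS:
        mark = "cleared" if s.cleared else "OPEN"
        print(f"{s.name:20s} {mark:8s} {s.evidence}")
    print("Still to test:", ", ".join(remaining(SUSPECTS)))
```

Writing it down this way makes it easier to change one thing at a time and to see which open item each test is actually meant to close.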
  33. Ok, so I flashed to 1007 and now my FSB won't run at anything but 100 MHz (no matter what I set it to in the BIOS). I ran memtest86+ and I got no errors. I think this is indicative of the MoBo and RAM not getting along. Thoughts?

    BTW, I feel like a douche for not flashing the BIOS before coming here, so I apologize for that. I could have sworn that I had already done this, and maybe I did and ran into the same problem and flashed back, but I don't remember...
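For a sense of what a stuck FSB costs, the core clock is just the FSB frequency times the multiplier. The sketch below uses the 166 MHz x 12.5 figures from the specs post; it assumes the multiplier stayed at 12.5 after the flash, which the thread does not confirm.

```python
def core_clock_mhz(fsb_mhz: float, multiplier: float) -> float:
    """Core clock in MHz = front-side bus frequency x CPU multiplier."""
    return fsb_mhz * multiplier


MULTIPLIER = 12.5  # from the specs post; assumed unchanged after the BIOS flash

if __name__ == "__main__":
    print(core_clock_mhz(166, MULTIPLIER))  # 2075.0, the stock figure quoted in the specs
    print(core_clock_mhz(100, MULTIPLIER))  # 1250.0, if the FSB really is stuck at 100 MHz
```

So a board that falls back to 100 MHz would be running the chip at roughly 60% of its rated speed, which fits the pattern of a BIOS dropping to safe defaults after failed boots.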
  34. It crashed while I was editing a BMP in MS Paint of all things, so I went ahead and flashed to 1009 Beta, which seems to have fixed the FSB issue. Fingers crossed it doesn't crash.
  35. Well, the BIOS updates don't seem to have fixed it, as it is still crashing.

    I'll try using single sticks of RAM now.
  36. Quote:

    Flash to 1007 (and install the latest drivers for good measure).
    Flash to 1009.
    Try a new power switch.
    Try one stick of ram at a time.

    Here's hoping it doesn't crash during a flash!

    If none of that works, I'll swap out the CMOS battery (this isn't high on my list as the clock multiplier getting changed - and it is only that one setting which changes - happens very rarely and only after many, many consecutive crashes).

    One thing no one has mentioned yet that you should test as well is the power in your house. Voltage spikes caused by faulty appliances/wiring in your house could affect your system. Bring it over to a friend's house and try it there.


    I have a surge protector which should cover me on the high side at least. I don't think this is it though, as I have moved a total of 6 times since this started (and it happened in all 6 places I have lived).

    As for isolating what the problem part is, I think I've pretty much got it down to the MoBo or RAM (or a combination of the two) or potentially the power switch for now. After seeing all those BIOS revisions talking specifically about stability issues with RAM, I am starting to lean in that direction.


    surge protectors only keep your system from being fried by a dangerous event. they do not filter the power at all. they usually have little more than a $0.05 varistor or a circuit breaker which smokes open during a huge surge. they must be replaced/reset after each event. however, since you've moved to different houses a lot, this is a good sign -- you are right. wall power problems seem unlikely. the last thing to do here is to make sure it's not the surge protector itself that is the problem. to test this, try a different surge protector, or plug your computer directly into the wall.

    you mentioned in your original post that the problem has been going on "for a few years", yet you do not mention whether or not the computer ever worked properly. and if it did, whether you changed RAM/BIOS since then. if it did work properly initially, and RAM/BIOS did not change before the problem arrived, then i think it is very unlikely that updating the BIOS will fix it. to establish that it does indeed have something to do with RAM i think you need to put other sticks in there, or put your sticks into someone else's mobo.

    the power switch idea is a good one, but as per the initial suggestion, you do not need a new power switch to test whether or not your old one is shorting. you just have to unplug your current switch from the motherboard. front-panel power switches are momentary-on switches so you can leave your case open and short the header briefly (the two motherboard pins that your switch plugs into) manually with a metal object to start the system (be real careful, obviously...). some power supplies also have on/off switches, but it is easier to swap out the whole power supply to test that.

    if that doesn't work, then you have to start swapping stuff. if there is a cracked trace or a short, it could be anywhere in your system (Ram card, PCI card, moboard, drives, case-touching-moboard, etc.) and this is the only way to find it. if some component has degraded and no longer meets spec, this is also the best way to isolate it. in extremely rare cases it could also be a malfunctioning peripheral (monitor, printer, keyboard, mouse) that momentarily causes a power problem. to test this, swap these out, or run your computer without them.
  37. aside from the suggested fixes, i'd just like to note that your AGP aperture size is 256mb. I read somewhere that the optimum is 64 or 128mb... and performance decreases if increased further. It's something about the video card being allowed to access RAM.

    I know when I increased my aperture size before, it made my computer slower and act weird sometimes. But maybe if you already reset and flashed your BIOS, this might not be the problem.
  38. Quote:
    the last thing to do here is to make sure it's not the surge protector itself that is the problem. have you always used the same one?


    Yes, I have used the same one and it is pretty old at this point. If you think it is worth it, I will plug the computer directly into the wall or get a new one. I don't think that is the problem due to all the moves though and it sounds like you agree.


    Quote:
    you mentioned in your original post that the problem has been going on "for a few years", yet you do not mention whether or not the computer ever worked properly. and if it did, whether you changed RAM/BIOS since then.


    Yes, it did work properly initially. Unfortunately, I can't remember exactly when it started any more. I have not changed RAM since I built the system.


    Quote:
    if there is a cracked trace or a short, it could be anywhere in your system (Ram card, PCI card, moboard, drives, case-touching-moboard, etc.) and this is the only way to find it. if some component has degraded and no longer meets spec, this is also the best way to isolate it.


    I am under the impression that if there is a short or cracked trace, it shouldn't boot at all, right? Or, if it might, that it shouldn't run for very long. The reason I have ruled this kind of stuff out is, at times, it will run for weeks without crashing, and for months crashing very infrequently. Other times, it will continuously crash after a very short period, and the only way to fix it is to unplug it and try again the next day.

    For instance, when I flashed to 1009, it started crashing consistently shortly after boot. I turned it off and went and ran some errands. When I came back, I turned on a rugby match which I have been watching now for 2 hours without a crash. It will change character that quickly. Is that consistent with a short?

    Sometimes I wonder if all the moving hasn't caused it, which is why I think it might be the power switch, but I obviously have no idea. I'll try those things and get back to you all.
  39. Quote:
    i'd just like to note that your AGP aperture size is 256mb. I read somewhere that the optimum is 64 or 128mb


    I thought you were supposed to make it the same size as the RAM on the card. Should I change this?
  40. testing the surge protector is easy to do, so it's worth it. just plug your computer into the wall. also test your peripherals by removing them/swapping them out.

    a crack in a trace could, but would not necessarily prevent the system from booting, which is why it's the hardest kind of intermittent problem to track down. a crack could cause exactly the behaviour you describe. metal traces on boards (moboard, graphics card, RAM card, hard drive electronics, basically every part in your system) are stuck to the motherboard plastic/bakelite-type material. if the board is bent or there is a large temperature gradient (large change in temperature over a short distance) the metal can crack resulting in an intermittent "open" -- severing the electrical connection the trace was meant to make (opposite of a "short"), just like a switch. as the temperature of the board changes during normal operation of your computer, the different materials in the board expand/contract at different rates and can cause the cracked trace to open/short back together intermittently. these cracks are really small and usually impossible to see even when on the surface. motherboards have several layers of traces (piled vertically inside the board) so unless it's a huge crack from some physical trauma, there is zero chance of seeing those. another type of crack is caused by a cold solder joint which is basically a manufacturing defect where they do not heat the solder enough.

    the components that plug into boards (chips, power transistors, capacitors) can also age/fail with no externally visible signs. so the best way to locate these things is by swapping stuff. i am not saying you have a crack, or a bad component. i am just saying they are consistent with your observed problem.

    it just occurred to me that a safer way of testing your front-panel power switch is to open your case, turn on your computer like normal using the switch, and then reach in and unplug the switch from the motherboard. if your system is stable this way, replace the switch.

  41. It just occurred to me that it probably isn't the power switch, but it may be the reset switch. The reason is I have the power set to soft off after 4 secs of depression. Maybe the power switch is shorting for that long, but, when I am in Windows, it will then start a normal shut down instead of just turning off.

    Anyway, I'm going to play around with the RAM some more since I think the MoBo - RAM relationship seems to be at the top of the list. Just to make sure, I'm going to unplug the reset switch and plug the computer directly into the wall.