BSODs while gaming or cpu stress testing

Zorus195

Reputable
Mar 14, 2015
3
0
4,510
Hello,

I recently installed a new HDD, CPU, heat sink and graphics card in my computer, and since then I have been getting BSODs while playing games or running CPU stress tests. The error codes are usually IRQL_NOT_LESS_OR_EQUAL or PAGE_FAULT_IN_NONPAGED_AREA (here is a link to my most recent minidump files: Minidumps ). The interval varies: a crash can come every 5 minutes or only after one or two days.
I noticed that if I unplug the PSU for about 5 minutes and plug it back in, the problem seems to disappear for several hours and then comes back (which seems very strange to me).
The problem doesn't seem to be heat related, because my CPU temperature stays below 70C while stress testing, which I think is normal for my processor.
My guess is that the problem is related to the PSU or to a faulty driver.

SPECS:
- Alienware Aurora R1
- MOBO: Intel X58 MicroATX LGA1366
- OS: Windows 7 64bit
- RAM: 12GB (3x4GB) DDR3 1333MHz
- GPU: 1.5GB ASUS DCUII GTX 580
- CPU: i7 920 OC @2.66 GHz 1.1v (40C idle, 72C Max)
- HEATSINK: cooler master hyper vortex plus
- PSU: Alienware stock PSU (875W)
- HDD: Seagate Barracuda 1TB 7200RPM (ST1000DM003)

Here's a list of the steps I have gone through since the problem began:
- Clean re-install of Windows 7
- Clean installation of the most recent Nvidia drivers (347.52)
- 7 passes of memtest on each of my memory sticks (no errors)
- Prime95 and OCCT stress tests (I quickly get BSODs, BUT if I unplug and re-plug the PSU, the stress tests can run for more than 8 hours without any BSOD)
- I ran SeaTools for Windows to look for HDD errors (no errors)
- I have updated to the newest drivers available on the Dell website.
- Unfortunately I don't have a spare working CPU or PSU to swap with the current ones to see if it is a hardware-related issue :(.

I am running out of ideas on how to find a solution to my problem, so any advice would be really appreciated.

Thank you,
 

Zorus195

Yes, the probability is high that the PSU is causing the issue, but I need to be sure it is the culprit before buying a new one.

- Here's my mobo model: Dell Alienware Aurora ALX LGA1366 System Motherboard MS-7591 (this motherboard has 2 - 3x RAM slots so it can take 3 RAM sticks)

Thank you for your answers !
 
I did not see any drivers in memory that were corrupted; it was just the data being read that was wrong. That could be caused by low power supply voltage, so I guess putting in a new power supply would be a good test.
You also had a few old drivers you might want to try updating, but I don't think they are the cause of the problem.

Most of the bugchecks involved video; the others involved bad instruction pointers.
You might disable the video card or replace it with a known good one (assuming that updated graphics drivers don't help).

--------------
Why unplugging the power supply and reconnecting it would work: thermal stress on solder joints can cause a joint to expand and contract over and over until the joint pops from its circuit trace. When power is applied, the circuit heats up in 20 or 30 seconds, the joint makes a proper connection again, and the machine works.
These faults are very difficult to find. The last one I found was under a heat sink on a memory module: the machine would bugcheck every morning, then run fine until the next day. When the machine was cold, the leg of the memory chip would contract away from the solder pad; after about 20 seconds of power it would warm up, expand and make a good connection. The problem was that the leg controlled an address line for the memory, and when it connected it would move the memory block by the logical value of the address bit that the leg controlled. This resulted in various bugchecks in Windows (data corruption).
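
To make the address-line effect concrete, here is a minimal toy model in Python. It is only a sketch: the 16-cell memory, the choice of address bit 3, and the assumption that an open joint makes the line read as 0 are all hypothetical, and real DRAM addressing is more involved.

```python
def physical_address(addr, bit, connected):
    """Toy model: the address the memory chip actually sees.

    Assumption: while the joint is open (connected=False) the
    address line reads as 0, so that bit of the address is dropped.
    """
    if connected:
        return addr
    return addr & ~(1 << bit)


def write(mem, addr, value, bit, connected):
    mem[physical_address(addr, bit, connected)] = value


def read(mem, addr, bit, connected):
    return mem[physical_address(addr, bit, connected)]


memory = [0] * 16
BAD_BIT = 3  # hypothetical: the address line with the bad joint

# Cold: joint open, so a write to address 10 really lands at address 2.
write(memory, 10, 0xAB, BAD_BIT, connected=False)

# Warm: joint closed, so address 10 now refers to a different cell.
warm_value = read(memory, 10, BAD_BIT, connected=True)
shifted_by = 10 - physical_address(10, BAD_BIT, connected=False)
print(warm_value, shifted_by)
```

In this model the data written while cold is no longer where the software expects it once the joint connects: the cell moved by the value of the affected address bit (8 for bit 3). From the operating system's point of view that looks like random data corruption.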

Try this: when your machine is cold, boot into the BIOS and leave it there for 20 or 30 seconds, then reboot into Windows and see if you can still get a bugcheck.
(At least you will not have to unplug your power cable.)

Your system is pretty old, so you could have a bad solder joint anywhere in it. Also, your system is likely using the first generation of lead-free solder, which is very brittle, so the connections tend to break. It is one of the reasons many motherboard vendors reduced the warranty period: the failure rate in the long run was too high.

I will look at your bugcheck memory dumps to see if I can find any software problem.

old drivers:
Broadcom driver is old:
\SystemRoot\system32\DRIVERS\k57nd60a.sys Fri Oct 16 03:29:38 2009
http://www.broadcom.com/support/ethernet_nic/downloaddrivers.php

Silicon Image drivers are pretty old (2007):
http://www.siliconimage.com/support/
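
If you want to check the driver dates yourself, a small Python sketch like the one below can flag old entries. It assumes you have copied the driver names and timestamps out of the debugger's module listing by hand; the second entry and the 2012 cutoff are made up for illustration, and the parsing assumes the default English locale.

```python
from datetime import datetime

# Timestamp format as it appears in the listing above,
# e.g. "Fri Oct 16 03:29:38 2009".
STAMP_FORMAT = "%a %b %d %H:%M:%S %Y"


def old_drivers(drivers, cutoff_year):
    """Return (name, build year) for drivers built before cutoff_year."""
    flagged = []
    for name, stamp in drivers:
        built = datetime.strptime(stamp, STAMP_FORMAT)
        if built.year < cutoff_year:
            flagged.append((name, built.year))
    return flagged


drivers = [
    ("k57nd60a.sys", "Fri Oct 16 03:29:38 2009"),  # from the dump above
    ("example.sys", "Tue Feb 10 12:00:00 2015"),   # hypothetical entry
]
print(old_drivers(drivers, cutoff_year=2012))
```

An old build date does not by itself mean a driver is faulty; it only tells you where an update is worth trying.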

3 bugchecks were in graphics code.
2 bugchecks had addresses that looked valid but were not.
1 bugcheck involved a bad instruction pointer, but your drivers don't seem to be corrupted in memory.

 

Zorus195

Hello !
I apologize for my late answer. I have been running multiple tests to track down my problem and I finally found the culprit. I waited a few days before replying to confirm that no BSOD would come back.
So the problem was a memory (RAM) issue. One of the DIMM slots on my mobo was faulty on cold boot. As a solution I just had to remove the RAM stick from that DIMM slot, and my computer is working fine!
Thank you for your help guys!
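
For anyone landing here with a similar problem, one way to tell a bad stick from a bad slot is to test every stick in every slot and see which one the failures follow. The Python sketch below shows the idea; the stick names, slot numbers and pass/fail results are hypothetical, not the actual test log from this thread.

```python
def isolate_fault(results):
    """results maps (stick, slot) -> True if the memory test passed.

    Returns (bad_sticks, bad_slots): sticks that fail in every slot
    they were tried in, and slots in which every tried stick fails.
    """
    sticks = {stick for stick, _ in results}
    slots = {slot for _, slot in results}
    bad_sticks = sorted(
        s for s in sticks
        if not any(ok for (st, _), ok in results.items() if st == s)
    )
    bad_slots = sorted(
        d for d in slots
        if not any(ok for (_, sl), ok in results.items() if sl == d)
    )
    return bad_sticks, bad_slots


# Hypothetical results: every stick passes in slots 1 and 2, fails in slot 3.
results = {
    (stick, slot): slot != 3
    for stick in ("A", "B", "C")
    for slot in (1, 2, 3)
}
print(isolate_fault(results))
```

If the failures follow the slot and not the stick, as in this made-up data, the sticks themselves can pass memtest individually while the system still crashes. For a fault that only shows on a cold boot, the tests would need to be started cold.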