My system specs: Note the MB, CPU and Video card were purchased back in late March. All other components were used in my previous computer w/o a single problem.
Windows 7 Enterprise 64bit
AMD Athlon II x4 630
Gigabyte GA-MA785GM-US2H, rev 1.1, F7
Crucial Ballistix 4GB Kit (2x2) DDR2 800 BL2KIT25664AA80A, 4-4-4-12
Gigabyte GV-R467ZL-1GI Radeon HD 4670 1GB 128-Bit DDR3
Seagate 320GB 7200RPM 16MB cache
WD Caviar 250GB 7200RPM 8MB cache
LG DVD Burner
FSP 450 Watt PSU
Coolermaster Midtower Case
Simple history of the problem: This has happened several times over the past 3 months. I build the computer, install Windows 7, install the latest Gigabyte chipset driver (which includes Catalyst Control Center), and install the latest ATI driver (which also includes Catalyst Control Center). Everything runs great for about 2 or 3 weeks. Then, mostly when I am away from my computer, it reboots, and when I log back in it says it has recovered from a serious error. If I am sitting at my computer when it happens, I see a BSOD with a memory dump and then the computer restarts. The BSOD/reboot happens so fast that I can't make anything else out. These BSODs will happen for a day or two, then the system will be fine for a week, and then they start again.
My most recent problem: On Tuesday, sometime during the day, my computer had a BSOD, because when I came home I had to log back in. When I logged in, it said Windows had recovered from a serious error. After a moment another message popped up saying "Windows Explorer" had stopped working, then a message saying "Windows Explorer" was restarting. The system was caught in this "Windows Explorer" start/stop loop for a minute, then it BSOD'ed. When the computer restarted, I entered safe mode just to get into the system. I then rebooted the machine and have been using the computer just fine since.
Last night I ran memtest86+ v4.0. It reported 50 errors within the first few moments. I rebooted the machine, changed my memory timings from 4-4-4-12 (my memory's spec) to 5-5-5-18 (not spec), and re-ran memtest. After about 5 minutes memtest reported 9 errors.
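For reference, here is a rough sketch of what that timing change means in absolute terms. It assumes the DDR2-800 rating from the spec list (800 MT/s, so a 400 MHz memory clock); the function name is only for illustration.

```python
def cycles_to_ns(cycles, data_rate_mt_s=800):
    """Convert a timing given in memory clock cycles (T) to nanoseconds.

    DDR memory transfers twice per clock, so the clock in MHz is half
    the data rate (assumed here to be DDR2-800, i.e. 400 MHz).
    """
    clock_mhz = data_rate_mt_s / 2
    return cycles * 1000.0 / clock_mhz


spec = (4, 4, 4, 12)      # rated timings from the post
relaxed = (5, 5, 5, 18)   # loosened timings tried in the BIOS

for name, s, r in zip(("CAS", "tRCD", "tRP", "tRAS"), spec, relaxed):
    print(f"{name}: {s}T = {cycles_to_ns(s):.1f} ns -> {r}T = {cycles_to_ns(r):.1f} ns")
```

At 400 MHz each cycle is 2.5 ns, so CAS 4 is 10 ns and CAS 5 is 12.5 ns. The relaxed settings give the modules more time per operation, which is why they are a common first test for marginal RAM.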
I called Crucial's tech support and the tech asked whether I ran memtest on a hot system or a cold system. It was run on a hot system, meaning my computer had been on for a long while, a couple of days at least. He told me to shut my computer off, let it cool for 30 minutes, and then run memtest. He said that if memtest reports no errors on a cold computer, then it is a heat issue and I will need RAM fans. If it does report errors, then it is most likely a bad stick of RAM. I have since run memtest overnight for almost 8 hours and no errors have been reported.
System Configuration: I have followed bilbat's Gigabyte guide found in this forum. My RAM voltage is running at 2.0V, I am not running AHCI, and I am not gaming with this current setup yet (I am not taxing the system when I am using the computer). I am running MS Security Essentials and have scanned with it several times, and no problem has ever been found.
Advanced BIOS Features
Internal Graphics Mode [Disabled]
x UMA Frame Buffer Size [Auto]
x Surround View Disabled
x Onboard VGA output connect [Auto]
AMD C1E Support [Disabled]
AMD K8 Cool&Quiet control [Disabled]
} Hard Disk Boot Priority [Press Enter]
First Boot Device [USB-HDD]
Second Boot Device [CDROM]
Third Boot Device [Hard disk]
Password Check [Setup]
HDD S.M.A.R.T. Capability [Disabled]
Away Mode [Disabled]
Backup BIOS Image to HDD [Disabled]
Init Display First [PCI Slot]
OnChip IDE Channel [Disabled]
OnChip SATA Controller [Enabled]
OnChip SATA Type [Native IDE]
x OnChip SATA Port4/5 Type IDE
Onboard Audio Function [Enabled]
Onboard 1394 Function [Enabled]
Onboard LAN Function [Enabled]
} SMART LAN [Press Enter]
Onboard LAN Boot ROM [Disabled]
OnChip USB Controller [Enabled]
USB EHCI Controller [Enabled]
USB Keyboard Support [Enabled]
USB Mouse Support [Disabled]
Legacy USB storage detect [Disabled]
Onboard Serial Port 1 [3F8/IRQ4]
Onboard Parallel Port [378/IRQ7]
Parallel Port Mode [SPP]
x ECP Mode Use DMA 3
PC Health Status
Hardware Thermal Control [Enabled]
Reset Case Open Status [Disabled]
Case Opened No
DDR2 1.8V 2.048V
Current System Temperature 36°C
Current CPU Temperature 41°C
Current CPU FAN Speed 1962 RPM
Current SYSTEM FAN Speed 0 RPM
Current NB FAN Speed 0 RPM
CPU Warning Temperature [Disabled]
CPU FAN Fail Warning [Disabled]
SYSTEM FAN Fail Warning [Disabled]
NB FAN Fail Warning [Disabled]
CPU Smart FAN Control [Enabled]
CPU Smart FAN Mode [Auto]
System Smart FAN Control [Enabled]
MB Intelligent Tweaker(M.I.T.)
} Advanced Clock Calibration (Note) [Disabled]
x Value (All Cores) -2%
x Value (Core 0) -2%
x Value (Core 1) -2%
x Value (Core 2) -2%
x Value (Core 3) -2%
CPU Core Control [Auto]
x CPU Core 2 [Enabled]
x CPU Core 3 [Enabled]
HT Link Frequency [Auto]
CPU Clock Ratio [Auto]
CPU NorthBridge Freq. (Note) [Auto]
CPU Host Clock Control [Auto]
x CPU Frequency(MHz) 200
PCIE Clock(MHz) [Auto]
VGA Core Clock control [Disabled]
x VGA Core Clock(MHz) 500
Set Memory Clock [Auto]
x Memory Clock x4.00 800MHz
DCTs Mode (Note) [Unganged]
} DRAM Configuration [Press Enter]
******** System Voltage Optimized ********
System Voltage Control [Manual]
DDR2 Voltage Control [Manual] +0.200V 2.000V
NorthBridge Volt Control [Normal]
SouthBridge Volt Control [Normal]
CPU NB VID Control (Note) [Normal]
CPU Voltage Control [Normal]
Normal CPU Vcore 1.3250V
DDRII Timing Items            [Manual]   SPD       Auto
x CAS# latency                4T         5T        5T
x RAS to CAS R/W Delay        4T         5T        5T
x Row Precharge Time          4T         5T        5T
x Minimum RAS Active Time     [12T]      18T       18T
x 1T/2T Command Timing        2T         --        --
x TwTr Command Delay          3T         3T        3T
x Trfc0 for DIMM1             127.5ns    127.5ns   127.5ns
x Trfc2 for DIMM2             127.5ns    127.5ns   127.5ns
x Trfc1 for DIMM3             Auto       --        --
x Trfc3 for DIMM4             Auto       --        --
x Write Recovery Time         6T         6T        6T
x Precharge Time              3T         3T        3T
x Row Cycle Time              24T        23T       23T
x RAS to RAS Delay            3T         3T        3T
That's interesting. I guess it does sound like a RAM issue, possibly from heat. Does your case have good cooling (i.e., at least 1 intake and 1 or 2 exhaust fans)?
If possible, I think the easiest thing to do would be to replace the RAM. If you can, try another pair of RAM sticks just to check whether you get Memtest errors or not. Otherwise, I guess RAM cooling might be the next best solution if you don't want to have an unusable system while you RMA the RAM. Another option would be to try a lower RAM voltage, so that maybe it won't heat up so badly.
Blue screen errors are normally caused by either faulty memory or bad drivers (particularly video drivers). In your case, however, the blue screens are almost certainly caused by faulty memory. If you can't fix the problem by changing the timings or increasing the memory voltage, then it could be that your memory is incompatible with your motherboard (this happens sometimes). Try another brand of memory that has been tested as compatible with your motherboard. The manufacturer of your motherboard normally lists the tested compatible brands of memory on its web site.
Update: came home from work and memtest has been running for over 17 hours without an error.
@wolfram23 - I have one 120mm case fan in front and one 120mm case fan in the rear. I have run the RAM at 1.8 volts and still had a BSOD. I don't know whether I would get memtest errors with other sticks, but right now I am not getting any memtest errors.
@pjmelect - I agree, but what would cause this most recent scenario? Windows runs for several days (computer constantly on) without a problem. I restart the machine and put the memtest CD in. Memtest reports several errors immediately. I shut the computer off for an hour or so, start it up, and run memtest for over 17 hours w/o a single problem.
The fact that you get memory errors when the machine is restarted but is OK when it is hot suggests to me that a faulty power supply is the culprit. Faulty power supply issues can cause some funny symptoms.
Another indication of a faulty power supply is the location of the memory errors. If the errors are at the same addresses every time you run memtest86, this points to faulty memory, whereas if the errors occur at random locations every time you run the memory test, this indicates that the power supply is the culprit.
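If you note down the failing addresses from each run, the rule of thumb above can be sketched roughly like this. It is only an illustration: the function name and input format are made up, and treating "any address that fails in every run" as "same addresses" is a simplifying assumption.

```python
def classify_memtest_runs(runs):
    """Apply the same-address vs. random-address rule of thumb.

    `runs` is a list of collections of failing addresses (ints), one
    collection per memtest run, copied by hand from the screen
    (a hypothetical input format, not something memtest exports).
    """
    # Runs without errors tell us nothing about where the errors are.
    failing = [set(r) for r in runs if r]
    if len(failing) < 2:
        return "not enough failing runs to compare"
    # Addresses that failed in every failing run.
    common = set.intersection(*failing)
    if common:
        return "repeating addresses: suspect the RAM"
    return "no repeating addresses: suspect the power supply"


# Made-up addresses, purely for illustration.
print(classify_memtest_runs([[0x1F4A000, 0x2B00010], [0x1F4A000]]))
print(classify_memtest_runs([[0x1F4A000], [0x3C00020]]))
```

With only two or three runs this is a hint rather than a diagnosis, so more failing runs make the comparison more meaningful.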
Went ahead and ordered a new power supply this morning. My current PSU is 5 years old. Even if I don't need it, it's time to get a new one.
When I get the new PSU, I am going to reinstall Windows 7. I am not going to install the drivers from Gigabyte or ATI. I am going to let Windows manage the driver situation. The last 3 times I have installed Windows 7 on this computer, Windows has found all the hardware without a problem. In all cases I decided to install the Gigabyte chipset drivers and ATI graphics driver anyway. This time I am going to see how it works out without them.
I will keep you all posted as to what happens. If the PSU was not the problem, it may take a week or two for any problems to crop up. It seems my computer is great for a week or two, then I get random BSODs, then everything is fine for a while.
I appreciate everyone's help and advice. If anyone has anything to offer, please feel free to post.