How critical is one Memtest86+ error?

zennehoy

Distinguished
Feb 11, 2004
17
0
18,510
Hi all,

I just put together a new computer, and am having quite a few problems with - data corruption? - something.

Installing Windows 7 x64 was mostly a matter of luck, as I kept getting the corrupt file error 0x80070570. It kept failing at different points of the install process, so I kept trying, and finally managed to get it to complete - with quite a few misgivings.

Installing drivers and some other stuff mostly went fine, except for Bioware's Dragon Age, which had problems similar to Windows (data errors at random places in the install) until I got lucky after 20 or so tries.

While Windows mostly runs stable (except for a corruption of Avast Antivirus that forced me to un- and reinstall it), more intense applications (e.g. Dragon Age) tend to crash fairly frequently.

Obviously I considered bad RAM, so I let Memtest86+ run today. In 19 passes running for almost 11 hours I got exactly 1 error in one of the passes...

Considering how poorly the system behaves otherwise (e.g. display driver periodically crashing, even when on desktop) I somehow can't believe that something as rare as once in 11 hours is causing all that trouble.

An HDD test off the UBCD ran without a hitch. Of the CPU tests, only the Mersenne Prime Test (v24.14) fails (within 10-15 seconds) and tells me about it, but maybe that test is faulty? Memtest86+ failed at 3189.6MB if that's any help, but I haven't been able to repeat it even when testing just that range of memory.

Any help as to what I can do to resolve this would be greatly appreciated! Or can it be the RAM even at those odds?
Thanks!
Zen

P.S. No overclocking, though I had to manually specify the memory settings (it defaulted to 1066 MHz 7-7-7-16 @ 1.5V rather than 1333 MHz 7-7-7-20 @ 1.65V)

System Info:
Intel Core i5 750
Gigabyte GA-P55A-UD4 Motherboard
OCZ PC3-10666 Platinum Low Voltage RAM - 2x2GB
Gigabyte GeForce GTS250 1024 MB
BeQuiet! Pure Power 530W
Seagate 500GB 7200 Hard Drive
LG GH22NS50 DVD Writer
 
This isn't like 1 dead pixel on an LCD monitor. If I get an error running memtest86+ at stock settings (including the RAM's manual settings), it is RMA time. Those programs may keep trying to use that exact address, and that may be what is corrupting your installs. RMA the RAM for a replacement, then run memtest86+ at stock settings (including the RAM's manual settings) and check that it reports no errors. If it comes back clean, reinstall everything.

If you overclock, set your system to boot to the media with memtest86+ (USB stick, CD, floppy), and let it test your overclocked RAM before you let it try to boot to your OS.
 

ekoostik

Distinguished
Sep 9, 2009
1,327
0
19,460
1 error is still an error. Something's causing your problems, and right now that's the biggest lead you have to go on.

Of course that PSU seems suspect as well.

If you haven't done so already, you should update your BIOS to the latest. A lot of stability issues and memory compatibility problems have been addressed since these boards were released.

What temperatures are you getting, in idle and when running full load? P95 failing so quickly is not a good sign. Is your heatsink seated securely? You should see the push pins extending through the back of the motherboard.
 

zennehoy



Yeah, it's going back. Just want to try and discover if there's any other component I should RMA so I don't keep waiting for new components to arrive...



That's exactly the problem - finding a bug is so much harder than fixing it :)



Hmm, that PSU got pretty good reviews all around. Since I switched to BeQuiet a few computer generations ago I've never had a problem there. Have you had any bad experiences with BeQuiet?



I really don't want to do any sort of firmware updating until I can be fairly sure that the updater doesn't get corrupted half way through...



Idle temperature was somewhere around 40, directly after mprime failing it was maybe at 48 degrees, though I tried some of the other CPU stress tests on the UBCD as well and didn't look too closely at the temps. I've definitely not seen it pass 50 though, and am very sure that it's seated properly.

I started another memtest this morning - curious what the result will be when I get home tonight.

Are there any other (freeware) tests I could run that factor in the entire system? So far I've only been able to test CPU, HDD and RAM separately (though I guess CPU tests need memory too...)

TC,
Zen


 

It's not an Antec, Corsair, or Seasonic.
 

ekoostik


Check out Prime95. It has a variety of tests to throw at your system (Blend, Torture, Large FFTs) that mix CPU and RAM stress in different proportions. I like to use this to test Turbo mode on these new chips. You can spin up one thread and, with CPU-Z open, watch your multiplier and processor speed shoot up: http://www.mersenne.org/

Also if you really want to stress test your system check out IntelBurnTest, it looks like v2.4 is the latest: http://www.softpedia.com/get/System/Benchmarks/IntelBurnTest.shtml

Ok, I haven't really gotten you away from CPU and RAM tests. PCMark and SiSoftware Sandra are other good utilities. Not sure what's freeware but they may have a test/trial version.
 

zennehoy

Alright, so I had memtest run all day again today, and it gave me 4 errors at the same address as yesterday.

I also tried out Prime95 under Windows 7. The first test (low memory usage) runs fine, the second test (some memory usage) errors out after about 30 seconds, and the third test (fair share of memory testing) errors out within the first second or so.

Taking out the second RAM stick and running just with the one in slot 1 (which I assume takes on the lower addresses, so should be the fault-free one) let me get through the "Expanding Windows files" bit of installation without a hitch first try. So I'm pretty sure it's that one bit error, which probably messed up some system files which triggered other errors.
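The guess that the failing address belongs to the second stick can be checked with a little arithmetic. This is only a rough sketch: it assumes a simple linear mapping in which the first 2 GB stick covers the lowest addresses, which may not hold on a real board (dual-channel interleaving or memory remapping can spread addresses across both sticks). The function name `module_for_address` is made up for illustration. Testing one stick at a time remains the reliable way to tell.

```python
MODULE_SIZE_MB = 2048  # each stick is 2 GB (2x2GB kit from the system info)


def module_for_address(address_mb, module_size_mb=MODULE_SIZE_MB):
    """Return the 0-based module index, ASSUMING a linear, non-interleaved mapping."""
    return int(address_mb // module_size_mb)


failing_address_mb = 3189.6  # address reported by Memtest86+
index = module_for_address(failing_address_mb)
print(f"{failing_address_mb} MB falls in stick #{index + 1} under a linear mapping")
```

Under that assumption the failing address lands in the second stick, which is consistent with the install succeeding on the first stick alone.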

Sending the RAM back tomorrow, hopefully a clean install with new ram will fix my stability problems...

I'll keep you posted.
Thanks!
Zen
 
You might want to run memtest on each stick individually to identify the stick in error.

Then, run the good stick in each of the ram slots to verify that there is no defect in the slots themselves.

When running prime95, let it run long enough for the temperature increase to stabilize. Speedfan can chart the temps.
Also, watch the multiplier on cpu-z to be certain that it does not drop. If the cpu gets too hot, it will lower the multiplier to protect itself.
 

zennehoy



Unfortunately the sticks are already in the mail... But I doubt it's likely that a defective ram slot would result in bit errors at just a single address? On three separate memtest runs that gave me errors with different memory timings, it's always been the same address that failed.

Thanks for the tip though, I'll keep that in mind for future troubleshooting...
 
The odds of a RAM slot failing are very, very small. I'd chalk everything up to bad RAM, as that basically covers the failed installs, etc.

I had the same thing happen when one of my sticks started to go bad; first I got .md5 (think Checksum) errors and failed installs, which was solved once the RAM was replaced.
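A checksum comparison like the one mentioned above can be sketched in a few lines of Python. This is a minimal illustration, not a diagnostic tool: the helper names `file_md5` and `hashes_are_stable` are hypothetical, and a stable result does not prove the RAM is good, since repeated reads may be served from the same cached copy. Comparing against the checksum published by the download source is the stronger check.

```python
import hashlib


def file_md5(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hashes_are_stable(path, repeats=5):
    """Hash the same file several times; differing digests hint at corruption."""
    digests = {file_md5(path) for _ in range(repeats)}
    return len(digests) == 1
```

If `hashes_are_stable` ever returns `False` for a file that nothing is writing to, something between the disk and the CPU is corrupting data, and RAM is a prime suspect.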
 

zennehoy

Well, it's been a while (RMAing the memory took forever), but I thought I'd post a final answer:

YES, one Memtest86+ Error is CRITICAL!

I am still surprised that it took Memtest86+ several hours to find a memory error that manifested itself within minutes of normal operation, but so it is.

With new RAM everything has been running stable as a rock for two weeks.

Thanks for all your help!
Zen
 
Good job.
You did well by being persistent enough to run the test for several hours.

Memtest runs a number of passes using various access patterns and data. Most of the time bad RAM is identified on the first pass.
Sometimes the RAM fails only with a specific access pattern and data. It can take a long time to get to such a specific test. When you run your system, it will execute the same way every time, and when it uses the specific combination of access and data, you will get the failure in the same place at the same time.
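To illustrate why a fault can hide from most test patterns, here is a toy simulation. It is not how Memtest86+ is implemented; the `FlakyMemory` class, the failure rule (one cell flips a bit only when its lower neighbour holds a particular value), and the chosen patterns are all assumptions made up for the example.

```python
class FlakyMemory:
    """Toy memory: one cell misreads only when its lower neighbour holds a trigger value."""

    def __init__(self, size, bad_index, trigger_value):
        self.cells = [0] * size
        self.bad_index = bad_index          # must be >= 1 so a lower neighbour exists
        self.trigger_value = trigger_value

    def write(self, index, value):
        self.cells[index] = value & 0xFF

    def read(self, index):
        value = self.cells[index]
        if index == self.bad_index and self.cells[index - 1] == self.trigger_value:
            return value ^ 0x01             # simulated single-bit error
        return value


def run_pattern(memory, pattern):
    """Write pattern(i) to every cell, read everything back, return failing indices."""
    size = len(memory.cells)
    for i in range(size):
        memory.write(i, pattern(i))
    return [i for i in range(size) if memory.read(i) != pattern(i)]


memory = FlakyMemory(size=64, bad_index=40, trigger_value=0xA5)
patterns = {
    "all zeros": lambda i: 0x00,
    "all ones": lambda i: 0xFF,
    "address as data": lambda i: i,
    "alternating A5/5A": lambda i: 0xA5 if i % 2 else 0x5A,
}
results = {name: run_pattern(memory, pattern) for name, pattern in patterns.items()}
for name, failures in results.items():
    print(f"{name}: failing cells {failures}")
```

In this model only the alternating pattern puts the trigger value next to the bad cell, so the first three tests pass cleanly and the fault shows up only in the last one.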

Since you are now running well, I advise against trying to update your BIOS. A failed BIOS update can render your motherboard permanently useless. Update the BIOS only if the update fixes a problem that needs to be fixed. If you do, research it and make certain you do it properly. If you can, connect your PC to a UPS during the update to rule out a power failure in the middle of it.
 
Everything the OP has posted points to bad RAM.

RAM is touchy; I chalk this up to failing, but not yet failed, RAM. It sometimes works (or at least gives the proper value), so in a testing environment, the odds of hitting the failure are low. But in actual use, where data is being cycled every few seconds, the odds of detection are much, much higher.