Single-Bit Errors Not Detected by Memtest86+

EddieAtherton

Honorable
Jun 28, 2012
Hi,

I've started to see a rash of single-bit errors on my system. I first noticed this when I had to copy the contents of a disk to a new one. To make sure the copy was good, I ran md5deep across the source before copying, and then across the destination afterwards. This showed some weird results: I'd get different checksums for the same file if I ran it multiple times. After finding a few files that really were different between source and destination, I compared them and found single-bit errors between them.
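For anyone wanting to reproduce the "same file, different checksum" check without md5deep, here is a minimal Python sketch. The function names (`md5_of_file`, `repeated_digests`, `is_stable`) and the default of 5 runs are my own choices for illustration, not anything md5deep provides. Bear in mind that repeated reads of a small file may be served from the OS cache rather than the disk, so this mostly exercises the path through memory.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def repeated_digests(path, runs=5):
    """Hash the same file `runs` times and return the set of distinct digests."""
    return {md5_of_file(path) for _ in range(runs)}


def is_stable(path, runs=5):
    """True if every run produced the same digest (what a healthy system should do)."""
    return len(repeated_digests(path, runs)) == 1
```

If `is_stable` ever returns False for a file that nothing is writing to, the data is being corrupted somewhere between the disk and the hash function.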

My first thought was a memory issue, so I ran both Microsoft's memory checker and Memtest86+. Both ran flawlessly over an extended period of time. However, the issue with different checksums for the same file persisted.

Next I pulled the 2 sticks of "no name" brand memory from the system, leaving behind the 2 sticks of Corsair. Again the memory tests worked perfectly. But this time, so did the checksums on the disks. I was able to run those many times without a single "mismatch". Problem solved, or so I thought.

Yesterday I downloaded some stuff from Usenet and was surprised to see a high percentage of the files failing a par2 check. Running the par2 recovery didn't seem to help, as it still reported many errors. By using WinRAR's recovery I was able to compare a bunch of the "broken" files with the recovered ones. Yep, every single one of them had a single-bit error.
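The comparison of a "broken" file against its recovered copy can be sketched in Python as below. This is only an illustration: `bit_differences` and `compare_files` are made-up names, the files are read fully into memory (so it suits modest file sizes), and if the two files differ in length only the common prefix is compared.

```python
def bit_differences(data_a, data_b):
    """Return (byte_offset, bit_index) pairs where two byte strings differ.

    bit_index 0 is the least significant bit of the byte.
    Only the common prefix is compared if the lengths differ.
    """
    diffs = []
    for offset, (a, b) in enumerate(zip(data_a, data_b)):
        x = a ^ b  # set bits mark the positions that differ
        bit = 0
        while x:
            if x & 1:
                diffs.append((offset, bit))
            x >>= 1
            bit += 1
    return diffs


def compare_files(path_a, path_b):
    """Read both files fully and list every differing bit."""
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        return bit_differences(fa.read(), fb.read())
```

A result with exactly one entry is a single-bit error. If the same bit index keeps turning up across many files, that would hint at one stuck data line rather than random corruption.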

So, how do I track down what's causing this, and is there a good tool to detect it, since Memtest86+ doesn't seem to find the errors?

My system specs are:

Asus Crosshair II Formula
AMD Phenom X4 9600 Black Edition
(Currently) 2 X Corsair CM2X2048-6400C5DHX

Cheers.
 

EddieAtherton

Honorable
Jun 28, 2012
After further searching in the forums, I found a post suggesting I might be better off raising the DDR voltage a tad, so I thought I'd try that to see if it helped. Except I can't tell whether the voltage has actually changed. The BIOS shows it has, but CPU-Z still shows the original value:

BIOS1.jpg


BIOS2.jpg


CPUz.jpg


So, which is telling the truth? :D

Cheers.
 

rusabus

Distinguished
May 19, 2007


CPU-Z is just reading the values from the SPD chip on your memory modules. That chip tells the motherboard what to do when it is set to auto-configure the memory. From the look of things, your memory modules should be run at 1.8V. 2.0V may be too much for them, and you may have better luck if you drop the voltage back down.

That said, your data corruption issues are just as likely to be HDD related. Memtest86 and the Windows memory diagnostic will both find single-bit memory errors. Have you run any full scans of your hard drive? I would bet you have a few bad sectors. I'd suggest using the DOS-based utility from your HDD manufacturer. Most of these utilities can be burnt to a bootable CD and can run a full media scan (and even remap bad sectors, essentially fixing the drive).

--Russel