Ultra DMA CRC Error Count, cable issue?

Master Philip

Reputable
Oct 28, 2015
6
0
4,510
About a month ago my PC (running Win 10 Pro) started to crash randomly, and when copying some files I noticed file corruption on the target drive. The system then behaved for a few days before the problem escalated to complete freezes. When the system wouldn't restart, I was forced to manually reset the PC, which prompted a lengthy text-mode check disk (chkdsk) on boot up. The screen scrolled for what seemed like forever, showing "Deleting extended attribute set due to the presence of reparse point in file ######". No bad clusters were found, but the system was still unstable and crashed a few more times after this.

I then booted into Win 7 on the other HDD in this system to see if I could figure out what was going on. I ran HD Tune and let it do a full media scan: no errors on the disk surface, but the health section showed a high value for the Ultra DMA CRC Error Count (this increased further as I continued battling to pull data off the affected drive). I should also point out that the affected Win 10 HDD and the Win 7 HDD are the same type, WD 1TB Blue SATA, the former being less than 3 years old and the latter a little over a year old now.

After reading that it could be a SATA cable issue, or bad clusters, or the HDD controller on the drive, and then realising that the warranty for these drives is only 2 years now, I decided to play around with the cables and troubleshoot a bit. Since disconnecting and reconnecting the SATA cable to the affected drive, I have also done a full reinstallation of Win 10 Pro after formatting the drive and creating new partitions from the Win 10 Pro installer. So far, the Ultra DMA CRC Error Count has not changed since I played around with the cable, and the OS on the drive appears to be running stable.

When the drive started to act up, it was running a bit low on space, but not to the point where it could no longer function. Is it possible that this was all caused simply by the cable and the way it was plugged in at the time? Maybe some dust? I do clean my PC from time to time, but it does sometimes become a bit dusty. HD Tune shows the overall health as OK, and all the attributes apart from the Ultra DMA CRC Error Count (Current 200, Worst 1, Threshold 0, Data 403274) appear normal. The WD Lifeguard diagnostics app also shows the drive as being in perfect health, and in fact doesn't even show anything wrong with the Ultra DMA CRC Error Count.
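
As a rough illustration of how those four numbers relate, here is a minimal sketch. It assumes the common SMART convention that an attribute only counts as failed once its normalized value drops to or below its threshold. The class, field, and function names are hypothetical and are not part of HD Tune or the WD tool.

```python
from dataclasses import dataclass


@dataclass
class SmartAttribute:
    """One row of a SMART table (field names are illustrative)."""
    name: str
    current: int    # normalized value; higher is better
    worst: int      # lowest normalized value recorded so far
    threshold: int  # failure limit for the normalized value
    raw: int        # vendor-specific raw data, here an error counter


def has_failed(attr: SmartAttribute) -> bool:
    # Assumed convention: failing now means current <= threshold.
    return attr.current <= attr.threshold


def has_ever_failed(attr: SmartAttribute) -> bool:
    # Same assumed convention, applied to the worst value seen.
    return attr.worst <= attr.threshold


# Values as reported by HD Tune in the post above.
crc = SmartAttribute("Ultra DMA CRC Error Count", 200, 1, 0, 403274)
print(has_failed(crc), has_ever_failed(crc))
```

Under that assumption, a threshold of 0 means the check can never trip, which may be why the overall health still reads OK even with a raw count of 403274. The raw count is the number worth watching.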
 
Yes. I've had similar issues caused by cables.
Even though the manufacturer's software shows the drive as fine, you should still make sure you have a recent backup. Beyond that, just use the drive as you normally would and check the CRC error count every once in a while.
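
One way to do that periodic check is to note down the raw count and compare it with the previous reading. The following is a minimal sketch, assuming the raw values are copied by hand from a SMART tool such as HD Tune. The function names and messages are hypothetical.

```python
def crc_errors_added(previous_raw: int, current_raw: int) -> int:
    """Return how many new CRC errors appeared between two readings."""
    if current_raw < previous_raw:
        # The raw counter is assumed to be cumulative, so a drop
        # suggests the readings came from different drives.
        raise ValueError("raw count decreased between readings")
    return current_raw - previous_raw


def assess(previous_raw: int, current_raw: int) -> str:
    """Summarize the change in the raw CRC error count."""
    added = crc_errors_added(previous_raw, current_raw)
    if added == 0:
        return "stable: no new CRC errors since the last reading"
    return f"{added} new CRC errors: recheck the cable and connectors"


baseline = 403274  # raw value quoted in the original post
print(assess(baseline, baseline))      # unchanged reading
print(assess(baseline, baseline + 5))  # hypothetical later reading
```

A count that stays flat after reseating the cable is consistent with a connection fault that has been fixed. A count that keeps climbing suggests the cable, port, or drive still needs attention.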
 

Master Philip

Fortunately I was able to back up everything off the drive via Win 7 onto an external drive, so all I can do is start using it and keep an eye on that CRC error count. I can only hope that it was just one of those random and unfortunate things. At least now I have a fresh installation of Win 10 Pro; the previous one was still the upgrade installed over Win 7.
 

Paperdoc

Polypheme
Ambassador
Agree with the cautionary notes above. But yes, that problem could have been caused simply by a cable problem.

One of the common types of trouble I've seen - and fortunately, easy to correct - is caused by slow oxidation of the surfaces of the metal contacts inside connectors, and of the fingers of the mating male connectors. The oxide layer acts as an insulator, yielding poor and inconsistent electrical contact and resulting in bad data transfer. The fix can be this simple. With the system shut down, do this to all related cables where things connect. Unplug, then re-connect. Repeat several times. Go on to the next. Make sure to do BOTH ends of a cable. Be careful not to damage or dislodge any other connections. When done, re-inspect everything, looking for out-of-place or loose things. Then close up and boot.

This action can "scrub" the oxidized surfaces clean so they work again. It is not permanent - you might have to repeat in a year or two. It is easy to do, costs nothing, and harms nothing if you don't break things by mistake. Thus it's often my first stab at fixing intermittent problems. If it does not solve the issue, you know to look for other stuff.
 

Master Philip

Thanks! It makes sense that this could have been the case. My PC is doing very well for its age: it will be 9 years old towards the end of the year, but as a first gen i7 with a decent MSI motherboard and other good components, it is still a capable machine. At some point I will need to replace it, but while it's still usable, I can wait a bit longer. I think with my fiddling I most likely did exactly what you said above; maybe a combination of dust and some oxidization in the contacts was causing this. I will most likely give the machine a semi breakdown, pull out various components, cables etc., and give everything a nice clean. That usually gives an older machine a new lease on life.
 

Paperdoc

You're welcome. I tend to keep things repaired and they get old while still working well. This machine is older than yours - how's that for a weak brag? In my family, cars tend to be over 15 years old before they have so many problems I drive them to the scrap yard. Right now our two oldest are 1999 model year.
 

Master Philip

I am the same. I prefer to make things last as long as they can. Interestingly, none of my cars are less than 10 years old, but all are in good and reliable condition. My PC before the current one was one that I built back in mid 2001. Apart from a few changes over the years (an added HDD here and there, more RAM, a GPU and sound card, and loads of cooling fans), it ran reliably until mid 2016, when the PSU first seemed to be giving out and then finally did. So I replaced the PSU, but the system started to crash after being on for a while, and it eventually got to the point where it could not load any sort of GUI. Even booting into a Linux boot installer would cause it to crash. I literally pulled the system apart and had the motherboard (Asus A7V-133 with AMD Athlon Thunderbird 1.3 GHz CPU) with all its peripherals mounted outside the case in order to troubleshoot and hopefully determine which device or component was causing this. I tried pulling individual RAM modules, disconnected all HDDs and other drives, put the old GPU from years ago back in, and removed the sound card. I even went so far as to remove the CPU and heat-sink to clean them and apply fresh thermal paste. Unfortunately nothing has worked, so I can only suspect that the CPU or motherboard itself has decided to kick the bucket. It's sad, because I was able to keep that dinosaur running for so many years, and even the old 80 GB Seagate ATA-100 HDD is still in perfect health.