I feel like every single component of my server is breaking simultaneously. (Plus Windows Storage Spaces.)

Hi guys, it's been a while. I could really use your help with this one, though - I feel like the water is just about over my head.

About one month ago, the breaker to my house was flipped (by a third party). Two computers were running at the time, and while "my" computer had absolutely zero issues, my HTPC / server wasn't so lucky.

After some troubleshooting, I determined that one of the four 4TB WD Red drives was faulty, and returned it for an RMA. These drives are being used in a Windows Storage Spaces pool, set up as the equivalent of RAID 5.

After I replaced that drive, the pool was rebuilt, optimized, and seemed to function normally. Fast forward to three days ago, when I reinstalled Windows (Windows 10 Home x64). At first, right after the install, everything worked fine. Then, however, I ran the "Take Ownership" script to gain permissions on all the files in the Storage Spaces pool.

It didn't like that. Since then, the computer (and any other computer with these four drives in it) takes about half an hour to boot into Windows.

Once in Windows, the Storage Spaces pool reports itself as "offline due to a critical write failure, please add drives," and when I try to bring it online, Explorer freezes and eventually crashes. (And I can't bring it back using Task Manager.)

Between the slow startup and the issues with Storage Spaces, I have a bad drive, right?

Except that, when plugged individually into a dock on my other computer, each drive reports its S.M.A.R.T. data as being totally healthy.

Well, after talking to Western Digital, they wanted me to run their Data Lifeguard Diagnostics program on each drive, in the original system.

I run the quick test on the first drive, and it passes, no problem. I go to run the extended test, and it says it'll take six hours.

Six hours later, I get up to take a screenshot of the results and to start the test on the next drive. It claims that the extended test was a pass - great! I open Snipping Tool to save the results, but before I can press 'New,' the entire screen gets covered in a checkerboard of small translucent squares. Unfortunately, before I can grab my phone and take a picture, the system crashes and restarts into POST.

So... That's a graphics card failure, right?

Only I've also been having issues with the motherboard - it occasionally gives me an error where it loses the system date and time, and worse, occasionally gives an error claiming that a CPU change was detected, and to press Y to continue. These problems were not resolved by flashing to the latest BIOS revision.

So... It feels like everything is crumbling around me. I can't afford to replace this computer right now, and I can't afford to pay a data recovery specialist to save the 7.8 TB of files that I had on this server.

Specs are as follows:

Windows build 1803

Pentium G4400

ASRock Rack C236 WSI Mini ITX Server Motherboard

16GB of Kingston ValueRAM 2133MHz DDR4 ECC (unbuffered) Server Memory

Nvidia GTX 750 Ti

Toshiba OCZ TR150 2.5" SSD

4x Western Digital Red 4TB HDDs

Thank you so much for your time and any direction I should proceed with troubleshooting.
 
Solution
I agree, change the CMOS battery, reset back to factory defaults, save and exit.

Then go back in and make the required changes to meet your system configuration.

Could be you have bad configuration data and it's just screwing things up, seen it before.
 
... That's a thing, isn't it. I'm going to feel really dumb (but super happy!) if replacing the CMOS battery does the trick.

You know, I even tried removing the battery before flashing the new BIOS, but didn't think to try replacing it.

I'll do that when I get a chance to go buy one, and report back later today. Thank you!

(Though that still wouldn't explain why the drives cause other computers to take 20 minutes to boot - but hey, if I can at least resolve one problem, that'll make the other much less frustrating.)
 
Hi guys, I just wanted to say thanks again for your suggestion - it appears to have been spot-on!

I replaced the CMOS battery, ran memory tests, and everything appears to be good.

The hard drives are currently in an external dock on another computer, having WD Data Lifeguard Diagnostics run on them, which will take a few days at 6+ hours apiece.

I'll update after that finishes.