
Replaced Motherboard, but RAID BIOS Still Won't Load

Last response: in Overclocking
September 13, 2008 10:31:42 PM

I have the following system configuration:

Gigabyte GA-P35-DQ6
XFX PVT80GTHF9 GeForce 8800GTS 640MB
Intel Core 2 Quad Q6600
Mushkin Hp2-6400 Ddr2 4gb Kit
Four Western Digital Caviar HD 500G|WD 7K 16M SATA2 WD5000AAKS
SILVERSTONE TEMJIN SST-TJ06S-W Silver Aluminum
Seasonic S12 Energy Plus SS- 650HT Power Supply
LG GGW H20L Internal Blu-ray Disc Rewritable Drive
HP LP3065 30" LCD Display
Turtle Beach Montego DDL 7.1 Dolby Digital
Pioneer DVR-112D
ZALMAN 9700 LED 110mm 2 Ball CPU Cooler
Contour Shuttle-Pro


This system is one year old. It has been clocked at 390MHz FSB for most of that time, and has been 100% reliable except for two instances where I powered the machine down overnight as an energy-saving effort and powered up the next morning only to find non-booting status/BIOS hang.

The first incident happened June 10 and it resolved itself after two days of rebooting, clearing CMOS, reprogramming BIOS several times, clearing CMOS several times, upgrading and then downgrading the BIOS, and then... suddenly the machine was back to normal.

Three weeks ago, I dared to power it off overnight on a Monday, again. The next morning, Tuesday, I powered on and Windows booted to a BSOD.

To make a long story short, I discovered the RAID BIOS was no longer loading, therefore I lost access to the Windows swap file (on RAID volume), about 950MB of client video and my LG Blu-ray burner, all on the RAID controller ports.

After much experimenting in BIOS, I discovered that, since that Tuesday, the RAID controller BIOS will only load IF the CPU Host Clock Control is DISABLED.

After two weeks of clearing CMOS, upgrading BIOS, playing with every option imaginable, I have been unable to gain access to the RAID controller, so I concluded that the motherboard was shot from a power on glitch. (I had already disconnected everything but the CPU, RAM and graphics processor.)

So last week I bought a replacement P35-DQ6 motherboard and swapped out the board today. I was certain this would solve my problem since the old board seems to have a bad RAID controller, right? Well, it made no difference at all. I still have "RAID BIOS not loaded!" appearing during boot process.

That leaves me wondering if I have the odd luck of getting a new "defective" motherboard with exactly the same defect, or if my GPU, RAM or CPU are somehow able to fake a motherboard RAID controller failure?

To test this, I will have to purchase these three items to replace what I've got now.

So I have to look for the most likely cause of RAID controller not loading when CPU Host Clock Control is enabled. So what do you guys think?

CPU?
GPU?
RAM?

I've already swapped the RAM modules between sockets, hoping that might reveal something, but no difference I could detect.

This system ran beautifully at 3.51GHz for 12 months, until I shut it off a couple Tuesdays ago. Now it will only run at the standard 2.4GHz setting, unless I don't care to access the RAID devices.

This is a puzzling behavior which screams "motherboard failure", but as I've replaced the motherboard and still have the problem, I am down to replacing the last three items.

I've got digital photos of each BIOS screen that I saved from when the system was working properly, so I was able to recreate the exact settings for a working overclock, but the problem is the system won't load the RAID BIOS anymore with these settings (or any other settings that involve enabling CPU Host Clock Control.)

Any ideas what would cause a RAID BIOS not to load with even a slight overclock and the PCI bus locked at 133MHz?
September 14, 2008 12:31:48 AM

I'm not going to chastise you for not backing up your client's work onto a non-RAID drive, but this post from SomeJoe7777 might help you.
SomeJoe7777 said:
You use Runtime.org's RAID Reconstructor to destripe the drives to an .img file on a 3rd drive. Then you use Runtime.org's GetDataBack for NTFS to pull files out of the .img.
Other than that I would suspect the CPU then the RAM, given the abuse. You would think the controller is the same, but I confess I don't do RAID for this very reason.

Run Prime95 small FFTs and memtest86+ and see if you can invoke a failure.

PM SomeJoe7777 to get more details on the saving your butt thing.
September 14, 2008 1:01:52 AM

I started getting suspicious of the PCI bus, given that only the RAID BIOS was failing to load on overclock. So I started manually walking it down in frequency until the system booted and the RAID became visible. That point was 120MHz, down from the "default" of 133MHz. In "auto" it would not work either.
But there is some confusion about what the default PCIe clock should be. Some web sites say it's 133MHz. Others are saying 100MHz.
Why my system worked before and then lost its CMOS settings is in question now. I could have sworn it was set to AUTO or 133MHz for the PCI clock.
At any rate, it seems to be working now with the clock down to 100MHz. But I am not sure if I'm underclocking the PCI bus or not. What IS the correct speed for this clock?
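
For reference, here is a rough sketch of how far each of the settings mentioned above sits from the reference clock. It assumes the BIOS option in question is the PCI Express frequency and that 100MHz is its nominal reference; the function name is made up for illustration.

```python
# Assumption: the setting is the PCI Express reference clock, nominally 100 MHz.
PCIE_REFERENCE_MHZ = 100.0


def percent_over_reference(setting_mhz, reference_mhz=PCIE_REFERENCE_MHZ):
    """Return how far a clock setting is above (+) or below (-) the reference, in percent."""
    return (setting_mhz - reference_mhz) / reference_mhz * 100.0


if __name__ == "__main__":
    # Values taken from the thread: the stuck setting, the first one that booted, and the final one.
    for setting in (133, 120, 100):
        print(f"{setting} MHz -> {percent_over_reference(setting):+.0f}% vs reference")
```

By this arithmetic, 133MHz would be roughly a one-third overclock of that bus and 120MHz about a fifth, so 100MHz is not an underclock under the stated assumption.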
September 14, 2008 1:07:22 AM

Apparently I should have read more closely, 100 is what you want. I thought 125 was the maximum stable. Do you have links to the 133 recommendation? I would like to understand their reasoning. Is this on the uber gaming site?


I went back and looked again and sure enough you did say in the last line 133 PCIe bus, I should have caught that. I'm surprised it was stable at all.

If you want a gaming machine then you should get one. If you want a work machine then you should get one. Never the twain shall meet.

I'm all for OCing, but if your livelihood is at stake, use moderation.
September 14, 2008 4:08:27 AM

I think what happened here is that the CMOS, upon startup a few Tuesdays ago, inserted random values in some of the parameters. Certainly the PCIe bus was not running at that speed all year!
Back in Aug 2007, I had thoroughly vetted out 3.51GHz as a conservative, safe operating speed. The machine could push to 4GHz, but was thermally-limited, so I chose 3.51 to stay under 70°C on a 24-hr Prime95 torture test. Memtest86 ran 24 hours with no errors too, so I was satisfied with that.
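
As a quick sanity check on those numbers, the core clock is the FSB (host clock) multiplied by the CPU multiplier. The sketch below assumes the Q6600's stock multiplier of 9, which is not stated anywhere in this thread, and the helper names are only illustrative.

```python
# Assumption: the Q6600 runs at its stock multiplier of 9 (not stated in the thread).
Q6600_MULTIPLIER = 9


def core_clock_mhz(fsb_mhz, multiplier=Q6600_MULTIPLIER):
    """Core clock in MHz for a given FSB (host clock) and CPU multiplier."""
    return fsb_mhz * multiplier


def fsb_for_target_mhz(target_core_mhz, multiplier=Q6600_MULTIPLIER):
    """FSB in MHz needed to reach a target core clock at a fixed multiplier."""
    return target_core_mhz / multiplier


if __name__ == "__main__":
    print(core_clock_mhz(390))                  # the overclock used for the past year
    print(core_clock_mhz(266))                  # the stock 266 FSB setting
    print(round(fsb_for_target_mhz(4000), 1))   # FSB that a 4GHz run would need
```

With that multiplier, 390 FSB works out to 3510MHz, which matches the 3.51GHz figure, and the 266 FSB setting gives roughly the standard 2.4GHz.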

I did run across some reports that F4 BIOS was unstable and tended to lose CMOS settings sometimes. Hopefully now that I upgraded to F7 BIOS, it should not change CMOS values when I power up the machine.

What really threw me off for three weeks was the fact that I saw that 133MHz in there and had been convinced that's a normal number. How it randomly got there from a power cycling event, I may never know!

Back in June, when I'd had the other incident where the machine would not boot, the PCI speed set itself to AUTO and the PC would not boot if the host clock was not set at exactly 266 FSB. It took me a while to notice the AUTO setting and I guess I changed it to 100MHz that time and all was back to normal. This time, instead of setting itself to AUTO, it set itself to 133MHz at power on a few Tuesdays back. And I've been chasing my tail with this ever since. How the CMOS can randomly change certain parameters at power on like that is baffling. This is the first motherboard I've owned that did not retain its settings. Hopefully F7 BIOS will clear up these random losses of CMOS data.
September 14, 2008 8:40:42 AM

I keep watch over 3 of those mobos. I have seen some freaky stuff on one of them. Is it RAM? Maybe.



No uberclock though.

Oh wait, I can assure you that it was never a 133 PCIe.

No offense, but if you know better, then why?

Still waiting on the links to the sites that recommend 133. Can you work that out?
September 15, 2008 2:02:54 PM

I read on one of the forums recently that the F4 BIOS is flakey and does things like this. I've upgraded to F7 and I'll be watching it carefully from now on.

From what I can see now, it was never supposed to be 133MHz, however, the number didn't stick out as unusual, so I never even noticed it, nor did I ever suspect that the CMOS would change the clock value for the PCI bus all by itself. If it had put in some truly random number, like 147, I would have noticed the odd number and fixed it right away. But 133 looked like a typical value I'm used to seeing in CMOS screens, so I never gave it a second thought.

The odd part that still has me pondering the whole situation is why did the CMOS change the PCI bus from 100 to 133 when I powered down/up the machine overnight? What mechanism allowed that?

I spent hours combing through scores of web sites and I didn't bookmark the site that suggested 133 (or any of the sites for that matter) and since I don't have three hours to spend looking for that site again, I don't feel that's a $$$ efficient way to spend my time. I've already lost two weeks of productivity due to this mess and I'm backlogged with editing work and clients calling on the telephone asking where we stand with projects. Sorry.
September 17, 2008 12:35:06 AM

Yeah, 133 is off the charts. Who knows why these mobos do some of the flaky things they do. With the higher OCs they seem to get more flaky, but that's a purely subjective opinion.
September 17, 2008 4:24:07 AM

Yeah, there's all sorts of complex activities going on in any modern PC motherboard and only the design engineers may have a realistic understanding of the nuances of board operation (and even then they probably regard some aspects as a black art).

I got my replacement PSU from Seasonic RMA today and installed it. System runs fine, but a lot hotter with the 650W than with the 430W of the same brand that I was borrowing from another system for the past week. When I put my hand over the exhaust on the smaller PSU, it was pushing out air. When I do the same on the larger PSU, I can't feel any air moving at all. The original 650W PSU that failed also ran its fan so slow it didn't move any air that you could feel. I'm a little concerned because in the first three months, this system had two hard drives fail with bad electronics, not head crashes. I'm planning to add two more drives next week for another RAID volume and that will further increase heat output. I may have to hang an external fan off the PSU's exhaust vent to pull out the heat more quickly, as much as I hate to have that kludge on my nice workstation.
September 17, 2008 2:40:13 PM

Not pretty, but certainly an option.