Hi all,
I'm not sure if this is the right forum for this; I can't see a better one, but a mod can move it if it's more appropriate somewhere else. I posted this problem on Experts Exchange as well about a week ago, and it seems to have baffled a couple of people there too, so I'm hoping you guys can provide more insight. Apologies for the extreme length of this, but the problem is so weird, and I've done a comprehensive set of investigations so far, and I know you guys will want as much info as possible, so here goes. I'll first of all post my original core system specs so you know what we're talking about, then I'll give you the long history of the problem during the last week, but if you want to skip that initially, a summary of the situation is given at the end.
Thermaltake Soprano Black case
OCZ GameXStream 600W power supply
MSI Neo2 P35 Crossfire motherboard
Q6600 Quad Core (previously overclocked to 3.2GHz, but currently being tested at stock 2.4GHz)
4*2GB Corsair XMS2 DDR2 800MHz RAM (stock timings now)
2*Sapphire Radeon HD4850 512MB PCI-E graphics cards
Creative X-Fi Elite Pro soundcard
Texas Instruments 1394 network card (not actually used)
Samsung 22" 226BW monitor
--------------------------------------
LONG HISTORY
I run 32-bit XP as my main OS, and regularly make backups using Acronis True Image 8 (and for more recent ones, True Image 11, but it's TI8 restores that we're talking about). A few days ago, a spectacular virus attack got through the various defences of my homebuilt system (hardware firewall, software firewall, startup control, AV heuristics; it all fell apart), and left the XP installation unusable. After failing to get it completely under control again, I deleted the partition (after backing it up to an external drive in case I need anything from it in future), and decided to restore a disk image made 3 weeks earlier. The image was restored to the same disk that it originally came from; it was a full disk image restore (including MBR and Track 0), and it appeared to be successful. The restored XP installation was now bootable, and contained all the files that had been there when the image was made. So far so good.
Now the first problem: to my surprise, when trying to play a 3D game, it crashed. Tried another, it also crashed. Some others seemed fine: on investigation, the pattern was basically that if 3D was stressed, major graphics corruption occurred which, if left unchecked, would eventually lock up the system. Booting into Vista at the time, 3D graphics appeared to be fine; no crashes or lockups. A clean XP install similarly performed fine, but when I attempted to restore an earlier XP image, to a different hard drive, the same problem occurred again.
If that wasn't weird enough, things got a good deal stranger when I attempted to restore the original image once again (this is now the third image restoration). This time, it got to the Welcome screen fine, but I couldn't even log in, with complaints that it couldn't find the profile, and shortly after that, complaints about the filesystem (which I imagine were both from the same underlying cause). I restored the other image, again to a different drive, and had the same problem. So now, for two different images on two different drives, both of them showed one lot of strange behaviour on the first restore (graphics failure under stress) and a different lot of strange behaviour on the second restore (filesystem problems).
At that point I was prepared to put it down to just image problems, although I verified the images in True Image both at the time I made them and at the time I restored them, and they were fine. I can also browse and open the files within the images without any problems, so I don't believe they're corrupt, but in any case I tried restoring them once more, this time through my laptop using an external enclosure, and sure enough, that seemed fine: I could log in again! However, the system shortly became unstable, with lots of "missing" files (which weren't actually missing at all), causing many apps to fail. More filesystem problems. Restoring a different image to a different drive gave similar results again; after restoration via the laptop I can always log in, but the restored systems quickly show these missing file problems.
Putting it down finally to bad images, I turned back to my clean XP installation, which was having the usual Radeon driver fun. Finally found some drivers (9.6) which seemed to accept both cards being present, and what did I find? 3D graphics corruption under stress. Meanwhile Vista on the same machine continued to be fine, including 3D graphics.
Scrapped that, reformatted the drive and installed another clean version of XP. This seemed fine, and 3D graphics were fine. Meanwhile, my Vista installation decided it had had enough, and would no longer boot, citing missing or corrupt boot files. Replacing them one by one worked for each individual file (but there were way too many to make that a practical solution), suggesting that they really were corrupt or missing, but I have no idea what had suddenly caused it. The clean XP installation remained there intact, but increasingly (to more than half the time) locked up during the boot process (towards the end of the Windows logo screen), something which was not helped by either VGA mode or Safe mode.
I now figured it must be a coincidental hardware problem, since many different installations and restorations were all suffering ill effects at the same time, after a year of trouble-free usage right from the first build (and that inherited the XP installation from a previous build without any trouble). I'd already tried pulling out one of the graphics cards, and swapping over the existing one, so that was ruled out, and it's not clear that they could really cause system-wide filesystem corruption such as I'd seen anyway. The most obvious candidate was the motherboard, so I bought a different motherboard (Asrock P45XE), restored an XP image via the laptop again, and installed it, and off we went. First login was great; all seemed to be well. Uninstalled the old chipset drivers and utilities, installed the new ones, and all seemed good; could even run 3D apps no problem. No filesystem problems mentioned, no apps screaming that they couldn't find files. Thought I'd solved it. Then rebooted, and now the ATi drivers decided they didn't like me, and reliably gave me a 7E stop code and BSOD on boot. Pulled one of the cards out again, to get round that problem for the time being, and was able to log in again, but now filesystem problems once again appeared. Restored another image, same filesystem problems appeared.
Additional things I have done in regard to hardware include trying different hard drives, trying different data cables, swapping out memory modules, running memtest and Windows memory diagnostics (nothing found), running Prime95 to look for CPU problems on all four cores (nothing found), and checking temperatures (all fine now, including drives and motherboard). The only slightly odd thing I noticed was that Everest, reading voltages from the Winbond chip, gave a very strange reading for the 12v rail(s), fluctuating between 0v and 3v, and never going anywhere near 12; this is a known problem with Everest and some sensors, however, and the BIOS consistently reported the 12v voltage as being in the 12.0-12.4 range, so I don't think this is a genuine problem.
One test that I wish I had the facilities to carry out more comprehensively may reveal something important. Having restored an XP image to an IDE drive, I was able to test it in one machine at work (swapping the drives out). It's the only machine I can test it on, since all the other work machines are Dells without PS2 ports, and USB keyboards/mice don't work without the chipset drivers installed, but I can't log in to install them (and don't have the driver disk for them anyway). It's a rather old and slow machine and I don't have drivers for it either, so parts of the chipset, unfortunately including the onboard VGA driver, cannot be loaded for testing. It is notable, though, that when testing in this machine, XP boots fine every time, and none of the "[various_names].dll missing" problems for applications seen on the home machine have occurred, in half a dozen or more boots. Similarly, there have been none of the slowdowns, and none of the files that appear to be missing (they don't show up in Windows Explorer and can't be found by apps which need them) until half an hour or more of slow partial disk activity has passed, at which point they're gradually discovered, as happens with the restored XP image on the home machine. I dearly wish I had access to a more recent or complete machine with all the drivers on which I could test this definitively, but unfortunately I don't, and there is simply no way I can go and buy one (I've already spent more than I should on various bits of kit to test this problem!).
--------------------------------------
BRIEF SUMMARY
SOFTWARE
(1) Virus attack triggered the whole thing and required restoration of True Image backup of main XP installation.
(2) Backup restored okay, but graphics failed under stress. Different backup restored, similar problem. Different backup files, using different versions of backup software tested, same result.
(3) Further backups restored through the desktop were unable even to login due to crippling problems. Chkdsk reports filesystem problems, and interestingly, with the same problem file IDs in two different instances.
(4) Further backups restored via external drives connected to a laptop always seem to start okay, apart from the 3D graphics problem. Together with (3) this suggests that the desktop system is also failing under the stress of restoring a large (350GB) image.
(5) First clean XP installation appears okay, but after a while shows the same graphics corruption problem that the restored images showed.
(6) Second clean XP installation appears okay, graphics okay to start with, but rapidly develops boot lockup problem, and is eventually declared unbootable even though the files are still in place. This may be another example of a filesystem problem appearing for no clear reason.
(7) On the other hand, restoring the clean XP image initially appears to function perfectly well.
(8) Vista installation initially seemed okay, including graphics, but eventually also failed, again with boot file problems.
(9) Possible filesystem problems also developed on non-bootable data drive present while restored XP images were running.
Summary: different OSs on different drives all develop one or both of two types of problems: graphics corruption under stress, and filesystem/resource problems.
HARDWARE
(1) Memtest and Windows memory diagnostic find no problems with memory. Memory modules also swapped for a completely different set, but this appears to make no difference.
(2) CPU was overheating and causing system shutdown (due to thermal protection enabled in new BIOS) after motherboard switch, but new CPU cooler seems to have taken care of that. Prime95 torture test shows no problems with any CPU core.
(3) Motherboard changed for a new one from a different brand and with a different chipset. Seems to make no difference to the problem again.
(4) Several different hard drives tested, all verified as working before, some brand new, all with good S.M.A.R.T status, and the same problems occur on each. Similarly, different cables used. Some drives are SATA, some are IDE, tested on different ports on different motherboards.
(5) Two graphics cards tested, individually and paired together, both verified as working before and during the problem. Same problems occur with either/both.
So basically, I've changed/tested all the hardware, and it all appears to be fine, and yet these somewhat random and progressive errors remain in place, affecting different OS installations, some of them new, with graphics corruption under stress and/or filesystem problems. It's apparently not a hardware problem, it's not an image problem, it's not an OS or software problem, and yet it is a problem that seems to affect only my machine. Does it not like the decor in my apartment or something?!!
Any ideas at all? Anyone? I'm completely pulling my hair out over this; in ten years of building my own systems I've never come across anything like it. It seems simply inexplicable, and obviously I can't move forward even with a clean install until I've solved it, because clean installs also appear to fall apart after a short while. Meanwhile, thanks for reading, and apologies for the enormous length of this.