PC randomly doesn't boot after sleep + no POST + BSOD

Guest
I am unsure which forum category would be the best, please feel free to delete the one which doesn't belong!

I'm having a series of problems I'm just losing my head over.

1) First off, I never had a single stability problem while the PC is running, in fact it's quite new and I've been using it without any issues. However, as soon as I put it to sleep there will be a chance that it will simply not wake up next time, regardless of sleep duration, forcing me to turn it off and on again.

2) When this happens, it takes a random number of reboot attempts, because sometimes the POST simply does not happen. I've tried every combination of button presses, USB unplugging and power cuts.

3) When it decides to boot, it will sometimes throw a Plug & Play BSOD during Windows startup, and the randomness starts again. Again, I've tried every combination of reseating, unplugging and changing USB ports and devices.

4) When it finally decides to boot Windows, I browse through my minidumps and see that while I had lots of BSODs, Windows decided to just generate only one dump, i.e. only the last BSOD that happened got dumped as if the others never existed. And even after multiple events, I simply can't piece their info together to find the culprit. The only thing they have in common (using BlueScreenView to parse through them) is the following faulty module stamp:

Code:
ntoskrnl.exe	ntoskrnl.exe+175af0	fffff800`04aXXXXX	fffff800`0XXXXXXX	0x005ec000	0x5609efa0	29/09/2015 03:55:44
Microsoft® Windows® Operating System	NT Kernel & System	6.1.7601.19018 (win7sp1_gdr.150928-1507)	Microsoft Corporation
C:\Windows\system32\ntoskrnl.exe

With the only difference being the address,

Code:
fffff800`04aXXXXX	fffff800`0XXXXXXX

The other faulty modules are always different; rdpencdd.sys, nvhda64v.sys, usbhub.sys, CompositeBus.sys, raspppoe.sys, ftusbload2.sys, HWiNFO64A.SYS, vpcvmm.sys....

(I can upload all the minidumps if needed.)

After that, everything works as usual, as if nothing ever happened; the PSU voltages seem to be fine, the motherboard/GPU/CPU seem to work correctly, and I can play games, watch videos, develop apps, fill 16 GB of RAM no problem...

What is this? A software issue? A hardware issue? A faulty Windows installation? A corruption in the PNP subsystem? I'm simply losing my head!

My specs are as follows:

- Windows 7 Professional SP1 (I have some updates due)
- Corsair CX500M 500 Watt PSU
- Gigabyte 990FXA-UD3
- Zotac GeForce GTX 970
- AMD FX-8350 4.00 GHz
- Corsair Vengeance 16-GB kit RAM
- Cooler Master Hyper 212 EVO
- An old 300 GB Seagate ST3320418AS
- A newer 7200 RPM 1TB Seagate Barracuda ST1000DM003-1SB10C
- An old VGA Acer AL1512 15"
- A newer DVI Samsung S24B420BW 24"
- Various USB devices
 
- I would check the BIOS and make sure it does not have a setting to power down (sleep) your USB ports.
- I would also go into the Windows Control Panel, find the USB devices, look at the Power Management tab, and make sure they cannot be powered down.
- I would download usbview.exe to see if the USB ports are reporting an error or a problem with a device.
- If I had any type of special USB charger driver for an Apple device, I would remove it.

Any error from a USB device could cause this problem, so make sure the drivers are up to date.

You might also want to put a memory dump on a server and post a link.
The files are located in the c:\windows\minidump directory.
(Generally, USB-related problems require a kernel memory dump to debug rather than a minidump.)
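To see which dump files are actually in that folder before uploading them, something like the following minimal sketch could be used. It assumes Python is installed on the machine; the function name `list_minidumps` is made up for illustration, and the default path is simply the directory mentioned above.

```python
from pathlib import Path


def list_minidumps(directory=r"C:\Windows\Minidump"):
    """Return (file name, size in bytes) for each .dmp file, oldest first."""
    folder = Path(directory)
    if not folder.is_dir():
        # No folder usually means no minidump has been written yet.
        return []
    dumps = sorted(folder.glob("*.dmp"), key=lambda p: p.stat().st_mtime)
    return [(p.name, p.stat().st_size) for p in dumps]


if __name__ == "__main__":
    for name, size in list_minidumps():
        print(f"{name}\t{size} bytes")
```

Reading that folder may need an elevated prompt, depending on how its permissions are set.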

here is a link to the compiled version of usbview (at the bottom of the page)
http://www.uwe-sieber.de/usbtreeview_e.html
 
Guest
Thanks for the input, I've downloaded the program but I can't see any error being reported with my USB devices, and if there is any I don't know how to look for it. It just displays a list of the attached devices and their properties, but I don't know where it logs errors.
Every element in the Device Manager that displays the Power Management tab has the option either off or greyed out, and USB selective suspend is disabled. I'll also look into the BIOS, though I think I've already made sure it was disabled there as well. The only devices I have attached are a PS3 controller through a USB extension cable, a Wacom tablet, a wireless keyboard and a wired mouse, but as I said, any combination of them at startup, even with all of them disconnected, still causes the PNP blue screen.

Here's an upload of my relevant minidumps, I'll upload the KMD as well once it happens again:
ftp://banderi.dynu.com/Minidump.rar
 
could not download memory dump, check to make sure it is marked for public access.



 
Remove ftusbload2.sys (FabulaTech USB over Network Server, timestamp Thu Jan 05 04:28:58 2012), or update the file.

I assume this is an older machine, so you do want to make sure your BIOS is updated. USB technical specs changed over the years, and an old BIOS may not work with generic USB drivers. The generic Microsoft USB drivers will assume you have a BIOS that conforms to the USB 2.x spec changes that were finalized some time in 2012 (see if your BIOS is dated 2012 or newer).

The same goes for your USB 3 ports: you want a BIOS from some time after 3/2013 for proper USB 3 functions.
If you plug an old device like your USB server into a USB 3 port, it may not function correctly; you could try a USB 2 port.


------------
Looking at the bugchecks, most current first.

The first bugcheck was caused by Windows plug and play attempting to load a USB device driver:
ftusbload2.sys FABULATECH USB OVER NETWORK SERVER

It looks like plug and play did not like the driver because of a bug in the driver
(it is complaining that some field in the data structure is not initialized properly).
This caused the Microsoft USB hub driver to crash and bugcheck the system.
The stack was corrupted.

here is the info on the bad driver:
Loaded symbol image file: ftusbload2.sys
Image path: ftusbload2.sys
Image name: ftusbload2.sys
Timestamp: Thu Jan 05 04:28:58 2012 (4F05978A)
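The hex value in parentheses on that timestamp line is just the build time of the driver stored as seconds since 1970, so it can be decoded by hand. A minimal sketch (the function name `pe_timestamp_to_utc` is made up for illustration):

```python
from datetime import datetime, timezone


def pe_timestamp_to_utc(hex_stamp):
    """Convert a driver timestamp given as hex seconds-since-1970 to UTC."""
    seconds = int(hex_stamp, 16)
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


# Value copied from the debugger output above.
print(pe_timestamp_to_utc("4F05978A").isoformat())
# prints 2012-01-05T12:28:58+00:00
```

The debugger output shows 04:28:58 rather than 12:28:58, presumably because it prints the time in the local time zone of the machine doing the analysis. The same conversion works for the 0x5609efa0 stamp in the BlueScreenView output.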

Also, I was unable to read the BIOS info; this could mean that the BIOS is really old and may need to be updated.
- You have various old drivers installed.
- dtsoftbus01.sys from 2011: Daemon Tools driver known to cause bugchecks; you should update this driver.
- ScpVBus.sys Sun May 05 14:31:26 2013: Scarlet Crush driver; this version will also cause memory corruption.
 
Guest
I see, I'll uninstall the remnants of that program and see if that solves it, thanks. Question: how did you manage to get that info from the minidumps? Is it something you can get out of simple utilities like BlueScreenView, or do you need to reverse engineer the files in more complex ways? I'm asking so that I can do it on my own next time something like this happens.
 
Note: often, when you simply load the memory dump into the Windows debugger, it will tell you that it cannot find symbols for a particular file. That is a big hint about what the debugger thinks is part of the problem, because it only reports that error for a third-party driver that Microsoft does not supply or for a Windows file that has been modified. I think that is why I focused on that old driver without having to look at a kernel memory dump to view the USB info and internal error logs.


I use the Windows debugger. The automated tools just tell you where something broke. The problem is that all drivers talk to the Windows kernel, so the break will be either in the driver code or in the code it is calling (the Windows kernel). The automated tools are only useful when something breaks in a third-party driver's code.

To actually see what was wrong with the driver, you would have to look at a kernel memory dump. It contains the info and internal error logs for the plug and play system and the USB subsystem. Otherwise you end up making assumptions based on other cases where you have debugged similar problems. For example, if I remember correctly, the ScpVBus.sys driver corrupts memory because of a failure in the driver to allocate a pooltag. It also has bugs in the uninstaller and often does not uninstall, so people often have to use PNPUTIL.exe to remove the driver package before they are really free of the driver.
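For the PNPUTIL.exe step, the usual approach on Windows 7 is to list the installed third-party driver packages with `pnputil -e`, find the `oemNN.inf` entry that belongs to the driver, and delete it with `pnputil -d oemNN.inf`. A rough sketch of picking the entry out of the listing is below. The sample text is a made-up approximation of the listing format with placeholder vendor names, not real output from this machine, and the helper names are hypothetical.

```python
def parse_pnputil_listing(text):
    """Split 'key : value' lines into one dict per blank-line-separated block."""
    packages, current = [], {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip():
            current[key.strip()] = value.strip()
        elif not line.strip() and current:
            packages.append(current)
            current = {}
    if current:
        packages.append(current)
    return packages


def find_packages(packages, needle):
    """Return the published names of packages whose fields mention needle."""
    needle = needle.lower()
    return [p.get("Published name") for p in packages
            if any(needle in v.lower() for v in p.values())]


# Illustrative sample only; real listings will differ.
SAMPLE = """Microsoft PnP Utility

Published name :            oem12.inf
Driver package provider :   Example Vendor A
Class :                     System devices

Published name :            oem27.inf
Driver package provider :   Example Vendor B
Class :                     Universal Serial Bus controllers
"""

print(find_packages(parse_pnputil_listing(SAMPLE), "vendor b"))
# prints ['oem27.inf']
```

Which provider or class string a given driver shows up under is something to check in the real listing before deleting anything.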



 
Solution

banderi

Commendable
Jun 4, 2016
8
0
1,510
I got no PNP error this time, but the computer still won't turn on, so the problem wasn't caused by the USB driver. How can I solve that?

PS. Forgot to add, the BIOS is dated 15/07/2013. As you can see from the specs, it's a moderately new system actually. I'll update it to the latest revision available though, maybe it will help (28/05/2015)
 
The BIOS has a bunch of power management features that it sets up. On the better motherboards it will check voltages for proper values, and the motherboard logic may prevent the system from booting if it thinks some voltage is incorrect.

I would update the BIOS and see if that helps.

The better power supplies also have some logic to tell the motherboard that power is OK.

Also, Windows 7 has many of the power functions turned off by default, while Windows 8 and above have them turned on by default. I am seeing more and more driver updates for Windows 7 turning on these power management functions. This is exposing the various bugs in the hardware and its drivers. You end up having to set the system to run in high performance mode, or figure out which driver or hardware is having sleep issues and turn off its power management functions in Device Manager. Or update the BIOS and hardware drivers so the system has a chance of working correctly.



 

banderi

It's already set in high performance mode and have turned off a bunch of power management options already, that's how I solved the initial sleep bug-by-duration months ago, now I get this random problem.
I've updated the BIOS to the F3 version, we'll see if that works by magic. I can't seem to be able to find a newer version for any of my drivers, as well (ftusbload2.sys was also the latest). In case the update doesn't cut it, what else could I try to do?
 
Well, I initially assumed your ftusbload2.sys was a USB driver, so it depends on a proper USB subsystem and BIOS support.
But it is also a network provider, which would depend on proper network support from NDIS and the network driver on the machine. The whole path that the data flows through has to work correctly for the driver to work correctly. Also, other USB drivers can act on packets that they don't own and prevent the proper driver from getting its software packet.
A driver is supposed to look at the packet, see if it is addressed to it, and otherwise pass it to the next driver in the chain.
Some older drivers might falsely claim the packet and delete it. In these cases the order in which the drivers load determines whether the system works. Really a pain to figure out.

You might be able to change the memory dump type to kernel, and run cmd.exe as an admin and then run
verifier.exe /standard /all
and reboot. This will force windows to do more error checking on the drivers and the kernel dump will save the internal logs, plug and play info and USB info into the memory dump.

if verifier.exe finds a bad driver it will call a bugcheck and write the info to the memory dump.
Please note, be sure you know how to get into safe mode because you might get a bugcheck at system boot with some drivers.
also note: you have to turn verifier off when you are done testing or your machine will run slowly until you do.
use
verifier.exe /reset
to turn off the extra verification.
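If it helps to keep the two verifier commands together so the reset step does not get forgotten, they could be wrapped in a small script. This is only a sketch: the names `ENABLE_VERIFIER`, `RESET_VERIFIER` and `run_step` are made up, it defaults to a dry run that only prints the command, and actually running it needs an elevated prompt on Windows.

```python
import subprocess

# The two commands quoted above, as argument lists.
ENABLE_VERIFIER = ["verifier.exe", "/standard", "/all"]
RESET_VERIFIER = ["verifier.exe", "/reset"]


def run_step(command, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    print(" ".join(command))
    if dry_run:
        return None
    # The exit code is returned as-is rather than treated as an error,
    # since a non-zero code does not necessarily mean the command failed.
    return subprocess.run(command).returncode


if __name__ == "__main__":
    run_step(ENABLE_VERIFIER)   # then reboot and reproduce the problem
    run_step(RESET_VERIFIER)    # when testing is finished
```

A reboot is still needed between the two steps, and the safe mode warning above applies in the same way.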


It is also pretty strange that I could not read your BIOS info; you might want to reset it to defaults and reconfigure.
You might also want to check your memory dump settings, or change them to write a kernel memory dump.


You will also want to make sure you have all the Windows 7 SP updates installed. The USB 2.x specs were updated, and so were the USB 3 specs. Mixing drivers written to the updated USB 2.x specs with early Windows builds will expose the bugs in the early builds.
(It is one of the reasons you can no longer run Windows 7 RTM with a modern graphics driver. The modern drivers just no longer work around the bugs in the early Windows 7 builds.)



 

banderi

I thank you for your in-depth response, but I'm really just interested in the booting issue right now rather than the BSOD, which is the thing I just can't get my head around; especially since if the PNP error was solved by uninstalling that driver (meaning in that case they could be two separate issues), I don't have any BSOD to dump anymore, unless that command will force Windows to dump anyway. The computer goes to sleep (manually) and it refuses to boot up afterwards, until after some attempt it decides to do the POST, but I don't know if the hardware is defective or something else is causing an error in the mobo/PSU. Are defective drivers able to corrupt the hardware (i.e. static charge)?

And what do you mean by couldn't read the BIOS info, were they absent in the minidumps?
I've never changed anything in the BIOS, it's all at default state, also just updated it, so maybe it got reset again anyway.
 
The debugger could not understand the format of the BIOS data in the memory dump. Generally this means the data is corrupted, or you have a very old or custom BIOS that uses a non-standard format.

Sometimes, with certain memory dumps, the data does not have time to be written to disk before the power goes out.
I guess you could have a system with cached storage in front of the hard drive that is not flushing the data from the cache to the disk. But it would be strange to be missing only a small section of the BIOS info.
More likely an old non-standard format is being used.




 

banderi

Hmm, that's very weird. I've updated to the F3 ver. correctly, so if it ever happens again I'll post a kernel dump and maybe it won't be missing.
I've also run SFC and CHKDSK to be sure, and they both found some errors.
I have no idea whether the SFC log is useful or meaningless with regard to the issues I'm having, but I'll attach it in case you're interested:

ftp://banderi.dynu.com/CBS.zip

This is also my chkdsk log:

Checking file system on C:
The type of the file system is NTFS.
Volume label is Sirius.

A disk check has been scheduled.
Windows will now check the disk.

CHKDSK is verifying files (stage 1 of 5)...
781568 file records processed.
File verification completed.
2640 large file records processed.
0 bad file records processed.
2 EA records processed.
84 reparse records processed.
CHKDSK is verifying indexes (stage 2 of 5)...
1015156 index entries processed.
Index verification completed.
0 unindexed files scanned.
0 unindexed files recovered.
CHKDSK is verifying security descriptors (stage 3 of 5)...
The security data stream entry at offset 0x3d38550 with length 0xf31e188d
crosses the page boundary.
The security data stream entry at offset 0x3d38550 with length 0x416f6747
crosses the page boundary.
Repairing the security file record segment.
Deleting an index entry with Id 22787 from index $SII of file 9.
Deleting an index entry with Id 22788 from index $SII of file 9.
Deleting an index entry with Id 22789 from index $SII of file 9.
Deleting an index entry with Id 22790 from index $SII of file 9.
Deleting an index entry with Id 22791 from index $SII of file 9.
Deleting an index entry with Id 22792 from index $SII of file 9.
Deleting an index entry with Id 22793 from index $SII of file 9.
Deleting an index entry with Id 22794 from index $SII of file 9.
Deleting an index entry with Id 22795 from index $SII of file 9.
Deleting an index entry with Id 22794 from index $SDH of file 9.
Deleting an index entry with Id 22792 from index $SDH of file 9.
Deleting an index entry with Id 22789 from index $SDH of file 9.
Deleting an index entry with Id 22787 from index $SDH of file 9.
Deleting an index entry with Id 22790 from index $SDH of file 9.
Deleting an index entry with Id 22793 from index $SDH of file 9.
Deleting an index entry with Id 22795 from index $SDH of file 9.
Deleting an index entry with Id 22791 from index $SDH of file 9.
Deleting an index entry with Id 22788 from index $SDH of file 9.
Replacing invalid security id with default security id for file 5420.
Replacing invalid security id with default security id for file 8555.
Replacing invalid security id with default security id for file 502243.
Replacing invalid security id with default security id for file 595327.
Replacing invalid security id with default security id for file 777644.
Replacing invalid security id with default security id for file 777647.
781568 file SDs/SIDs processed.
CHKDSK is compacting the security descriptor stream
Cleaning up 2860 unused security descriptors.
116795 data files processed.
CHKDSK is verifying Usn Journal...
36729168 USN bytes processed.
Usn Journal verification completed.
CHKDSK is verifying file data (stage 4 of 5)...
781552 files processed.
File data verification completed.
CHKDSK is verifying free space (stage 5 of 5)...
25564066 free clusters processed.
Free space verification is complete.
CHKDSK discovered free space marked as allocated in the
master file table (MFT) bitmap.
Correcting errors in the Volume Bitmap.
Windows has made corrections to the file system.

312568640 KB total disk space.
209082660 KB in 661957 files.
334732 KB in 116798 indexes.
0 KB in bad sectors.
894980 KB in use by the system.
65536 KB occupied by the log file.
102256268 KB available on disk.

4096 bytes in each allocation unit.
78142160 total allocation units on disk.
25564067 allocation units available on disk.

Internal Info:
00 ed 0b 00 80 e1 0b 00 94 17 15 00 00 00 00 00 ................
56 07 00 00 54 00 00 00 00 00 00 00 00 00 00 00 V...T...........
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................

Windows has finished checking your disk.
Please wait while your computer restarts.
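As a sanity check on the summary figures in that log, the allocation unit counts can be multiplied out by hand. A minimal sketch using only numbers copied from the log above (the helper name `units_to_kb` is made up):

```python
# Figures copied from the chkdsk summary above.
BYTES_PER_UNIT = 4096
TOTAL_UNITS = 78142160
FREE_UNITS = 25564067


def units_to_kb(units, bytes_per_unit=BYTES_PER_UNIT):
    """Convert a count of allocation units to KB (1 KB = 1024 bytes)."""
    return units * bytes_per_unit // 1024


total_bytes = TOTAL_UNITS * BYTES_PER_UNIT

print(units_to_kb(TOTAL_UNITS))           # 312568640, the total disk space line
print(units_to_kb(FREE_UNITS))            # 102256268, the available on disk line
print(round(total_bytes / 10**9, 2))      # 320.07 (decimal GB)
print(round(total_bytes / 2**30, 2))      # 298.09 (binary GiB)
```

So the C: volume in this log is roughly 320 GB in the decimal units drive makers use, or about 298 GiB in the units Windows reports.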

I'll wait and see if one of these did the trick, I'll post back in case it didn't.
In case the problem persists (meaning the booting issue, not the PNP bsod), do you have anything else in mind that could be worth a try?
 

banderi

Ok, so I might have found something. The CPU fan is not working correctly after waking up, the speed is cut in half even though the PWM is trying to push 100%. I set the PWM slope to max in the bios and the control to manual, but no dice - even overriding the PWM control does not change the actual speed, it's always half of what it should be! Help, please?