Random yet Inconsistent Freezing (w/ no event log errors or memory dumps)

DaPayneTrain

Commendable
Jun 18, 2016
31
0
1,530
[strike]EDIT: Since the start of this thread, a BSOD occurred that I'm hoping is the cause of the freezing. You can see my BSOD help thread on TenForums to keep track of this issue and any possible solutions.[/strike] No solution was found on TenForums.

Unsuccessful Fixes:
- No overclock present
- Multiple clean installs of Windows 7 and Windows 10
- SATA mode changed from IDE to AHCI
- Hard Drive health confirmed satisfactory with HDTune
- Temperature monitoring returned nothing unusual
- Set PC Power Plan to High Performance
- Turn Hard Drive Off set to Never
- Memtest run on both sticks of RAM, individually and in different slots, for multiple successful passes - all results error free
- BIOS is running on most recent version; chipset drivers are running on most recent version; all drivers installed from motherboard website
- Ran Active Garbage Collection on SSD (leave powered but don't boot for 6-8 hours)
- RMA'd the SSD
- Prime95 and Furmark stress testing with no crashes, freezes, artifacts, etc.
- CPU voltages are not unusual; RAM timings, voltages are not unusual
- Changed used SATA ports
- Driver Verifier run for days; Virtual Audio Cable triggered an error but no other drivers did, and the issue persisted after removing the software; Driver Verifier ran until a freeze occurred and no blue screen was triggered, so it is probably not a driver issue
- Reseated CMOS battery

Successful Fix:
- CPU replacement fixed the problem

I've had this problem going on and off for about a year now and nothing I've done has fixed it. Uninstalling my DS3 drivers fixed the problem for about a solid week before it came back again. I'm losing my mind; I don't know how to go about fixing this issue.

My computer has been freezing randomly with the audio looping (the last sound that played before the freeze repeats). I've attempted to reinstall Windows (countless times) and it hasn't fixed it. Memtest86+ and PSU, GPU, and CPU stress testing don't turn up anything either. It's completely random and cannot be reproduced.

I took the PC into Canada Computers, as recommended by a friend, once and they stress tested every part and did everything they could. They told me after three days they weren't able to reproduce the problem and returned the computer. Within 30 minutes upon arriving home it happened again.

I unplugged every peripheral (except my PS3 controller because I had a hunch) and it happened again the following day. I uninstalled my PS3 drivers and lo and behold it stopped. For now. I've recently moved to residence for school and none of the original peripherals made the trip with me. Same issue; still freezes.

Now it freezes at least once a week, but I can't reproduce it; I just have to wait for it. There are no error messages (or any messages of any kind) in Event Viewer around the time of the freezing. It never blue screens either - it will just hang until you turn off the computer by unplugging it.

My specs can be found in my signature.

Any and all help will be appreciated. Thank you!
 
Solution
Sigh. Yeah, doesn't seem like a driver. And there's no obvious hardware we haven't either tested or replaced. Were you doing anything specific when it froze, or does it not matter? I don't know what to suggest; you have had it tested at Canada Computers.

It's not the SSD, as it's new.
It's not the RAM.
Did you run Prime95 on the CPU?

CLOCK_WATCHDOG_TIMEOUT could be the clue we need; it might be the CPU.

We are running out of parts that are easy to test. As I said already, the motherboard, CPU and PSU are the hardest things to check, as the best way is to swap in a spare and see if simply replacing the part removes the error. Your error is so inconsistent that a store could test the PC and not find the cause if they didn't keep it long enough.

TenForums didn't reach a conclusion either, I expect?

Colif

Win 11 Master
Moderator
no bsod, no memory dump, no events... this all points at it being hardware still.

I would have suggested memtest but you have run that.

Try setting PC power plan to high performance, choose change plan settings and under advanced power settings, set hdd to never turn off - maybe psu doesn't support all the new power options in win 10
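
For anyone who would rather script those two settings than click through Control Panel, here is a minimal Python sketch. It assumes the built-in Windows `powercfg` tool, where `SCHEME_MIN` is the alias for the High Performance plan and a `disk-timeout-ac` value of 0 means the disk never turns off on AC power. The function names are my own, and by default the script only prints the commands rather than running them.

```python
import subprocess
from typing import List


def power_plan_commands() -> List[List[str]]:
    """Build the powercfg commands for the two settings suggested above."""
    return [
        # Switch to the High Performance plan (SCHEME_MIN is powercfg's alias for it).
        ["powercfg", "/setactive", "SCHEME_MIN"],
        # 0 minutes = never turn the hard disk off while on AC power.
        ["powercfg", "/change", "disk-timeout-ac", "0"],
    ]


def apply_power_settings(dry_run: bool = True) -> List[str]:
    """Return the command lines; only execute them when dry_run is False."""
    applied = []
    for cmd in power_plan_commands():
        applied.append(" ".join(cmd))
        if not dry_run:
            # Needs Windows; raises CalledProcessError if powercfg fails.
            subprocess.run(cmd, check=True)
    return applied


if __name__ == "__main__":
    for line in apply_power_settings(dry_run=True):
        print(line)
```

Call `apply_power_settings(dry_run=False)` on the affected Windows PC to actually apply the changes.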

Try running HDTune on the hard drive.

Do you have the latest bios?
 

DaPayneTrain



Ran memtest again last night as a precaution; four passes with zero errors.

Power Options is already on High Performance with Hard Drive set to turn off Never.

HDTune reports that the Reallocated Event Count (sector replacement operations) for my SSD is abnormal (16 counts); error checking on the SSD and HDD both returned no damaged blocks.

BIOS is up to date
 

DaPayneTrain



Storage Executive also reports 16 instances of reallocated event counts but otherwise reports it is in good health. There was a firmware update for the SSD so I applied that as well.
 

Colif

Reallocation Event Count S.M.A.R.T. parameter indicates a count of remap operations (transferring data from a bad sector to a special reserved disk area - spare area).

The raw value of this attribute shows the total number of attempts to transfer data from reallocated sectors to a spare area. Unsuccessful attempts are counted as well as successful. Since this is a count value, it can only increase.

Recommendations

This is a critical parameter. Degradation of this parameter may indicate imminent drive failure. Urgent data backup and hardware replacement is recommended.

https://kb.acronis.com/content/9132

How old is the SSD?
 

DaPayneTrain



The PC was built in February of 2015, so a year and a half old
 

Colif

if freezes weren't once a week, i would ignore it and just watch the smart scores, but they happen too often to ignore

You have 3 year warranty: http://www.crucial.com/usa/en/company-warranty - link also shows rma web site

I don't know how long turn around is on replacements. If it were too long I would just get a new SSD and then rma old one. See what they say as they may say the score isn't too bad.
 

DaPayneTrain



I reopened HDTune to see whether or not the number of reallocation events had increased, and it now reads 0 from both there and Storage Executive. Was 16 just a false reading then?
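
Since the attribute is described earlier in the thread as a count that can only increase, a later reading that is lower than an earlier one is a sign that something reset or misreported it. A minimal Python sketch of that sanity check follows. The two raw values are the ones from this thread; the labels and the function name are just illustrative, and the script does not read SMART data itself.

```python
from typing import List, Tuple


def find_counter_regressions(
    readings: List[Tuple[str, int]]
) -> List[Tuple[str, int, int]]:
    """Return (label, previous, current) for each reading lower than the one before it."""
    regressions = []
    for (_, prev), (label, cur) in zip(readings, readings[1:]):
        if cur < prev:
            # A cumulative counter should never go down between readings.
            regressions.append((label, prev, cur))
    return regressions


# Raw Reallocated Event Count values reported in this thread, oldest first.
readings = [("first check", 16), ("later recheck", 0)]

if __name__ == "__main__":
    for label, prev, cur in find_counter_regressions(readings):
        print(f"{label}: counter dropped from {prev} to {cur}")
```

A drop like this does not say which of the two readings was wrong, only that they cannot both be true values of a cumulative counter.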
 

Colif

The Crucial Storage Executive tool will correctly report the SMART data on all supported Crucial SSD models. On storage drive models that are not supported by the Crucial Storage Executive, the tool will still report the SMART data. However, the attribute definitions will only be displayed for SSD models supported by the tool.

The second thing to remember is that SMART data is not a diagnostic tool. In fact, SMART data by itself is not an indicator of the general health and status of an SSD. Standard troubleshooting practices are far more reliable when it comes to determining the health status and reliability of an SSD, than any SMART data read-out.

Incorrectly reported or interpreted SMART data inevitably leads to incorrect conclusions which, if you’re unlucky, can lead to an RMA of a perfectly functional drive. Interpreted correctly on the other hand, the SMART data from your SSD can in some cases be a useful tool in troubleshooting your SSD. The important thing is to remember that those cases are fairly rare. SMART is one tool among many, and as with all tools, it works best when it is used the way it was intended.

http://forum.crucial.com/t5/Crucial-SSDs/SSDs-and-SMART-data/ta-p/147014

On the one hand, they say their tool accurately reports the SMART data, and on the other they say it's not a reliable diagnostic tool. I am not sure what we should use, as most SSD tests either test speed (http://www.techspot.com/downloads/6014-as-ssd-benchmark.html) or just show the SMART score.

It seems odd to me that a register would reset itself in both tools. It's your drive; all I can say is that SSDs aren't like HDDs. They don't make noises or anything like that to show they are going bad, they can just stop. I would contact Crucial and ask them what you should do.
 

DaPayneTrain



I contacted Crucial and they recommended I do Active Garbage Collection (unplug the SATA cable, leave the power cable plugged in to the SSD, and leave the computer on for 6-8 hours). I've also been looking at other forum posts and have made some other adjustments: the drive now uses AHCI instead of IDE, and Power Options have been set to High Performance > Hard Drive turns off: Never.

Since it only freezes once or twice a week, I won't know for certain whether the problem is gone until at least two weeks have passed. I'll let you know what happens (and mark your latest answer as Best Answer if it works again too!). Thanks for your time Colif!
 

Colif

You were running it in IDE mode? Um... who made this PC? I guess Canada Computers just checked hardware, I guess I didn't even think about it as I saw motherboard was fairly new and didn't even think to ask. I need to make a checklist up for SSD problems I guess.
 

DaPayneTrain

So unfortunately I had the problem again tonight - I was hoping that it was finally resolved when it didn't freeze over the weekend (Saturday was one week between freezes)

It did have two or three instances where it froze for a very short amount of time with the usual symptoms - random, inconsistent, audio loops - but it actually resumed activity after about two or three seconds. The freeze I had tonight was the same as the other ones however.

HDTune still reports zero reallocated events as well. Do you have any suggestions? I plan on contacting Crucial sometime tomorrow again.

EDIT: I contacted Crucial and they said I should exchange the SSD for a new one and hopefully the issue will be fixed with the new one. Thanks for your help!
 

DaPayneTrain



So I finally got my new SSD from Crucial on Monday and set everything back up again. 15 minutes ago, it unfortunately froze again. Identical to the old ones too. Do you have an idea what my next step should be?
 

Colif



hunt me down and kill me? Sorry, I hate getting it wrong when I suggest parts :(

I don't think it's software, based on no errors being reported.

It's not the SSD, not the memory. Did I ever ask what type of PSU you have?
 

DaPayneTrain



Seasonic G-Series 80+ Gold PSU 650W (+- 50W; I'm not 100% sure the wattage)
 

Colif

Give me my power supply back... that is what I have so I cannot cast doubt on it.

your PC has no obvious weaknesses and isn't being helpful in that regard. I will ask for other opinions and maybe someone can figure out what I am missing.
 

DaPayneTrain



Bit of an update: early Wednesday morning I had an IRQL_NOT_LESS_OR_EQUAL BSOD, and late last night I had a CLOCK_WATCHDOG_TIMEOUT BSOD as well. I haven't been able to reproduce either error, and in both cases the computer froze as normal but then after 5-10 seconds showed the blue screen error. I can upload the .dmp files later this afternoon if that would be helpful. Does this possibly narrow down the problem?
 

Colif

Download and run WhoCrashed: http://www.resplendence.com/whocrashed
It will look at those dumps and show us a summary.
It might show us a driver name against the IRQL error; I'm not expecting much from the clock watchdog error.

copy/paste results in here

CLOCK_WATCHDOG_TIMEOUT is generally considered as a hardware BSOD. Although rare, some faulty drivers can also cause this error.

Here is the MSDN technical article about the Bug check 101

http://msdn.microsoft.com/en-us/library/windows/hardware/ff557211(v=vs.85).aspx

This stop code is the result of a CPU becoming unresponsive and remaining in a deadlock; that is, the CPU is not responding to any interrupts. About 80 % of the time, this issue is caused by a faulty core inside the CPU. On some occasions a faulty device driver can also make the CPU unresponsive and lead to this BSOD.

http://www.bleepingcomputer.com/forums/t/499856/how-to-fix-my-clock-watchdog-timeout-error/

might want to try running http://www.guru3d.com/files-details/prime95-download.html and test CPU

We were running out of hardware to test; the CPU and motherboard are the hardest to test, which is why they were left to the end.
 

DaPayneTrain



The WhoCrashed logs are available here:

On Sat 10/29/2016 12:15:52 AM your computer crashed
crash dump file: C:\Windows\Minidump\102916-8343-01.dmp
This was probably caused by the following module: ntoskrnl.exe (nt+0x14A3B0)
Bugcheck code: 0x101 (0x18, 0x0, 0xFFFFA680A50AC180, 0x3)
Error: CLOCK_WATCHDOG_TIMEOUT
file path: C:\Windows\system32\ntoskrnl.exe
product: Microsoft® Windows® Operating System
company: Microsoft Corporation
description: NT Kernel & System
Bug check description: This indicates that an expected clock interrupt on a secondary processor, in a multi-processor system, was not received within the allocated interval.
This appears to be a typical software driver bug and is not likely to be caused by a hardware problem. This problem might also be caused because of overheating (thermal issue).
The crash took place in the Windows kernel. Possibly this problem is caused by another driver that cannot be identified at this time.



On Sat 10/29/2016 12:15:52 AM your computer crashed
crash dump file: C:\Windows\memory.dmp
This was probably caused by the following module: hal.dll (hal!HalPerformEndOfInterrupt+0xC6)
Bugcheck code: 0x101 (0x18, 0x0, 0xFFFFA680A50AC180, 0x3)
Error: CLOCK_WATCHDOG_TIMEOUT
file path: C:\Windows\system32\hal.dll
product: Microsoft® Windows® Operating System
company: Microsoft Corporation
description: Hardware Abstraction Layer DLL
Bug check description: This indicates that an expected clock interrupt on a secondary processor, in a multi-processor system, was not received within the allocated interval.
This appears to be a typical software driver bug and is not likely to be caused by a hardware problem. This problem might also be caused because of overheating (thermal issue).
The crash took place in a standard Microsoft module. Your system configuration may be incorrect. Possibly this problem is caused by another driver on your system that cannot be identified at this time.



On Wed 10/26/2016 1:54:03 AM your computer crashed
crash dump file: C:\Windows\Minidump\102616-9031-01.dmp
This was probably caused by the following module: ntoskrnl.exe (nt+0x14A2C0)
Bugcheck code: 0xA (0xFFFFC70E1F658190, 0xFF, 0x0, 0xFFFFF8001D7F8C38)
Error: IRQL_NOT_LESS_OR_EQUAL
file path: C:\Windows\system32\ntoskrnl.exe
product: Microsoft® Windows® Operating System
company: Microsoft Corporation
description: NT Kernel & System
Bug check description: This indicates that Microsoft Windows or a kernel-mode driver accessed paged memory at DISPATCH_LEVEL or above.
This appears to be a typical software driver bug and is not likely to be caused by a hardware problem.
The crash took place in the Windows kernel. Possibly this problem is caused by another driver that cannot be identified at this time.
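
As a rough aid for reading the `Bugcheck code:` lines above, here is a small Python sketch that pulls the stop code and its four parameters out of a WhoCrashed line. The parameter meanings for 0x101 are my reading of Microsoft's bug check 0x101 documentation (first parameter: time-out interval in clock ticks; fourth parameter: index of the hung processor), so treat that interpretation as an assumption. The class and function names are made up for this example.

```python
import re
from typing import NamedTuple, Tuple


class Bugcheck(NamedTuple):
    code: int
    params: Tuple[int, ...]


# Matches e.g. "Bugcheck code: 0x101 (0x18, 0x0, 0xFFFFA680A50AC180, 0x3)"
_BUGCHECK_RE = re.compile(r"Bugcheck code:\s*(0x[0-9A-Fa-f]+)\s*\(([^)]*)\)")


def parse_bugcheck(line: str) -> Bugcheck:
    """Extract the stop code and its parameters from one WhoCrashed log line."""
    match = _BUGCHECK_RE.search(line)
    if match is None:
        raise ValueError("no 'Bugcheck code:' entry found in line")
    code = int(match.group(1), 16)
    params = tuple(int(p.strip(), 16) for p in match.group(2).split(","))
    return Bugcheck(code, params)


def describe(bc: Bugcheck) -> str:
    """Give a one-line summary; only 0x101 is interpreted in this sketch."""
    if bc.code == 0x101 and len(bc.params) == 4:
        # Assumed layout: params[0] = time-out in clock ticks, params[3] = hung CPU index.
        return (
            f"CLOCK_WATCHDOG_TIMEOUT: processor index {bc.params[3]} "
            f"did not respond within {bc.params[0]} clock ticks"
        )
    return f"bug check 0x{bc.code:X} (not interpreted here)"


if __name__ == "__main__":
    line = "Bugcheck code: 0x101 (0x18, 0x0, 0xFFFFA680A50AC180, 0x3)"
    print(describe(parse_bugcheck(line)))
```

If the same processor index keeps turning up across several 0x101 dumps, that would fit the faulty-core theory, though a single dump is not enough to conclude anything.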

I'm assuming I should use the default Torture Test settings in Prime95, so I've started that now.

EDIT: relocated the WhoCrashed logs from PasteBin to a spoiler.
 

Colif

If Prime95 finds nothing wrong with the CPU, I would suggest you follow the posting instructions here and post a question about your BSOD and main problem on this forum, as they are likely to answer the BSOD questions better than I can: http://www.tenforums.com/bsod-crashes-debugging/ - If I knew how to read dump files I would help you with that, and if I thought someone who could read them was around, I would not send you to another site to figure this out. Refer them to this thread to avoid doubling up on ideas :)

The WhoCrashed reports just point the finger at Windows. They don't offer any clue as to which driver caused the Wednesday error.
NTOSKRNL is the Windows kernel, the brains of Windows 10; one of its roles is to deal with driver requests.
HAL.DLL is the Hardware Abstraction Layer, a system component that sits between Windows and the hardware.

CLOCK_WATCHDOG_TIMEOUT means a processor did not respond to an expected clock interrupt in time. I don't know whether it was a driver request or the CPU itself that caused the crash, though.

The BSOD was what we needed as a clue as to what was wrong. Fixing them should help fix the main problem.