Elusive problem causing BSOD 0x7A, 0xF4, 0xED

Shadow351

Distinguished
Oct 2, 2011
13
0
18,510
Hello, I am working on my dad's computer, which I built in December.
First, the specs:
Intel Core i5 4th Gen 4670k
MSI Z87-G41 PC MATE 1150 ATX
1x 8GB Patriot Viper Xtreme DDR3 1600MHz
120GB Kingston SV300S3 SSD (Boot device)
2x 1TB WD Green Drives (Add. Storage)
1TB WD Black (Add. Storage)
LITE-ON DH-4B1S Blu-ray Burner
Galaxy Nvidia 9800GT
Corsair CX600
Windows 7 x64 Pro

It is throwing BSODs seemingly at random, including 0x7A KERNEL_DATA_INPAGE_ERROR, 0xF4 CRITICAL_OBJECT_TERMINATION, and 0xED UNMOUNTABLE_BOOT_VOLUME, all of which would point to the SSD being the issue.

After the first two BSODs (0x7A and 0xF4, which occurred about a day apart) I rebooted the computer and got the 0xED stop. After that Windows wouldn't boot; it would just show this error:
Status : 0xc000000f
Info: The boot selection failed because a required device is inaccessible.

So I booted into my Ubuntu 12.04 Live CD and ran Disk Utility, which returned these SMART results (the reallocated sector count caught my eye):
tvMCcpD.png
and here is the SMART data:
ubuntu@ubuntu:~$ sudo smartctl -data -a /dev/sda1
smartctl 5.41 2011-06-09 r3365 [x86_64-linux-3.8.0-29-generic] (local build)
Copyright (C) 2002-11 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Device Model: KINGSTON SV300S37A120G
Serial Number: -REMOVED-
LU WWN Device Id: -REMOVED-
Firmware Version: 506ABBF0
User Capacity: 120,034,123,776 bytes [120 GB]
Sector Size: 512 bytes logical/physical
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: 8
ATA Standard is: ACS-2 revision 3
Local Time is: Fri May 9 01:17:10 2014 UTC
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
Error SMART Status command failed: Input/output error
SMART overall-health self-assessment test result: PASSED
Warning: This result is based on an Attribute check.

General SMART Values:
Offline data collection status: (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 0) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 1) minutes.
Extended self-test routine
recommended polling time: ( 48) minutes.
Conveyance self-test routine
recommended polling time: ( 2) minutes.
SCT capabilities: (0x0021) SCT Status supported.
SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 120 120 050 Pre-fail Always - 0
5 Reallocated_Sector_Ct 0x0033 100 100 003 Pre-fail Always - 2
9 Power_On_Hours 0x0032 093 093 000 Old_age Always - 45286135175790
12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 29
171 Unknown_Attribute 0x0032 000 000 000 Old_age Always - 1
172 Unknown_Attribute 0x0032 000 000 000 Old_age Always - 0
174 Unknown_Attribute 0x0030 000 000 000 Old_age Offline - 24
177 Wear_Leveling_Count 0x0000 000 000 000 Old_age Offline - 3
181 Program_Fail_Cnt_Total 0x0032 000 000 000 Old_age Always - 1
182 Erase_Fail_Count_Total 0x0032 000 000 000 Old_age Always - 0
187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 0
189 High_Fly_Writes 0x0000 034 063 000 Old_age Offline - 73018572834
194 Temperature_Celsius 0x0022 034 063 000 Old_age Always - 34 (Min/Max 17/63)
195 Hardware_ECC_Recovered 0x001c 100 100 000 Old_age Offline - 0
196 Reallocated_Event_Count 0x0033 100 100 003 Pre-fail Always - 2
201 Soft_Read_Error_Rate 0x001c 100 100 000 Old_age Offline - 0
204 Soft_ECC_Correction 0x001c 100 100 000 Old_age Offline - 0
230 Head_Amplitude 0x0013 100 100 000 Pre-fail Always - 100
231 Temperature_Celsius 0x0013 100 100 010 Pre-fail Always - 0
233 Media_Wearout_Indicator 0x0000 000 000 000 Old_age Offline - 3433
234 Unknown_Attribute 0x0032 000 000 000 Old_age Always - 1717
241 Total_LBAs_Written 0x0032 000 000 000 Old_age Always - 1717
242 Total_LBAs_Read 0x0032 000 000 000 Old_age Always - 1699

SMART Error Log not supported
SMART Self-test Log not supported
Note: selective self-test log revision number (0) not 1 implies that no selective self-test has ever been run
SMART Selective self-test log data structure revision number 0
Note: revision number not 1 implies that no selective self-test has ever been run
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
I don't know exactly what all of this means, but the only thing that looks bad is the reallocated sector count.
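For anyone reading the table above by hand, here is a minimal Python sketch of how the attribute rows could be parsed and checked. It assumes the usual SMART convention that an attribute counts as failing when its normalized VALUE is at or below THRESH, and it assumes (this is a guess, not documented in the post) that this drive keeps the actual hour count in the low 32 bits of the Power_On_Hours raw value. The names `ROW`, `parse_rows` and `below_threshold` are made up for illustration.

```python
import re

# A few attribute rows copied verbatim from the smartctl output above.
SMART_ROWS = """\
5 Reallocated_Sector_Ct 0x0033 100 100 003 Pre-fail Always - 2
9 Power_On_Hours 0x0032 093 093 000 Old_age Always - 45286135175790
181 Program_Fail_Cnt_Total 0x0032 000 000 000 Old_age Always - 1
196 Reallocated_Event_Count 0x0033 100 100 003 Pre-fail Always - 2
"""

# ID, name, flag, value, worst, thresh, type, updated, when_failed, raw
ROW = re.compile(
    r"^\s*(?P<id>\d+)\s+(?P<name>\S+)\s+0x[0-9a-fA-F]+\s+"
    r"(?P<value>\d+)\s+(?P<worst>\d+)\s+(?P<thresh>\d+)\s+"
    r"\S+\s+\S+\s+\S+\s+(?P<raw>\d+)"
)


def parse_rows(text):
    """Return {attribute_id: {name, value, thresh, raw}} for matching lines."""
    rows = {}
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            rows[int(m["id"])] = {
                "name": m["name"],
                "value": int(m["value"]),
                "thresh": int(m["thresh"]),
                "raw": int(m["raw"]),
            }
    return rows


def below_threshold(attr):
    """True when the normalized value has dropped to or below the threshold."""
    return attr["thresh"] > 0 and attr["value"] <= attr["thresh"]


rows = parse_rows(SMART_ROWS)

# Assumption: only the low 32 bits of the raw value are the hour count.
power_on_hours = rows[9]["raw"] & 0xFFFFFFFF

for attr_id, attr in sorted(rows.items()):
    status = "FAILING" if below_threshold(attr) else "ok"
    print(attr_id, attr["name"], "raw =", attr["raw"], status)
print("power-on hours (low 32 bits):", power_on_hours)
```

With the values posted here, the reallocated sector attribute has a raw count of 2 but a normalized value of 100 against a threshold of 3, so by this check it is not failing. The masked Power_On_Hours value comes out at 6766, which is close to the 6765 hrs 59 mins in the Kingston log further down, so the low-32-bit assumption at least looks plausible for this drive.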

I ran Windows Startup Repair (amazingly, it worked and Windows booted successfully). Once in Windows I ran both SeaTools (all drives pass) and Kingston's Toolbox diagnostic, and Kingston's Toolbox says the SSD has 100% life remaining:
5lTmN2t.png
and the Kingston SMART log:
SMART READ DATA
Revision: 10
Attributes List
1: (SSD Raw Read Error Rate) Normalized Rate: 104 Sectors Read: 9134101 Read Errors: 0
5: (SSD Retired Block Count) Spare blocks remaining 100% Retired Block 2
9: (SSD Power-On Hours) Value 93 Total 6765 hrs 59 mins
12: (SSD Power Cycle Count) Power Cycle Life Remaining 100% Number of power cycles 32
171: (SSD Program Fail Count) Program Error Count 1
172: (SSD Erase Fail Count) Erase Error Count 0
174: (SSD Unexpected power loss count) Unexpected power loss Count 27
177: (Wear Range Delta) Wear Range Delta 3%
181: (Program Fail Count) Program Error Count 1
182: (Erase Fail Count) Erase Error Count 0
187: (SSD Reported Uncorrectable Errors) Normalized Value 100 lifetime URAISE Errors 0
189: (Unrecognized Attribute) Value: 32 Raw Data: 20 00 3f 00 11 00 00
194: (SSD Temperature Monitoring) Normalized temp 32 Current 32 High 63 Low 17
195: (SSD ECC On-the-fly Count) Normalized Value 120 Sectors Read 9134101 UECC Count 0
196: (SSD Reallocation Event Count) Normalized Value 100 Reallocation Event Count 2
201: (SSD Uncorrectable Soft Read Error Rate)Normalized Value 120 Sectors Read 9134101 Uncorrectable Soft Error Count 0
204: (SSD Soft ECC Correction Rate (RAISE) Normalized Value 120 Sectors Read 9134101 Soft ECC Correction Count 0
230: (SSD Life Curve Status) Normalized Value 100
231: (SSD Life Left) Life Remaining 100%
233: (SSD Internal Reserved) 3434
234: (SSD Internal Reserved) 1717
241: (SSD Lifetime writes from host) lifetime writes 1717
242: (SSD Lifetime reads from host) lifetime reads 1704
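One detail that can be checked with a little arithmetic: smartctl says the drive is not in its database, so a label like High_Fly_Writes on attribute 189 is only a generic guess. The sketch below assumes (this is an assumption, not something either tool states) that the raw value of attribute 189 packs three little-endian 16-bit numbers, and unpacks both the smartctl raw value and the raw bytes shown by the Kingston toolbox.

```python
import struct

# Raw value of attribute 189 as printed by smartctl under Ubuntu.
smartctl_raw = 73018572834

# Split the integer into three 16-bit words, lowest word first.
smartctl_words = [(smartctl_raw >> shift) & 0xFFFF for shift in (0, 16, 32)]

# Raw bytes of attribute 189 as printed by the Kingston toolbox.
kingston_bytes = bytes.fromhex("20 00 3f 00 11 00 00")

# Interpret the first six bytes as three little-endian unsigned 16-bit words.
kingston_words = struct.unpack("<3H", kingston_bytes[:6])

print("smartctl words:", smartctl_words)
print("kingston words:", kingston_words)
```

Under that assumption the smartctl value decodes to 34, 63, 17 and the Kingston bytes to 32, 63, 17. Those line up with the temperature attribute 194 in each log (34 with Min/Max 17/63 in smartctl, and Current 32 High 63 Low 17 in the Kingston log), so attribute 189 here looks like a second temperature record rather than anything to do with fly height.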

There doesn't seem to be any crash dump data (which I would again see as pointing to an SSD/HDD issue), so I checked Event Viewer. This event shows up HUNDREDS of times in the half hour or so leading up to each of the first two crashes:
Atapi - Event ID:11 - The driver detected a controller error on \Device\Ide\IdePort0.
There are no IDE drives in this computer and I couldn't find \Device\Ide\IdePort0 in the regedit device map, but maybe this is unrelated?

I finally ran Memtest overnight last night and it showed 2 Errors
Test  Pass  Failing Address          Good      Bad       Count  CPU
7     4     00130512c30 - 4869.1MB   7fffffff  7bffffff  1      0
7     4     00170512c10 - 5893.1MB   7fffffff  7bffffff  2      0
I figure this is reason to RMA the RAM (sounds like a rap song lol), but I don't see how this could cause these particular BSODs. Is it possible, or is this also unrelated? Any other thoughts? Thank you in advance.
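As a small worked example of reading the Memtest rows, XOR-ing the Good and Bad columns shows exactly which bits came back wrong. This is only a sketch of the arithmetic, using the two values from the table above.

```python
# Expected and actual patterns from the Memtest error rows.
good = 0x7FFFFFFF
bad = 0x7BFFFFFF

# XOR leaves a 1 in every bit position where the two patterns differ.
diff = good ^ bad

# List the positions (0 = least significant bit) of the differing bits.
flipped_bits = [bit for bit in range(32) if (diff >> bit) & 1]

print(hex(diff), flipped_bits)
```

Both rows have the same Good/Bad pair, so both errors are the same single bit (bit 26) reading as 0 when it should be 1, at two different addresses.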

-Brad
 

plaintuts

Admirable
Well, for starters:

A lot of things can damage and corrupt data on an SSD or hard disk, but on the hardware side a failing power supply is the first suspect, since it may not deliver adequate continuous power to the drives.
So if you hear a whining sound from the PSU, that is likely it.

Second is the motherboard, which often just needs a BIOS update.

On the software side, Windows Update sometimes fixes common BSOD problems, but many faults trace back to plain daily use of Windows: invalid registry entries piling up, I/O conflicts, etc.

So the Event Viewer doesn't really offer us much.

But if you can replicate the scenario in which the BSODs happen, then we can properly diagnose your system.
 

Shadow351

I pulled a stick of Corsair XMS3 out of my computer and put it in this one to temporarily take care of that issue (until I can contact Patriot).

Upon reboot I had to run Windows Startup Repair again. This time it said it couldn't repair, but it did something, because Windows booted after that.

I checked in the BIOS and the voltages look good: 3.344 V, 5.160 V, and 12.144 V. The PSU is quiet, with no alarming noises. I don't really have any other way to check the PSU: I don't have a PSU tester, my own computer is a mITX build (and its PSU is questionable too, but that's a topic for another day), and the only other ATX PSU I have is a 200 W unit from an old P3 machine.
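For reference, the BIOS readings can be compared against the usual ATX tolerance of plus or minus 5% on the +3.3 V, +5 V and +12 V rails. The sketch below only does that arithmetic; the dictionary and function names are made up for illustration, and BIOS sensor readings taken at idle say nothing about how the rails behave under load.

```python
# Nominal rail voltages and the readings reported in the BIOS.
NOMINAL = {"+3.3V": 3.3, "+5V": 5.0, "+12V": 12.0}
READINGS = {"+3.3V": 3.344, "+5V": 5.160, "+12V": 12.144}

# ATX allows +/-5% on these three positive rails.
TOLERANCE = 0.05


def deviation(rail):
    """Fractional deviation of the reading from the nominal voltage."""
    return (READINGS[rail] - NOMINAL[rail]) / NOMINAL[rail]


def within_tolerance(rail):
    """True when the reading is inside the +/-5% window."""
    return abs(deviation(rail)) <= TOLERANCE


for rail in NOMINAL:
    verdict = "ok" if within_tolerance(rail) else "OUT OF RANGE"
    print(f"{rail}: {READINGS[rail]:.3f} V ({deviation(rail):+.1%}) {verdict}")
```

All three readings land inside the window, with the +5 V rail the furthest off at about 3.2% high.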

I updated the motherboard BIOS from 1.2 to 1.5.

Windows is set to auto-update, so no updates are currently available.

The blue screens seem to occur when the PC sits idle for several hours (I don't know whether it is when waking up from sleep or not).

I contacted Kingston and am currently waiting for a response to see what they say, but from what I read, 2 bad sectors are not a concern as long as the count doesn't increase.

We'll see if the issues continue. The next step, I think, is to reinstall Windows.

thanks
 
Look, the drive you posted the screenshot for is bad. To me it is now just a storage drive that you hook up to pull a file off when you need it, and then it stays unhooked until it is needed for something else. It's no longer a primary bootable drive in my book. I feel that as long as you're trying to use that drive, you're not going to get any further with your issue. [opinion]
 

Shadow351

Kingston responded: they are going to RMA the drive, so I will do that and go from there. I connected the drive to my computer and I am trying to retrieve the data off of it (I do have a backup from a couple of weeks ago).

Maybe the bad sectors are not the fatal problem, because I have the same model drive in my computer and, I just checked, it has 6 bad sectors but seems to work flawlessly. I'll keep an eye on it (and keep it backed up).
 
That may be true, but what was in those sectors when they went bad? Something important, or just junk or unused space? The way I look at it, if it's getting bad sectors, something is causing that, and it should run for life without that issue. [??] From what I read in that link I sent you, it's a mechanical issue with the drive, so why risk it? Yeah, it may go for 2 more years or it may go for 2 more days. It's just my call what I'd do about this; everyone does what he feels is best. See, the OS is trying to do something, the drive is having some issue, something goes wrong, and then BSOD. But you can only do as you see fit, and I can only say what I think to try to help you out in some way.

A bad sector is a sector on a computer's disk drive or flash memory that cannot be used due to permanent damage (or an OS inability to successfully access it), such as physical damage to the disk surface (or sometimes sectors being stuck in a magnetic or digital state that cannot be reversed) or failed flash memory transistors.