Seagate ST3120827AS 7200.7 Problem?

Anonymous
April 6, 2005 5:29:26 AM

Archived from groups: comp.sys.ibm.pc.hardware.storage (More info?)

Hi,

I've just bought a brand new Seagate 7200.7 hard disk. It seems to be
working fine but Active SMART reports that the Raw read error rate and
the ECC on the fly count are fluctuating like crazy.

They both started going up from 55 and went up to 62. Now they are back
to 60, changing every 5 minutes.
Do I have a bad disk? Should I get it replaced immediately?

Or is this normal on Seagate hard disks? I have never used Seagate
HDDs in the past so I don't have much experience with them.

Can anyone please help? I would be very grateful.

Thank you.
If more information is required, please let me know and I will provide
it.


***
....the Phoenix shall rise...
Anonymous
April 6, 2005 5:29:27 AM

Previously Phoenix AG <contact.me@in.the.ng> wrote:
> Hi,

> I've just bought a brand new Seagate 7200.7 hard disk. It seems to be
> working fine but Active SMART reports that the Raw read error rate and
> the ECC on the fly count are fluctuating like crazy.

> They both started going up from 55 and went upto 62. Now they are back
> to 60. Changing every 5 minutes.
> Do I have a bad disk? Should I get it replaced immediately?

> Or is this normal on Seagate hard disks? I have never used Seagate
> HDDs in the past so I don't have much experience with them.

AFAIK this behaviour is normal. I have seen similar readings on
Seagate for both attributes and on Samsung for "Hardware_ECC_Recovered".
Whether the actual numbers mean trouble depends on the threshold set
on the disk.

On a ST3120026A it looks like this here:

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   055   052   006    Pre-fail  Always       -       24109901
  3 Spin_Up_Time            0x0003   097   096   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       33
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   085   060   030    Pre-fail  Always       -       347404439
  9 Power_On_Hours          0x0032   088   088   000    Old_age   Always       -       11140
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       250
194 Temperature_Celsius     0x0022   047   056   000    Old_age   Always       -       47
195 Hardware_ECC_Recovered  0x001a   055   052   000    Old_age   Always       -       24109901
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   199   000    Old_age   Always       -       1
200 Multi_Zone_Error_Rate   0x0000   100   253   000    Old_age   Offline      -       0
202 TA_Increase_Count       0x0032   100   253   000    Old_age   Always       -       0

Raw_Read_Error_Rate only becomes a problem as its normalized VALUE
approaches the threshold of 6. Hardware_ECC_Recovered is an "old-age"
attribute, meaning it does not indicate imminent failure in any case.

One attribute that is critical and should be watched for an
increasing raw value is Reallocated_Sector_Ct, since it counts
the sectors that were found bad and remapped. Temperature_Celsius
is also important for making sure the disk lives long. My 47C
reading here is a bit too hot for real comfort, but the disk is
non-critical and cooling it better would be difficult.
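
The reading rules above (compare VALUE against THRESH, treat Pre-fail and Old_age differently, watch the raw Reallocated_Sector_Ct) can be sketched in a few lines of Python. This is only an illustration, not part of smartmontools: the names Attr, parse_row and assess are made up for the example, and the "watch" rule for reallocated sectors is an assumption based on the advice in this post.

```python
from typing import NamedTuple


class Attr(NamedTuple):
    attr_id: int
    name: str
    value: int    # normalized VALUE column
    worst: int
    thresh: int
    kind: str     # "Pre-fail" or "Old_age"
    raw: str      # RAW_VALUE column, kept as text


def parse_row(line: str) -> Attr:
    # Columns: ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
    f = line.split(None, 9)
    return Attr(int(f[0]), f[1], int(f[3]), int(f[4]), int(f[5]), f[6], f[9].strip())


def assess(a: Attr) -> str:
    # An attribute has failed once VALUE drops to THRESH or below;
    # a THRESH of 0 means it can never trip.
    if a.thresh > 0 and a.value <= a.thresh:
        return "FAILING" if a.kind == "Pre-fail" else "worn out"
    # Assumed extra rule: any remapped sector is worth keeping an eye on.
    if a.name == "Reallocated_Sector_Ct" and int(a.raw.split()[0]) > 0:
        return "watch"
    return "ok"


# Three rows copied from the table above.
ROWS = [
    "1 Raw_Read_Error_Rate 0x000f 055 052 006 Pre-fail Always - 24109901",
    "5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 0",
    "195 Hardware_ECC_Recovered 0x001a 055 052 000 Old_age Always - 24109901",
]

for row in ROWS:
    a = parse_row(row)
    print(f"{a.name}: margin {a.value - a.thresh}, {assess(a)}")
```

With the rows from the table, Raw_Read_Error_Rate still has a margin of 49 points above its threshold, which is why the fluctuation between 55 and 62 is not alarming by itself.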

Arno
Anonymous
April 6, 2005 6:22:51 PM

Thank you for the answer :) 

My Reallocated Sector Ct is 100, Worst value is 100, Thresh is 36.
After checking just now with Active SMART, it seems another value has
fluctuated. The Seek Error Rate, which was 100 earlier, has gone down
to 61, Worst value is 60, Thresh is 30. Is this another thing to worry
about?

Which program did you use to generate those numbers? I can't seem to
export a log from Active SMART. Maybe if I got the other program, I
could post my SMART details and you could see them?

Also, as earlier, the Raw read error rate and ECC count has fluctuated
by a value of about 2. It seems to be going up, rather than down.


***
....the Phoenix shall rise...
Anonymous
April 6, 2005 6:22:52 PM

Previously Phoenix AG <contact.me@in.the.ng> wrote:

> Thank you for the answer :) 

> My Reallocated Sector Ct is 100, Worst value is 100, Thresh is 36.

Actually interesting is the raw number. (Last one on the right in
my sample output)

> After checking just now with Active SMART, it seems another value has
> fluctuated. The Seek Error Rate, which was 100 earlier, has gone down
> to 61, Worst value is 60, Thresh is 30. Is this another thing to worry
> about?

Might be. Is your PSU o.k. and is the drive firmly mounted?

> Which program did you use to generate those numbers? I can't seem to
> export a log from Active SMART. Maybe if I got the other program, I
> could post my SMART details and you could see them?

"smartmontools", also ported to Windows. Available here:

http://smartmontools.sourceforge.net/

And the output above is not a log, but the direct output of the tool.
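
For anyone who wants to capture that output from a script, here is a minimal Python sketch around the smartctl command from smartmontools. The options used are real ones (-A prints only the attribute table, -a prints everything), but the device path /dev/sda and the helper names are assumptions for illustration. The right device name depends on the platform, and smartctl usually needs administrator rights.

```python
import subprocess
from typing import List


def smartctl_command(device: str, everything: bool = False) -> List[str]:
    # -A prints only the vendor attribute table; -a prints all SMART information.
    return ["smartctl", "-a" if everything else "-A", device]


def read_attributes(device: str) -> str:
    # smartctl reports its status as a bit mask in the exit code, so a
    # non-zero exit does not necessarily mean that nothing was printed.
    result = subprocess.run(smartctl_command(device), capture_output=True, text=True)
    return result.stdout


# Show the command line that would be run; /dev/sda is only a placeholder.
print(" ".join(smartctl_command("/dev/sda")))
```

Calling read_attributes("/dev/sda") returns the text so that it can be saved or posted, which is what Active SMART's missing export would otherwise be needed for.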

> Also, as earlier, the Raw read error rate and ECC count has fluctuated
> by a value of about 2. It seems to be going up, rather than down.

Well, maybe there is some external factor causing this...
But not necessarily. Could just be variations in usage
pattern.

Arno
Anonymous
April 6, 2005 11:35:31 PM

On 6 Apr 2005 11:33:54 GMT, Arno Wagner <me@privacy.net> wrote:


>> Thank you for the answer :) 
>
>> My Reallocated Sector Ct is 100, Worst value is 100, Thresh is 36.
>
>Actually interesting is the raw number. (Last one on the right in
>my sample output)
>
>> After checking just now with Active SMART, it seems another value has
>> fluctuated. The Seek Error Rate, which was 100 earlier, has gone down
>> to 61, Worst value is 60, Thresh is 30. Is this another thing to worry
>> about?
>
>Might be. Is your PSU o.k. and is the drive firmly mounted?

Well, I think so. On a separate note, my computer seems to crash
whenever I run any 3d graphics program on it, like a game or
something. It's just a month old, actually...and I've been trying to
sort out the problem. Have already changed my PSU 3 times, the RAM
twice and the graphics card once. It went away for a while, but now is
back in full force and I can't play any game at all without it
crashing after a half hour.

The drive seems firmly mounted.

Also, Active SMART now gives me a T.E.C. date of May 2005 for the Raw
Read Error rate. This definitely can't be good?

Thanks for the reply, once again.

Here is a complete log of the smartmontools output.

=== START OF INFORMATION SECTION ===
Device Model: ST3120827AS
Serial Number: 5MS07R09
Firmware Version: 3.42
User Capacity: 120,034,123,776 bytes
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: 6
ATA Standard is: ATA/ATAPI-6 T13 1410D revision 2
Local Time is: Wed Apr 06 19:31:48 2005 India Standard Time
SMART support is: Available - device has SMART capability.
SMART support is: Disabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                 ( 430) seconds.
Offline data collection
capabilities:                    (0x5b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        No General Purpose Logging support.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        (  71) minutes.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   061   052   006    Pre-fail  Always       -       228800583
  3 Spin_Up_Time            0x0003   097   097   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       6
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   062   060   030    Pre-fail  Always       -       1750450
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       27
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       8
194 Temperature_Celsius     0x0022   042   046   000    Old_age   Always       -       42 (Lifetime Min/Max 0/26)
195 Hardware_ECC_Recovered  0x001a   061   052   000    Old_age   Always       -       228800583
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0000   100   253   000    Old_age   Offline      -       0
202 TA_Increase_Count       0x0032   100   253   000    Old_age   Always       -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                   Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error        00%        10         -
# 2  Short offline       Completed without error        00%         8         -
# 3  Short offline       Completed without error        00%         8         -

SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

Hope this information is of some use.

Active SMART giving me a TEC date can't be good, can it?


***
....the Phoenix shall rise...
Anonymous
April 6, 2005 11:35:32 PM

Previously Phoenix AG <contact.me@in.the.ng> wrote:
> On 6 Apr 2005 11:33:54 GMT, Arno Wagner <me@privacy.net> wrote:


>>> Thank you for the answer :) 
>>
>>> My Reallocated Sector Ct is 100, Worst value is 100, Thresh is 36.
>>
>>Actually interesting is the raw number. (Last one on the right in
>>my sample output)
>>
>>> After checking just now with Active SMART, it seems another value has
>>> fluctuated. The Seek Error Rate, which was 100 earlier, has gone down
>>> to 61, Worst value is 60, Thresh is 30. Is this another thing to worry
>>> about?
>>
>>Might be. Is your PSU o.k. and is the drive firmly mounted?

> Well, I think so. On a separate note, my computer seems to crash
> whenever I run any 3d graphics program on it, like a game or
> something.

Aha!

> It's just a month old, actually...and I've been trying to
> sort out the problem. Have already changed my PSU 3 times, the RAM
> twice and the graphics card once. It went away for a while, but now is
> back in full force and I can't play any game at all without it
> crashing after a half hour.

That sounds very much like an overloaded PSU or inadequate cooling.
Both would affect a HDD as well. You should correct this problem
first, it may be the root cause.

What PSU do you have? And what CPU and graphics card?

> The drive seems firmly mounted.

Then that is not the cause.

> Also, Active SMART now gives me a T.E.C. date of May 2005 for the Raw
> Read Error rate. This definitely can't be good?

No idea, I don't use/know Active SMART.

> Thanks for the reply, once again.

> Here is a complete log of the smartmontools output.

> Hope this information is of some use.

Looks pretty normal to me. Maybe the seek-error rate is a bit high.

Arno
Anonymous
April 7, 2005 5:26:50 AM

On 6 Apr 2005 17:58:48 GMT, Arno Wagner <me@privacy.net> wrote:

>>>Might be. Is your PSU o.k. and is the drive firmly mounted?
>
>> Well, I think so. On a separate note, my computer seems to crash
>> whenever I run any 3d graphics program on it, like a game or
>> something.
>
>Aha!
>
>> It's just a month old, actually...and I've been trying to
>> sort out the problem. Have already changed my PSU 3 times, the RAM
>> twice and the graphics card once. It went away for a while, but now is
>> back in full force and I can't play any game at all without it
>> crashing after a half hour.
>
>That sounds very much like an overloaded PSU or inadequate cooling.
>Both would affect a HDD as well. You should correct this problem
>first, it may be the root cause.

Well, it's a 350W PSU. I have 2 SATA drives and an Intel 3.0GHz 530J CPU,
plus an ATI X700 Pro 128MB card. Also a DVD writer and a CD writer. Do
you think that's a little overloaded?

I have 3 fans in the cabinet. Earlier, only 1 fan used to do its job
and keep the system cabinet temperature at about 36C (I have a cabinet
with an LCD on its front which tells me the temp). After I installed
the 2nd drive, it constantly goes to about 38C, after which the
cabinet's other 2 side fans take over and start running and cool it
down to about 35-36.

I have tested the cooling issue quite a lot. The system used to run
quite cool earlier and still crash.

They gave me the best PSU they had in the store, I don't know the
make. But I had them change it thrice, and they assured me that this
one was the best. So it can't be that bad. Can't be the Ram, got that
changed and tested. Got the graphics card changed from a Connect3D to
a Powercolor card. That worked, the crashing stopped, but now its back
in full force.

In fact, I ran an AVG virus scan on my system right now and then I did
it another time. Both times, the system crashed.

At first I thought it might be the onboard sound, so I tried running
3D tests with no sound and even disabled the onboard sound. But that
didn't stop the crashing.

My other HDD (Samsung) gives great SMART numbers, no changes ever.
Except for maybe the temperature a little here and there.

Actually, I am going a bit nuts with all this crashing. It can't be
the PSU, I had that changed a lot of times and tested. Same with ram,
gfx card. Only things I can think of now are the motherboard or the
CPU, very unlikely as it may seem. The motherboard is a Gigabyte.

Now Active SMART gives me a TEC Date of Apr 2005. Which means it
predicts the drive is gonna fail this month. LOL. I think I should
definitely get it changed.
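
A note on where such a date might come from. How Active SMART actually computes its T.E.C. date is not known here. One plausible approach, shown purely as an assumption, is to fit a straight line through past normalized values and see when it reaches the threshold. The function name and the sample points below are hypothetical. The only numbers taken from this thread are the Seek_Error_Rate reading of 62 at 27 power-on hours and its THRESH of 30; the starting point of 100 at hour 0 is assumed.

```python
from typing import List, Optional, Tuple


def hours_until_threshold(samples: List[Tuple[float, float]],
                          thresh: float) -> Optional[float]:
    """Least-squares line through (hours, value) samples, extrapolated to thresh.

    Returns the number of hours after the last sample at which the fitted
    line reaches thresh, or None if the value is flat or rising.
    """
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        return None  # all samples taken at the same time: no trend to fit
    slope = sum((t - mean_t) * (v - mean_v) for t, v in samples) / var_t
    if slope >= 0:
        return None  # not heading towards the threshold
    crossing = mean_t + (thresh - mean_v) / slope
    return crossing - samples[-1][0]


# Assumed history for Seek_Error_Rate: 100 when new, 62 after 27 power-on hours.
eta = hours_until_threshold([(0.0, 100.0), (27.0, 62.0)], thresh=30.0)
print("no crossing predicted" if eta is None else f"about {eta:.0f} more power-on hours")
```

On a drive with only 27 power-on hours, an early drop like this extrapolates to a very near date, so a prediction of this kind can look alarming even while every attribute is still well above its threshold.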

It's been a hellish experience, getting a new comp :( 

***
....the Phoenix shall rise...
Anonymous
April 7, 2005 5:26:51 AM

According to http://www.jscustompcs.com/power_supply/ your system
should draw about 300W. What is the make and model of the PSU? Dodgy PSUs are
often from China and are very light in weight, as they skimp on heatsinks.
What is the CPU temp? Open the case, check that the cooler on the graphics card
is running, and feel the temp of the heatsink after crashes. If it is too hot to
touch, it is too hot. What is the spec/make of the memory? Unbranded memory can
be unreliable. Download the Microsoft memory tester and run it. Have you
overclocked the system?
Mike.
Anonymous
April 7, 2005 5:26:51 AM

Previously Phoenix AG <contact.me@in.the.ng> wrote:
> On 6 Apr 2005 17:58:48 GMT, Arno Wagner <me@privacy.net> wrote:

>>>>Might be. Is your PSU o.k. and is the drive firmly mounted?
>>
>>> Well, I think so. On a separate note, my computer seems to crash
>>> whenever I run any 3d graphics program on it, like a game or
>>> something.
>>
>>Aha!
>>
>>> It's just a month old, actually...and I've been trying to
>>> sort out the problem. Have already changed my PSU 3 times, the RAM
>>> twice and the graphics card once. It went away for a while, but now is
>>> back in full force and I can't play any game at all without it
>>> crashing after a half hour.
>>
>>That sounds very much like an overloaded PSU or inadequate cooling.
>>Both would affect a HDD as well. You should correct this problem
>>first, it may be the root cause.

> Well, it's a 350W PSU. I have 2 SATA drives and an Intel 3.0GHz 530J CPU,
> plus an ATI X700 Pro 128MB card. Also a DVD writer and a CD writer. Do
> you think that's a little overloaded?

Might be, especially if it is a low-quality PSU. In some tests
I have seen, some units reached less than 80% of their stated load
before failing. The PSU might also have a limit on 12V current
that is too low for your hardware. Older designs in particular
cannot supply the 12V load that a modern 3D card generates.

Also a PSU loaded nearly 100% is going to die fast. Usually
they are designed for something like 70% continuous load.
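
To put numbers on that, here is a small Python sketch using only figures from this thread: the 350W rating, the roughly 300W draw estimated earlier with the online calculator, and the 70% continuous-load rule of thumb. The function names are made up for the example, and the 300W figure is itself only an estimate.

```python
def load_fraction(draw_w: float, rated_w: float) -> float:
    # Share of the PSU's rated output that the system is estimated to draw.
    return draw_w / rated_w


def continuous_budget(rated_w: float, derate: float = 0.70) -> float:
    # Power the PSU can comfortably deliver continuously under the
    # "something like 70%" rule of thumb.
    return rated_w * derate


rated = 350.0      # PSU rating given in the thread
estimated = 300.0  # rough system draw from the calculator estimate

print(f"estimated load: {load_fraction(estimated, rated):.0%}")
print(f"70% budget: {continuous_budget(rated):.0f} W")
print(f"over budget by: {estimated - continuous_budget(rated):.0f} W")
```

By this rough estimate the system sits at about 86% of the rated output, well above a 245W continuous budget, and that is before allowing for a unit that cannot deliver its stated rating.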

> I have 3 fans in the cabinet. Earlier, only 1 fan used to do its job
> and keep the system cabinet temperature at about 36C (I have a cabinet
> with an LCD on its front which tells me the temp). After I installed
> the 2nd drive, it constantly goes to about 38C, after which the
> cabinet's other 2 side fans take over and start running and cool it
> down to about 35-36.

> I have tested the cooling issue quite a lot. The system used to run
> quite cool earlier and still crash.

Your numbers sound reasonable. Does not seem to be an overheated system.

> They gave me the best PSU they had in the store, I don't know the
> make. But I had them change it thrice, and they assured me that this
> one was the best. So it can't be that bad.

I would not trust this statement. I am curious what these people
think is a good brand for PSUs. What it cannot be is a single
faulty PSU out of a series of good ones. Or did the people in the shop
only tell you they changed it, but did nothing?

> Can't be the Ram, got that
> changed and tested. Got the graphics card changed from a Connect3D to
> a Powercolor card. That worked, the crashing stopped, but now its back
> in full force.

Typical for overload: The problem gets worse over time.

> In fact, I ran an AVG virus scan on my system right now and then I did
> it another time. Both times, the system crashed.

> At first I thought it might be the onboard sound, so I tried running
> 3D tests with no sound and even disabled the onboard sound. But that
> didn't stop the crashing.

No. Should not crash things.

> My other HDD (Samsung) gives great SMART numbers, no changes ever.
> Except for maybe the temperature a little here and there.

It may be more tolerant. But still, I think the Seagate is also running
reasonably. You should worry about the crashes first.

> Actually, I am going a bit nuts with all this crashing. It can't be
> the PSU, I had that changed a lot of times and tested.

It can. An average design run close to the design limit will
be unreliable. The 300W quote from Mike sounds reasonable.
With normal load at only 70% of max. capacity you should use at
least a 380W quality PSU. For not so good PSUs you might have
to go up to 450W or 480W for a stable system.
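
As a rough sketch of the headroom arithmetic above: the 300W estimate and the 70% continuous-load figure are taken from this thread, while the function names are made up for illustration and treating 70% as a hard sizing target is an assumption. Taken strictly, that target comes out somewhat above 380W, so the 380W figure is probably best read as a practical minimum for a quality unit rather than the result of this formula.

```python
def required_rating(load_watts: float, max_load_fraction: float) -> float:
    """Smallest PSU rating that keeps the load at or below the target fraction."""
    if not 0 < max_load_fraction <= 1:
        raise ValueError("max_load_fraction must be in (0, 1]")
    return load_watts / max_load_fraction


def load_fraction(load_watts: float, rating_watts: float) -> float:
    """Fraction of the PSU's rated output that the load uses."""
    if rating_watts <= 0:
        raise ValueError("rating_watts must be positive")
    return load_watts / rating_watts


# Figures from the thread: ~300W estimated draw, 350W PSU, 70% target load.
estimated_load = 300.0
print(round(required_rating(estimated_load, 0.70)))    # rating needed for 70% load
print(round(load_fraction(estimated_load, 350.0), 2))  # how hard the 350W unit works
```

With these inputs the 350W unit would be running at roughly 86% of its rating, which is above the 70% continuous load mentioned above.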

> Same with ram,
> gfx card. Only things I can think of now are the motherboard or the
> CPU, very unlikely as it may seem. The motherboard is a Gigabyte.

> Now Active SMART gives me a TEC Date of Apr 2005. Which means it
> predicts the drive is gonna fail this month. LOL. I think I should
> definitely get it changed.

I think you should cure the crashes first. The drive may be
perfectly fine. As long as the crashes are there, the drive
may just suffer some collateral effects.

> It's been a hellish experience, getting a new comp :( 

Sometimes. Quality components are key. PSUs are often
overlooked, but I once read that something like 40% of
all electronics equipment failures in the US Navy are
bad switching-mode PSUs (the design used in the PC).
And with modern CPUs and 3D cards drawing a lot on the
12V line, the design requirements also have changed
significantly and some PSU manufacturers are slow to adapt.
Classically, a lot more load was on 5V.

Arno
Anonymous
April 7, 2005 7:57:12 AM

On Wed, 6 Apr 2005 22:25:10 +0100, "Michael Hawes"
<michael.hawes1remove@tiscali.co.uk> wrote:

>
>"Phoenix AG" <contact.me@in.the.ng> wrote in message
>news:1112817413.35f2884a8088d576a632b0e39dd601b9@teranews...
>> On 6 Apr 2005 17:58:48 GMT, Arno Wagner <me@privacy.net> wrote:
>>
>> >>>Might be. Is your PSU o.k. and is the drive firmly mounted?
>> >
>> >> Well, I think so. On a separate note, my computer seems to crash
>> >> whenever I run any 3d graphics program on it, like a game or
>> >> something.
>> >
>> >Aha!
>> >
>> >> It's just a month old, actually...and I've been trying to
>> >> sort out the problem. Have already changed my PSU 3 times, the RAM
>> >> twice and the graphics card once. It went away for a while, but now is
>> >> back in full force and I can't play any game at all without it
>> >> crashing after a half hour.
>> >
>> >That sounds very much like overloaded PSU or inadequate cooling.
>> >Both would affect a HDD as well. You should correct this problem
>> >first, it may be the root cause.
>>
>> Well, its a 350w PSU. I have 2 SATA drives, an Intel 3.0ghz 530J CPU.
>> And an ATI X700 Pro 128mb card. Also, a dvd writer and a cd writer. Do
>> you think thats a little overloaded?
>>
>> I have 3 fans in the cabinet. Earlier, only 1 fan used to do its job
>> and keep the system cabinet temperature at about 36C (I have a cabinet
>> with an LCD on its front which tells me the temp). After I installed
>> the 2nd drive, it constantly goes to about 38C, after which the
>> cabinet's other 2 side fans take over and start running and cool it
>> down to about 35-36.
>>
>> I have tested the cooling issue quite a lot. The system used to run
>> quite cool earlier and still crash.
>>
>> They gave me the best PSU they had in the store, I don't know the
>> make. But I had them change it thrice, and they assured me that this
>> one was the best. So it can't be that bad. Can't be the Ram, got that
>> changed and tested. Got the graphics card changed from a Connect3D to
>> a Powercolor card. That worked, the crashing stopped, but now its back
>> in full force.
>>
>> In fact, I ran an AVG virus scan on my system right now and then I did
>> it another time. Both times, the system crashed.
>>
>> At first I thought it might be the onboard sound, so I tried running
>> 3D tests with no sound and even disabled the onboard sound. But that
>> didn't stop the crashing.
>>
>> My other HDD (Samsung) gives great SMART numbers, no changes ever.
>> Except for maybe the temperature a little here and there.
>>
>> Actually, I am going a bit nuts with all this crashing. It can't be
>> the PSU, I had that changed a lot of times and tested. Same with ram,
>> gfx card. Only things I can think of now are the motherboard or the
>> CPU, very unlikely as it may seem. The motherboard is a Gigabyte.
>>
>> Now Active SMART gives me a TEC Date of Apr 2005. Which means it
>> predicts the drive is gonna fail this month. LOL. I think I should
>> definitely get it changed.
>>
>> It's been a hellish experience, getting a new comp :( 
>>
>> >What PSU do you have? And what CPU and graphics card?
>> >
>> >> The drive seems firmly mounted.
>> >
>> >Then that is not the cause.
>> >
>> >> Also, Active SMART now gives me a T.E.C. date of May 2005 for the Raw
>> >> Read Error rate. This definitely can't be good?
>> >
>> >No idea, I don't use/know Active SMART.
>> >
>> >>>> Which program did you use to generate those numbers? I can't seem to
>> >>>> export a log from Active SMART. Maybe if I got the other program, I
>> >>>> could post my SMART details and you could see them?
>> >>>
>> >>>"smartmontools", also ported to Windows. Available here:
>> >>>
>> >>>http://smartmontools.sourceforge.net/
>> >>>
>> >>>And the output above is not a log, but the output.
>> >>>
>> >>>> Also, as earlier, the Raw read error rate and ECC count has
>> >>>> fluctuated by a value of about 2. It seems to be going up, rather
>> >>>> than down.
>> >>>
>> >>>Well, maybe there is some external factor causing this...
>> >>>But not necessarily. Could just be variations in usage
>> >>>pattern.
>> >>>
>> >>>Arno
>> >>>
>> >
>>
>>
>>
>> ***
>> ...the Phoenix shall rise...
>
> according to http://www.jscustompcs.com/power_supply/ your system
>should draw about 300W. What is make and model of PSU? Dodgy PSUs are often
>from China and are very light in weight, as they skimp on heatsinks. What is
>CPU temp? Open case and check cooler on graphics card is running and feel
>temp of heatsink after crashes. If too hot to touch, is too hot. What is
>spec/make of memory. Unbranded memory can be unreliable. Download Microsoft
>memory tester and run it. Have you overclocked the system?
> Mike.
>

Hi, thanks for the reply :) 

I am not sure about the make and model of the PSU. I will open up the
case tomorrow and let you guys know. I did think it was a dodgy PSU as
I have experienced crashes etc. with a bad PSU. But I have changed
it quite a lot now and still no difference. I guess I can ask for it
to be changed again.

The CPU temp on the normal load seems to be about 47-51 C. I haven't
checked it after playing a game. I will also check the CPU heatsink,
that's a great idea.
The fan on the graphics card is running fine.

The memory is PC3200 400mhz memory from Kingston. It's a 512MB stick.

I haven't overclocked anything. Everything is running at defaults.

In fact, to eliminate a software problem, I just formatted my system
recently.

I am now going to let it run for the night converting a load of mp3
files into a different format. Just to test if the CPU is the problem,
and not the graphics card.
Because now I can create a reproducible crash when I try to run AVG's
system scan.

It feels so bad, having some games I want to play and having a good
system to play them. And not being able to play :(  Hehe.. :( 


***
....the Phoenix shall rise...
Anonymous
April 8, 2005 6:20:23 AM

On Fri, 8 Apr 2005 04:34:56 +1000, "Rod Speed" <rod_speed@yahoo.com>
wrote:

>Yeah, very likely given that you said elsewhere that its an Intel
>3.0ghz 530J CPU. They're a tad notorious for that currently.

Are you sure about this? You have any links on the internet where I
can check it out? Because I am going to look a little stupid trying to
convince the store guy that the 3.0 ghz CPU is buggy.

On a separate note, the Spin up time on the HDD dropped to 96.

Well, some guy told me to disable HT and see. So I disabled it and it
crashed after a really long time. Like, if it was crashing in 20
minutes on load, it took about an hour to crash.
I am also using a Gigabyte motherboard and now someone tells me that
those are also known to crash???

Is everything known to crash or have I just been really unlucky?


***
....the Phoenix shall rise...
Anonymous
April 8, 2005 11:53:20 AM

Phoenix AG <contact.me@in.the.ng> wrote in message
news:1112907030.3c8693cb180a8cc6a2ca71ebfc9a7592@teranews...
> Rod Speed <rod_speed@yahoo.com> wrote

>> Yeah, very likely given that you said elsewhere that its an Intel
>> 3.0ghz 530J CPU. They're a tad notorious for that currently.

> Are you sure about this?

Yep.

> You have any links on the internet where I can check it out?
> Because I am going to look a little stupid trying to convince
> the store guy that the 3.0 ghz CPU is buggy.

Just monitor the cpu temp with something like SpeedFan,
it should be obvious that you are seeing the problem
when the cpu temp is getting too high.

The trick with those intel cpus is the inlet air temp. Its best to
get the air from the outside of the case with those, otherwise
the cpu temp does get too high when working hard when you
use the higher temp air from the inside of the case for the cpu.

> On a separate note, the Spin up time on the HDD dropped to 96.

> Well, some guy told me to disable HT and see. So I disabled
> it and it crashed after a really long time. Like, if it was crashing
> in 20 minutes on load, it took about an hour to crash.

It would be interesting to monitor the cpu temp in those two configs.

Don't bother with the temperature shown in the bios, the cpu isn't being worked there.

> I am also using a Gigabyte motherboard and now
> someone tells me that those are also known to crash???

Its bullshit.

> Is everything known to crash or have I just been really unlucky?

The evidence you have is that working the cpu hard causes it
to crash, and I bet you will see a correlation with the cpu temp
and thats not hard to fix by using a case which allows outside
air to be used to cool the cpu.
Anonymous
April 8, 2005 6:18:38 PM

On Fri, 8 Apr 2005 07:53:20 +1000, "Rod Speed" <rod_speed@yahoo.com>
wrote:

>
>Phoenix AG <contact.me@in.the.ng> wrote in message
>news:1112907030.3c8693cb180a8cc6a2ca71ebfc9a7592@teranews...
>> Rod Speed <rod_speed@yahoo.com> wrote
>
>>> Yeah, very likely given that you said elsewhere that its an Intel
>>> 3.0ghz 530J CPU. They're a tad notorious for that currently.
>
>> Are you sure about this?
>
>Yep.
>
>> You have any links on the internet where I can check it out?
>> Because I am going to look a little stupid trying to convince
>> the store guy that the 3.0 ghz CPU is buggy.
>
>Just monitor the cpu temp with something like SpeedFan,
>it should be obvious that you are seeing the problem
>when the cpu temp is getting too high.

Thank you for the answer, everyone. I am sorry for taking this thread
off topic myself :) 
Well, I have great news :)  I finally managed to solve the problem :) 

Gigabyte has this utility called EasyTune to monitor the motherboard.
I used that. With HT, the temperature did not go above 59C while the
CPU was on full load (AVG virus scan, whole system + conversion of 500
mp3 files).
It crashed in 20 mins, while the temp was still at 59C. Which was
definitely weird, as I was so sure that the CPU was overheating.

Then I disabled HT and did the same test. Monitored the temp. It did
not go beyond 58C. Worked for 1 hour. I had almost thought I had
solved the problem when CRASH! It was really disappointing :( 

Then there is the CPU Enhanced Halt (C1E) setting in my BIOS. This is
supposed to save power and reduce the heat the CPU puts out. I
enabled and disabled it, and nothing changed.

Finally, I searched around on google and I found a guy on some gamer
forum who was having EXACTLY the same problem I was. He solved it by
changing his motherboard. And I also read that the 530J was supported
only by v1002 of some motherboard.

So then I looked at the motherboard and tried to see what version it
was. Definitely not 1001 or anything along those lines. The BIOS version
was F1, while the current version on the Gigabyte site is F4. And F4 adds
support for EIST CPUs, whatever that means.

So I flashed the BIOS and turned on HT and everything. And it worked
:) 
After it finished converting the files and running the whole scan, it's
been running 3DMark03 in a constant loop the whole night. And it
hasn't crashed :) 

Thank you all for your suggestions. All of them helped in narrowing
down the problem.

>The trick with those intel cpus is the inlet air temp. Its best to
>get the air from the outside of the case with those, otherwise
>the cpu temp does get too high when working hard when you
>use the higher temp air from the inside of the case for the cpu.
>
>> On a separate note, the Spin up time on the HDD dropped to 96.
>
>> Well, some guy told me to disable HT and see. So I disabled
>> it and it crashed after a really long time. Like, if it was crashing
>> in 20 minutes on load, it took about an hour to crash.
>
>It would be interesting to monitor the cpu temp in those two configs.
>
>Dont bother with the bios temperature, thats not working the cpu.
>
>> I am also using a Gigabyte motherboard and now
>> someone tells me that those are also known to crash???
>
>Its bullshit.
>
>> Is everything known to crash or have I just been really unlucky?
>
>The evidence you have is that working the cpu hard causes it
>to crash, and I bet you will see a correlation with the cpu temp
>and thats not hard to fix by using a case which allows outside
>air to be used to cool the cpu.
>

Frankly, I couldn't have agreed with you more. If you look at it
logically, thats what it seems. The solution turned out to be quite
unexpected, but a pleasant surprise :) 

On the HDD, the Spin Up Time went to 97. Seek error rate went to 64.
And now it shows a TEC Date of Apr 2005 next to Raw read error rate,
which is at 60, with a worst value of 52.

Should I wait and see if this HD fails? I already have my important
data on the other drive. Or should I exchange it immediately? Or
should I live with it because its normal?

Thanks :-)


***
....the Phoenix shall rise...
Anonymous
April 9, 2005 8:09:59 AM

"Phoenix AG" <contact.me@in.the.ng> wrote in message
news:1112950124.6b5662187c49f09a0e8001c6135869e0@teranews...
> On Fri, 8 Apr 2005 07:53:20 +1000, "Rod Speed" <rod_speed@yahoo.com>
> wrote:
>
>>
>>Phoenix AG <contact.me@in.the.ng> wrote in message
>>news:1112907030.3c8693cb180a8cc6a2ca71ebfc9a7592@teranews...
>>> Rod Speed <rod_speed@yahoo.com> wrote
>>
>>>> Yeah, very likely given that you said elsewhere that its an Intel
>>>> 3.0ghz 530J CPU. They're a tad notorious for that currently.
>>
>>> Are you sure about this?
>>
>>Yep.
>>
>>> You have any links on the internet where I can check it out?
>>> Because I am going to look a little stupid trying to convince
>>> the store guy that the 3.0 ghz CPU is buggy.
>>
>>Just monitor the cpu temp with something like SpeedFan,
>>it should be obvious that you are seeing the problem
>>when the cpu temp is getting too high.
>
> Thank you for the answer, everyone. I am sorry for taking this thread
> off topic myself :) 
> Well, I have great news :)  I finally managed to solve the problem :) 

> Gigabyte has this utility called EasyTune to monitor the motherboard.
> I used that. With HT, the temperature did not go above 59C while the
> CPU was on full load (AVG virus scan, whole system + conversion of 500
> mp3 files).
> It crashed in 20 mins, while the temp was still at 59C. Which was
> definitely weird, as I was so sure that the CPU was overheating.
>
> Then I disabled HT and did the same test. Monitored the temp. It did
> not go beyond 58C. Worked for 1 hour. I had almost thought I had
> solved the problem when CRASH! It was really disappointing :( 
>
> Then my BIOS has that CPU Enhanced Halt (C1E) setting. This is
> supposed to power save and dissipate heat from the CPU. I
> enabled/disabled this, nothing happened.
>
> Finally, I searched around on google and I found a guy on some gamer
> forum who was having EXACTLY the same problem I was. He solved it by
> changing his motherboard. And I also read that the 530J was supported
> only by v1002 of some motherboard.
>
> So then I looked at the motherboard and tried to see what version it
> was. Definitely not 1001 or anything along that. The version was F1,
> while the current version on the Gigabyte site is F4. And F4 adds
> support for EIST CPUs, whatever that means.
>
> So I flashed the BIOS and turned on HT and everything. And it worked
> :) 
> After finishing to convert the files and running the whole scan, its
> been running 3DMark03 in a constant loop the whole night. And it
> hasn't crashed :) 
>
> Thank you all for your suggestions. All of them helped in narrowing
> down the problem.

Thanks for the feedback, too rare IMO.

That also confirms why I avoid Gigabyte motherboards,
they release them to the field too early in my opinion,
you see FAR too many rev levels at the physical board
level and too many bios flashes too.

>>The trick with those intel cpus is the inlet air temp. Its best to
>>get the air from the outside of the case with those, otherwise
>>the cpu temp does get too high when working hard when you
>>use the higher temp air from the inside of the case for the cpu.
>>
>>> On a separate note, the Spin up time on the HDD dropped to 96.
>>
>>> Well, some guy told me to disable HT and see. So I disabled
>>> it and it crashed after a really long time. Like, if it was crashing
>>> in 20 minutes on load, it took about an hour to crash.
>>
>>It would be interesting to monitor the cpu temp in those two configs.
>>
>>Dont bother with the bios temperature, thats not working the cpu.
>>
>>> I am also using a Gigabyte motherboard and now
>>> someone tells me that those are also known to crash???
>>
>>Its bullshit.
>>
>>> Is everything known to crash or have I just been really unlucky?
>>
>>The evidence you have is that working the cpu hard causes it
>>to crash, and I bet you will see a correlation with the cpu temp
>>and thats not hard to fix by using a case which allows outside
>>air to be used to cool the cpu.

> Frankly, I couldn't have agreed with you more. If you look
> at it logically, thats what it seems. The solution turned out
> to be quite unexpected, but a pleasant surprise :) 

Yeah, its always logical when you do end up nailing it.

> On the HDD, the Spin Up Time went to 97. Seek error rate
> went to 64. And now it says TEC Date is Apr 2005 in front of
> Raw read error rate. and that is 60, with a worst value of 52.

> Should I wait and see if this HD fails?

I doubt they'll exchange it on that basis alone.

> I already have my important data on the other drive.

Then you should have it completely backed up.

Its never a good idea to rely on a hard drive not failing.

DVD burners are so cheap now that there is no excuse.

> Or should I exchange it immediately? Or
> should I live with it because its normal?

Not really possible to say because you'd need to know if that
sort of variation is common with that particular drive model.

I'd certainly be concerned, but it may be normal with that model.
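
For anyone who wants to watch these attributes outside Active SMART, here is a minimal sketch that reads the attribute table in the layout smartmontools printed earlier in this thread (the kind of table `smartctl -A` produces). The two sample rows are the ones posted for the ST3120026A, not the 7200.7 in question, and the function names are made up for illustration. It relies on the usual SMART convention that an attribute counts as failed once its normalized VALUE drops to or below THRESH, so a VALUE that wanders between 55 and 62 against a threshold of 6 still has a wide margin.

```python
# Sample rows copied from the smartmontools output posted earlier in the thread.
SAMPLE = """\
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 055 052 006 Pre-fail Always - 24109901
3 Spin_Up_Time 0x0003 097 096 000 Pre-fail Always - 0
"""


def parse_attributes(text):
    """Parse whitespace-separated attribute rows; skip the header and other lines."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows start with a numeric attribute ID and have at least 10 columns.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        rows.append({
            "id": int(fields[0]),
            "name": fields[1],
            "value": int(fields[3]),
            "worst": int(fields[4]),
            "thresh": int(fields[5]),
            "raw": " ".join(fields[9:]),
        })
    return rows


def margin(attr):
    """Distance between the normalized value and the failure threshold."""
    return attr["value"] - attr["thresh"]


for attr in parse_attributes(SAMPLE):
    status = "FAILED" if attr["value"] <= attr["thresh"] else "ok"
    print(f'{attr["name"]}: value={attr["value"]} worst={attr["worst"]} '
          f'thresh={attr["thresh"]} margin={margin(attr)} {status}')
```

Logging the margin once a day would show whether the value is only fluctuating or really trending toward the threshold.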
Anonymous
April 9, 2005 8:10:00 AM

On Sat, 9 Apr 2005 04:09:59 +1000, "Rod Speed" <rod_speed@yahoo.com>
wrote:

>Thanks for the feedback, too rare IMO.
>
>That also confirms why I avoid Gigabyte motherboards,
>they release them to the field too early in my opinion,
>you see FAR too many rev levels at the physical board
>level and too many bios flashes too.

Yes, this is my first experience with Gigabyte's boards too. I have
always used Asus. Frankly, except for this Bios flashing, and the
trouble and grey hair inducing nightmares I've had with all this
crashing, I find it to be quite decent.
It has absolutely excellent features, has overclocking a baby can do,
support for DDR and DDR2 ram, support for RAID, in fact, support for
everything under the sun :) 
Flashing the Bios was really simple, especially considering I've never
done it before.

But yeah, they do need to test them out thoroughly before releasing
them.

>>>The trick with those intel cpus is the inlet air temp. Its best to
>>>get the air from the outside of the case with those, otherwise
>>>the cpu temp does get too high when working hard when you
>>>use the higher temp air from the inside of the case for the cpu.
>>>
>>>> On a separate note, the Spin up time on the HDD dropped to 96.
>>>
>>>> Well, some guy told me to disable HT and see. So I disabled
>>>> it and it crashed after a really long time. Like, if it was crashing
>>>> in 20 minutes on load, it took about an hour to crash.
>>>
>>>It would be interesting to monitor the cpu temp in those two configs.
>>>
>>>Dont bother with the bios temperature, thats not working the cpu.
>>>
>>>> I am also using a Gigabyte motherboard and now
>>>> someone tells me that those are also known to crash???
>>>
>>>Its bullshit.
>>>
>>>> Is everything known to crash or have I just been really unlucky?
>>>
>>>The evidence you have is that working the cpu hard causes it
>>>to crash, and I bet you will see a correlation with the cpu temp
>>>and thats not hard to fix by using a case which allows outside
>>>air to be used to cool the cpu.
>
>> Frankly, I couldn't have agreed with you more. If you look
>> at it logically, thats what it seems. The solution turned out
>> to be quite unexpected, but a pleasant surprise :) 
>
>Yeah, its always logical when you do end up nailing it.
>
>> On the HDD, the Spin Up Time went to 97. Seek error rate
>> went to 64. And now it says TEC Date is Apr 2005 in front of
>> Raw read error rate. and that is 60, with a worst value of 52.
>
>> Should I wait and see if this HD fails?
>
>I doubt they'll exchange it on that basis alone.

Oh, I am sure they will exchange it when it fails. What I am worried
about is whether they'll exchange it just on my word that its faulty.
After all, I am having no problems with it at all except for the SMART
errors.

>> I already have my important data on the other drive.
>
>Then you should have it completely backed up.
>
>Its never a good idea to rely on a hard drive not failing.
>
>DVD burners are so cheap now that there is no excuse.

Yeah, I do have a dvd burner. But somehow, due to fate's unkindness
towards me, it doesn't work very well. LOL. It doesn't seem to read
half the CDs and doesn't burn half the time. Otherwise, its a good
fast burner. I am planning to ditch it and buy a new dual layer one as
soon as I get some cash.

>> Or should I exchange it immediately? Or
>> should I live with it because its normal?
>
>Not really possible to say because you'd need to know if that
>sort of variation is common with that particular drive model.
>
>I'd certainly be concerned, but it may be normal with that model.
>

Yes, well, I am not sure what to do. I can go through the
unpleasantness of getting the drive exchanged, arguing with the
dealer...Or I can sit it out and use it...waiting for it to fail, if
it does.


***
....the Phoenix shall rise...
Anonymous
April 9, 2005 9:54:27 AM

Phoenix AG <contact.me@in.the.ng> wrote in message
news:1112987017.94d3ac4db3ed5aeb78f245f6a332ad5b@teranews...
> Rod Speed <rod_speed@yahoo.com> wrote

>> Thanks for the feedback, too rare IMO.

>> That also confirms why I avoid Gigabyte motherboards,
>> they release them to the field too early in my opinion,
>> you see FAR too many rev levels at the physical board
>> level and too many bios flashes too.

> Yes, this is my first experience with Gigabyte's
> boards too. I have always used Asus.

Yeah, that's what I normally use.

> Frankly, except for this Bios flashing, and the trouble
> and grey hair inducing nightmares I've had with all
> this crashing, I find it to be quite decent.

Sure, they normally do get their act into gear eventually, but
I'm not interested in wasting my time with the level of hassle
you experienced, they should have sorted that particular
problem out before it was released. It aint rocket science.

And should have made it a lot clearer that a later bios flash
fixed that particular problem too. In fact it should have been
announced very unambiguously indeed on the support page
for that particular motherboard so you could have saved
yourself an unbelievable amount of time by just checking
that when it was clear that the system had a problem
initially. In spades with your supplier.

> It has absolutely excellent features, has overclocking a
> baby can do, support for DDR and DDR2 ram, support
> for RAID, in fact, support for everything under the sun :) 

Sure, their products generally are quite good feature-wise,
the problem is that they get released too early before all
the warts are excised, and they are much too coy about
admitting to problems that have been fixed.

> Flashing the Bios was really simple, specially
> considering I've never done it before.

> But yeah, they do need to test them
> out thoroughly before releasing them.

Yeah, no excuse for not testing that particular situation that bit you.

Presumably they do it that way to get the jump on the competition.

>>>>The trick with those intel cpus is the inlet air temp. Its best to
>>>>get the air from the outside of the case with those, otherwise
>>>>the cpu temp does get too high when working hard when you
>>>>use the higher temp air from the inside of the case for the cpu.
>>>>
>>>>> On a separate note, the Spin up time on the HDD dropped to 96.
>>>>
>>>>> Well, some guy told me to disable HT and see. So I disabled
>>>>> it and it crashed after a really long time. Like, if it was crashing
>>>>> in 20 minutes on load, it took about an hour to crash.
>>>>
>>>>It would be interesting to monitor the cpu temp in those two configs.
>>>>
>>>>Dont bother with the bios temperature, thats not working the cpu.
>>>>
>>>>> I am also using a Gigabyte motherboard and now
>>>>> someone tells me that those are also known to crash???
>>>>
>>>>Its bullshit.
>>>>
>>>>> Is everything known to crash or have I just been really unlucky?
>>>>
>>>>The evidence you have is that working the cpu hard causes it
>>>>to crash, and I bet you will see a correlation with the cpu temp
>>>>and thats not hard to fix by using a case which allows outside
>>>>air to be used to cool the cpu.
>>
>>> Frankly, I couldn't have agreed with you more. If you look
>>> at it logically, thats what it seems. The solution turned out
>>> to be quite unexpected, but a pleasant surprise :) 
>>
>>Yeah, its always logical when you do end up nailing it.
>>
>>> On the HDD, the Spin Up Time went to 97. Seek error rate
>>> went to 64. And now it says TEC Date is Apr 2005 in front of
>>> Raw read error rate. and that is 60, with a worst value of 52.
>>
>>> Should I wait and see if this HD fails?
>>
>>I doubt they'll exchange it on that basis alone.

> Oh, I am sure they will exchange it when it fails.

Yeah, I meant exchange it now with just those symptoms, before it fails.

> What I am worried about is whether they'll
> exchange it just on my word that its faulty.

Thats what I doubt they will do.

> After all, I am having no problems with
> it at all except for the SMART errors.

Yep, thats why they are unlikely to agree that its defective.

And I wouldn't just send it back to Seagate and expect to get
a better drive back either, the replacement could well be another drive
returned for the same reason and found to be fine by Seagate.

>>> I already have my important data on the other drive.

>> Then you should have it completely backed up.

>> Its never a good idea to rely on a hard drive not failing.

>> DVD burners are so cheap now that there is no excuse.

> Yeah, I do have a dvd burner. But somehow, due to fate's
> unkindness towards me, it doesn't work very well. LOL.

You were warned about that grave dancing, you wouldnt listen |-(

> It doesn't seem to read half the CDs and doesn't burn half the time.
> Otherwise, its a good fast burner. I am planning to ditch it and buy
> a new dual layer one as soon as I get some cash.

Sure, but I'd still backup that data with that one
given that the SMART data is a bit of a worry.

>>> Or should I exchange it immediately? Or
>>> should I live with it because its normal?

>> Not really possible to say because you'd need to know if that
>> sort of variation is common with that particular drive model.

>> I'd certainly be concerned, but it may be normal with that model.

> Yes, well, I am not sure what to do.

I'd backup the data with the current DVD burner, more than
one copy, and keep monitoring the SMART data myself.

> I can go through the unpleasantness of getting
> the drive exchanged, arguing with the dealer...

Only you can really say how likely it is that you
can monster them using just the SMART data.

I would do that if you can get another unused drive.

> Or I can sit it out and use it...waiting for it to fail, if it does.

I wouldnt do that without a backup of that data.
Even if I had to buy a new burner on the credit card.
Anonymous
April 9, 2005 9:54:28 AM

On Sat, 9 Apr 2005 05:54:27 +1000, "Rod Speed" <rod_speed@yahoo.com>
wrote:

>
>Phoenix AG <contact.me@in.the.ng> wrote in message
>news:1112987017.94d3ac4db3ed5aeb78f245f6a332ad5b@teranews...
>> Rod Speed <rod_speed@yahoo.com> wrote
>
>>> Thanks for the feedback, too rare IMO.
>
>>> That also confirms why I avoid Gigabyte motherboards,
>>> they release them to the field too early in my opinion,
>>> you see FAR too many rev levels at the physical board
>>> level and too many bios flashes too.
>
>> Yes, this is my first experience with Gigabyte's
>> boards too. I have always used Asus.
>
>Yeah, that's what I normally use.
>
Yeah, they have always been rock solid, Asus. My old P4 1.6 runs
great, hasn't ever had a problem.

>> Frankly, except for this Bios flashing, and the trouble
>> and grey hair inducing nightmares I've had with all
>> this crashing, I find it to be quite decent.
>
>Sure, they normally do get their act into gear eventually, but
>I'm not interested in wasting my time with the level of hassle
>you experienced, they should have sorted that particular
>problem out before it was released. It aint rocket science.
>
>And should have made it a lot clearer that a later bios flash
>fixed that particular problem too. In fact it should have been
>announced very unambiguously indeed on the support page
>for that particular motherboard so you could have saved
>yourself an unbelievable amount of time by just checking
>that when it was clear that the system had a problem
>initially. In spades with your supplier.

Exactly. I would have expected the bios update to say clearly what it
updates and fixes. But all it gave was a line or 2 about some
stupid acronym I've never heard of in my life.
Would have expected better.

>> It has absolutely excellent features, has overclocking a
>> baby can do, support for DDR and DDR2 ram, support
>> for RAID, in fact, support for everything under the sun :) 
>
>Sure, their products generally are quite good feature wise,
>the problem is that they get released too early before all
>the warts are excised, and they are much too coy about
>admitting to problems that have been fixed.
>
>> Flashing the Bios was really simple, specially
>> considering I've never done it before.
>
>> But yeah, they do need to test them
>> out thoroughly before releasing them.
>
>Yeah, no excuse for not testing that particular situation that bit you.
>
>Presumably they do it that way to get the jump on the competition.

Exactly :) 

>>>> Should I wait and see if this HD fails?
>>>
>>>I doubt they'll exchange it on that basis alone.
>
>> Oh, I am sure they will exchange it when it fails.
>
>Yeah, I meant exchange it now with just those symptoms, before it fails.

Yes. In fact, the more I think about it, the more I am sure they
won't. I'll take it back to them, he'll pop it into his computer, test
it and tell me it's running fine. I am sure half of them won't know
what SMART even is.

>> What I am worried about is whether they'll
>> exchange it just on my word that its faulty.
>
>Thats what I doubt they will do.
>
>> After all, I am having no problems with
>> it at all except for the SMART errors.
>
>Yep, thats why they are unlikely to agree that its defective.
>
>And I wouldnt just send it back to Seagate and just expect to
>get a different drive back either, it could well be another drive
>returned for the same reason and found to be fine by Seagate.

Yeah, didn't think of that. Well, guess I am ok with this drive for
now.

>>>> I already have my important data on the other drive.
>
>>> Then you should have it completely backed up.
>
>>> Its never a good idea to rely on a hard drive not failing.
>
>>> DVD burners are so cheap now that there is no excuse.
>
>> Yeah, I do have a dvd burner. But somehow, due to fate's
>> unkindness towards me, it doesn't work very well. LOL.
>
>You were warned about that grave dancing, you wouldnt listen |-(

LOL. Well, 2005 has not been a very good year for my computers.
Everything seems to be a bit shaky. I'm gonna go tomorrow and get my
dvd burner fixed too, if they can do it. It's still under warranty so
hopefully they should. I actually won this one; otherwise I would
never have got it. It's a Samsung and they are notorious for not
reading CD-Rs and a lot of other media.

>> It doesn't seem to read half the CDs and doesn't burn half the time.
>> Otherwise, its a good fast burner. I am planning to ditch it and buy
>> a new dual layer one as soon as I get some cash.
>
>Sure, but I'd still backup that data with that one
>given that the SMART data is a bit of a worry.

Yeah, I've been backing up stuff. Not much to backup, though. It's my
2nd drive and my 2nd drive has mostly the dump of all things. Like
Joey episodes, some movies, some games, temp space for everything,
other misc stuff downloaded, games installed, etc.

It won't be that big a loss if this drive goes, but yes, it'll be
annoying because it'll mean a loss of a lot of time. And game saves :D 
Which is never good...

>>>> Or should I exchange it immediately? Or
>>>> should I live with it because its normal?
>
>>> Not really possible to say because you'd need to know if that
>>> sort of variation is common with that particular drive model.
>
>>> I'd certainly be concerned, but it may be normal with that model.
>
>> Yes, well, I am not sure what to do.
>
>I'd backup the data with the current DVD burner, more than
>one copy, and keep monitoring the SMART data myself.
>
>> I can go through the unpleasantness of getting
>> the drive exchanged, arguing with the dealer...
>
>Only you can really say how likely it is that you
>can monster them using just the SMART data.
>
>I would do that if you can get another unused drive.
>
>> Or I can sit it out and use it...waiting for it to fail, if it does.
>
>I wouldnt do that without a backup of that data.
>Even if I had to buy a new burner on the credit card.
>



***
....the Phoenix shall rise...