[plug] server failing with bizarre disk errors
Jon Miller
jlmiller at mmtnetworks.com.au
Wed Apr 9 19:40:22 WST 2003
lol, one good point about SCSI and RAID: they do last a long time.
Jon L. Miller, MCNE, CNS
Director/Sr Systems Consultant
MMT Networks Pty Ltd
http://www.mmtnetworks.com.au
"I don't know the key to success, but the key to failure
is trying to please everybody." -Bill Cosby
>>> craig at postnewspapers.com.au 7:02:21 PM 9/04/2003 >>>
Jon: looks like you might be right about the bad disk after all. I
finally thought of running the SMART diagnostics:
( 5)Reallocated Sector Ct 0x0033 196 196 140 54
on hda. That's not death on a HDD but it sure isn't right, given:
( 4)Start Stop Count 0x0032 100 100 040 34
( 12)Power Cycle Count 0x0032 100 100 000 33
( 9)Power On Hours 0x0032 099 099 000 1332
Let's see... 54 reallocated sectors in 1332 power-on hours (about 55
days): pretty close to one bad sector PER DAY over the drive's
(very short) lifetime.
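
For anyone who wants to check that estimate, here is a rough sketch of the arithmetic in Python. It assumes the raw value of Power On Hours really is in hours (some drives report other units), and the function name is just made up for illustration:

```python
# Raw values copied from the SMART output for hda quoted in this message.
reallocated_sectors = 54   # attribute 5, Reallocated Sector Ct
power_on_hours = 1332      # attribute 9, Power On Hours (assumed to be hours)


def reallocations_per_day(sectors: int, hours: int) -> float:
    """Average reallocated sectors per 24 hours of powered-on time."""
    if hours <= 0:
        raise ValueError("power-on hours must be positive")
    return sectors / (hours / 24)


power_on_days = power_on_hours / 24
rate = reallocations_per_day(reallocated_sectors, power_on_hours)
print(f"{power_on_days:.1f} days powered on, {rate:.2f} reallocated sectors per day")
# prints: 55.5 days powered on, 0.97 reallocated sectors per day
```

Note this is per day of powered-on time, not per calendar day, so the calendar rate is probably somewhat lower.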
I can only guess that hdb has problems due to a read on hda blocking
access to the ATA bus, and that's what is causing the apparent multiple
failure. That's the only thing I can think of that could cause the
random distribution of errors between hda and hdb.
Oh well, I needed a RAID array anyway.
-----------------------
access:/home/craig# smartctl -v /dev/hda
Vendor Specific SMART Attributes with Thresholds:
Revision Number: 16
Attribute                     Flag    Value  Worst  Threshold  Raw Value
(  1)Raw Read Error Rate      0x000b  200    200    051        0
(  3)Spin Up Time             0x0007  172    161    021        4408
(  4)Start Stop Count         0x0032  100    100    040        34
(  5)Reallocated Sector Ct    0x0033  196    196    140        54
(  7)Seek Error Rate          0x000b  100    253    051        0
(  9)Power On Hours           0x0032  099    099    000        1332
( 10)Spin Retry Count         0x0013  100    253    051        0
( 11)Calibration Retry Count  0x0013  100    253    051        0
( 12)Power Cycle Count        0x0032  100    100    000        33
(196)Reallocated Event Count  0x0032  199    199    000        1
(197)Current Pending Sector   0x0012  200    200    000        0
(198)Offline Uncorrectable    0x0012  200    200    000        0
(199)UDMA CRC Error Count     0x000a  200    253    000        0
(200)Unknown Attribute        0x0009  200    200    051        0
access:/home/craig# smartctl -v /dev/hdb
Vendor Specific SMART Attributes with Thresholds:
Revision Number: 16
Attribute                     Flag    Value  Worst  Threshold  Raw Value
(  1)Raw Read Error Rate      0x000b  200    200    051        0
(  3)Spin Up Time             0x0007  113    100    021        3408
(  4)Start Stop Count         0x0032  100    100    040        64
(  5)Reallocated Sector Ct    0x0033  200    200    140        0
(  7)Seek Error Rate          0x000b  200    200    051        0
(  9)Power On Hours           0x0032  097    097    000        2357
( 10)Spin Retry Count         0x0013  100    253    051        0
( 11)Calibration Retry Count  0x0013  100    253    051        0
( 12)Power Cycle Count        0x0032  100    100    000        62
(196)Reallocated Event Count  0x0032  200    200    000        0
(197)Current Pending Sector   0x0012  200    200    000        0
(198)Offline Uncorrectable    0x0012  200    200    000        0
(199)UDMA CRC Error Count     0x000a  200    253    000        0
(200)Unknown Attribute        0x0009  200    200    051        0
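
If you'd rather not eyeball these tables, something like the following Python sketch could pick out the worrying rows. It assumes the line layout shown above (ID in parentheses, name, hex flag, then four integers), which may differ in other smartctl versions. The `suspicious` rule and the set of watched attribute IDs are my own rule of thumb, not anything smartctl does itself:

```python
import re
from typing import NamedTuple, Optional


class SmartAttr(NamedTuple):
    attr_id: int
    name: str
    flag: int
    value: int
    worst: int
    threshold: int
    raw: int


# Matches lines such as "( 5)Reallocated Sector Ct 0x0033 196 196 140 54".
LINE_RE = re.compile(
    r"^\(\s*(\d+)\)(.+?)\s+(0x[0-9a-fA-F]+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*$"
)

# Hypothetical watch list: attributes where any nonzero raw count is a bad sign.
WATCHED_IDS = {5, 196, 197, 198}


def parse_line(line: str) -> Optional[SmartAttr]:
    """Parse one attribute line; return None for headers and other text."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    attr_id, name, flag, value, worst, threshold, raw = m.groups()
    return SmartAttr(int(attr_id), name.strip(), int(flag, 16),
                     int(value), int(worst), int(threshold), int(raw))


def suspicious(attr: SmartAttr) -> bool:
    """Flag attributes at or below threshold, or watched ones with raw > 0."""
    return attr.value <= attr.threshold or (
        attr.attr_id in WATCHED_IDS and attr.raw > 0
    )


# A few lines taken from the hda output in this message.
sample = """\
Attribute Flag Value Worst Threshold Raw Value
( 5)Reallocated Sector Ct 0x0033 196 196 140 54
( 9)Power On Hours 0x0032 099 099 000 1332
(196)Reallocated Event Count 0x0032 199 199 000 1
(197)Current Pending Sector 0x0012 200 200 000 0
(198)Offline Uncorrectable 0x0012 200 200 000 0
"""

attrs = [a for a in map(parse_line, sample.splitlines()) if a is not None]
flagged = [a.name for a in attrs if suspicious(a)]
print(flagged)
# prints: ['Reallocated Sector Ct', 'Reallocated Event Count']
```

Run against the hdb lines, the same check flags nothing, which fits the theory that hda is the drive that is actually failing.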