[plug] server failing with bizarre disk errors
Craig Ringer
craig at postnewspapers.com.au
Wed Apr 9 19:02:21 WST 2003
Jon: looks like you might be right about the bad disk after all. I
finally thought of running the SMART diagnostics:
(  5)Reallocated Sector Ct    0x0033  196    196    140        54
on hda. That's not death on a HDD but it sure isn't right, given:
(  4)Start Stop Count         0x0032  100    100    040        34
( 12)Power Cycle Count        0x0032  100    100    000        33
(  9)Power On Hours           0x0032  099    099    000        1332
Let's see... 1332 power-on hours is about 55 days, and 54 sectors have
been reallocated, so that's pretty close to one bad sector PER DAY over
the drive's (very short) lifetime.
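
The estimate above can be checked with a few lines of Python. This is only a sketch: the two numbers are copied from the hda output, and it assumes the raw value of attribute 9 really is in hours (the attribute name says so, but raw values are vendor specific).

```python
# Raw values copied from the smartctl output for hda.
reallocated_sectors = 54   # attribute 5, Reallocated Sector Ct
power_on_hours = 1332      # attribute 9, assumed to be counted in hours

# Convert powered-on time to days, then work out the reallocation rate.
power_on_days = power_on_hours / 24
sectors_per_day = reallocated_sectors / power_on_days

print(f"{power_on_days:.1f} days powered on")
print(f"{sectors_per_day:.2f} reallocated sectors per powered-on day")
```

Note this is per day of powered-on time, not per calendar day, so the calendar rate is the same or lower.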
I can only guess that hdb has problems due to a read on hda blocking
access to the ATA bus, and that's what is causing the apparent multiple
failure. That's the only thing I can think of that could cause the
random distribution of errors between hda and hdb.
Oh well, I needed a RAID array anyway.
-----------------------
access:/home/craig# smartctl -v /dev/hda
Vendor Specific SMART Attributes with Thresholds:
Revision Number: 16
Attribute                     Flag    Value  Worst  Threshold  Raw Value
(  1)Raw Read Error Rate      0x000b  200    200    051        0
(  3)Spin Up Time             0x0007  172    161    021        4408
(  4)Start Stop Count         0x0032  100    100    040        34
(  5)Reallocated Sector Ct    0x0033  196    196    140        54
(  7)Seek Error Rate          0x000b  100    253    051        0
(  9)Power On Hours           0x0032  099    099    000        1332
( 10)Spin Retry Count         0x0013  100    253    051        0
( 11)Calibration Retry Count  0x0013  100    253    051        0
( 12)Power Cycle Count        0x0032  100    100    000        33
(196)Reallocated Event Count  0x0032  199    199    000        1
(197)Current Pending Sector   0x0012  200    200    000        0
(198)Offline Uncorrectable    0x0012  200    200    000        0
(199)UDMA CRC Error Count     0x000a  200    253    000        0
(200)Unknown Attribute        0x0009  200    200    051        0
access:/home/craig# smartctl -v /dev/hdb
Vendor Specific SMART Attributes with Thresholds:
Revision Number: 16
Attribute                     Flag    Value  Worst  Threshold  Raw Value
(  1)Raw Read Error Rate      0x000b  200    200    051        0
(  3)Spin Up Time             0x0007  113    100    021        3408
(  4)Start Stop Count         0x0032  100    100    040        64
(  5)Reallocated Sector Ct    0x0033  200    200    140        0
(  7)Seek Error Rate          0x000b  200    200    051        0
(  9)Power On Hours           0x0032  097    097    000        2357
( 10)Spin Retry Count         0x0013  100    253    051        0
( 11)Calibration Retry Count  0x0013  100    253    051        0
( 12)Power Cycle Count        0x0032  100    100    000        62
(196)Reallocated Event Count  0x0032  200    200    000        0
(197)Current Pending Sector   0x0012  200    200    000        0
(198)Offline Uncorrectable    0x0012  200    200    000        0
(199)UDMA CRC Error Count     0x000a  200    253    000        0
(200)Unknown Attribute        0x0009  200    200    051        0
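
For comparing the two drives without eyeballing the tables, here is a rough Python sketch that parses one attribute line of the output above and reports how far the normalized value sits above its threshold. The function name parse_attribute and the regular expression are my own, made up for illustration, and they assume the line layout shown in this post (ID in parentheses, name, hex flag, then four integers); other smartctl versions print things differently.

```python
import re

# Hypothetical pattern for lines like:
#   ( 5)Reallocated Sector Ct 0x0033 196 196 140 54
ATTR_RE = re.compile(
    r"^\(\s*(\d+)\)\s*(.+?)\s+(0x[0-9a-fA-F]+)"
    r"\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*$"
)

def parse_attribute(line):
    """Return a dict for one attribute line, or None if it does not match."""
    m = ATTR_RE.match(line.strip())
    if m is None:
        return None
    attr_id, name, flag, value, worst, threshold, raw = m.groups()
    return {
        "id": int(attr_id),
        "name": name,
        "flag": flag,
        "value": int(value),          # normalized value
        "worst": int(worst),          # worst normalized value seen
        "threshold": int(threshold),  # failure threshold
        "raw": int(raw),              # vendor-specific raw counter
    }

# Attribute 5 lines copied from the hda and hdb output.
hda = parse_attribute("( 5)Reallocated Sector Ct 0x0033 196 196 140 54")
hdb = parse_attribute("( 5)Reallocated Sector Ct 0x0033 200 200 140 0")

for disk, attr in (("hda", hda), ("hdb", hdb)):
    # The drive reports the attribute as failing once the normalized
    # value drops to the threshold, so a shrinking margin is the warning.
    margin = attr["value"] - attr["threshold"]
    print(f"{disk}: {attr['name']} raw={attr['raw']} margin={margin}")
```

By this measure hda is still above its threshold for attribute 5, which fits "not death but not right": the raw count is what looks bad, not the normalized value.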