[plug] Ext3: attempt to access beyond end of device
Craig Ringer
craig at postnewspapers.com.au
Thu May 12 12:57:07 WST 2005
On Thu, 2005-05-12 at 12:17 +0800, Cameron Patrick wrote:
> # smartctl -a /dev/hd(whatever)
> [...]
> ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
>   1 Raw_Read_Error_Rate     0x000f   057   051   006    Pre-fail  Always       -       127719436
>   3 Spin_Up_Time            0x0003   098   096   000    Pre-fail  Always       -       0
>   4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       15
>   5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
>   7 Seek_Error_Rate         0x000f   089   060   030    Pre-fail  Always       -       836514076
>   9 Power_On_Hours          0x0032   093   093   000    Old_age   Always       -       6883
>  10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
>  12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       16
> 194 Temperature_Celsius     0x0022   036   048   000    Old_age   Always       -       36
> 195 Hardware_ECC_Recovered  0x001a   057   051   000    Old_age   Always       -       127719436
> 197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
> 198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
> 199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
> 200 Multi_Zone_Error_Rate   0x0000   100   253   000    Old_age   Offline      -       0
> 202 TA_Increase_Count       0x0032   100   253   000    Old_age   Always       -       0
> [...]
>
> You can also ask smartctl to get the drive to run a self test and
> various other things.
To run the test:
  $ smartctl -t short /dev/hda
To wait until it's finished:
  $ sleep 10m
(or however long smartctl says it'll take), then to look at the results:
  $ smartctl -a /dev/hda
You can also use '-t long' if you find nothing wrong in the short test
but are suspicious, or if you're trying to do a comprehensive job. I
find the short test catches about half to two-thirds of the bad disks
I've run into; the long test has caught every single one.
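If you'd rather script the check than eyeball the full -a dump, something
like the following might do as a starting point. It's only a sketch:
latest_selftest_ok is a name I made up, and it assumes the self-test log
(smartctl -l selftest) lists the newest test first as "# 1" and uses the
"Completed without error" status wording seen in the log excerpt further
down.

```shell
#!/bin/sh
# latest_selftest_ok (hypothetical helper name): reads the output of
# `smartctl -l selftest <device>` on stdin.
# Exit status 0 if the newest log entry says "Completed without error",
# 1 otherwise (including when the log has no entries at all).
# Assumption: the newest entry is the first line that starts with "# <number>".
latest_selftest_ok() {
    awk '
        /^# *[0-9]+ / {
            found = 1
            ok = ($0 ~ /Completed without error/) ? 1 : 0
            exit            # only the newest entry matters; jump to END
        }
        END { exit ((found && ok) ? 0 : 1) }
    '
}

# Possible usage (needs root; /dev/hda is just the example device):
#   smartctl -t short /dev/hda
#   sleep 10m
#   smartctl -l selftest /dev/hda | latest_selftest_ok || echo "look closer"
```

A test that is still running, or one that was aborted, also counts as "not
OK" here, which errs on the side of making you look at the log yourself.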
> One of the important fields here is
> Reallocated_Sector_Ct. If it's non-zero, you have a drive with bad
> sectors and might want to contemplate buying a new one.
Other critical variables are:
  - Temperature_Celsius, if shown. You'll usually want the raw value.
    Very high == bad.
  - Offline_Uncorrectable (raw value). A non-zero value here pretty
    much means it's bin time for the disk in my experience, as every
    disk I've had with a non-zero value here has been SERIOUSLY
    failing.
  - UDMA_CRC_Error_Count (raw value). I've seen this go through
    the roof with bad cables and in one case a bad controller.
  - Hardware_ECC_Recovered (raw value). This often seems to get very
    high when the disk is going bad, especially due to heat.
  - Spin_Retry_Count (raw value). A key indication that the drive
    motor is failing is if this is non-zero (though 1 or 2 might be
    OK because of bad power, etc).
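The raw-value checks in the list above can be roughed out as a small filter
over `smartctl -A' output. This is a sketch, not a verdict on disk health:
flag_smart_attrs is a made-up name, the 50 degree default is an arbitrary
assumption (the list only says "very high"), and Hardware_ECC_Recovered is
left out because there's no obvious cut-off for it. It also assumes the raw
value is the tenth whitespace-separated field, as in the tables in this
thread.

```shell
#!/bin/sh
# flag_smart_attrs (hypothetical helper name): reads `smartctl -A <device>`
# output on stdin and prints one WARN line per suspicious attribute.
# $1 = temperature limit in degrees C (assumed default: 50).
flag_smart_attrs() {
    max_temp="${1:-50}"
    awk -v max_temp="$max_temp" '
        # Counters where any non-zero raw value deserves a closer look.
        # (A Spin_Retry_Count of 1 or 2 may still be harmless.)
        $2 ~ /^(Reallocated_Sector_Ct|Offline_Uncorrectable|UDMA_CRC_Error_Count|Spin_Retry_Count)$/ {
            if ($10 + 0 > 0) print "WARN " $2 " raw=" $10
        }
        # Temperature: compare the raw value against the chosen limit.
        $2 == "Temperature_Celsius" && $10 + 0 > max_temp + 0 {
            print "WARN " $2 " raw=" $10
        }
    '
}

# Possible usage (needs root; /dev/hda is just the example device):
#   smartctl -A /dev/hda | flag_smart_attrs 45
```

Treat a WARN line as a prompt to investigate (cables, cooling, a long
self-test), not as proof on its own that the disk is dead.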
I find `smartctl -H' to be essentially useless. At least with Western
Digital disks, it often reports PASSED on disks that are so utterly
screwed that even the partition table can't be read and that can't even
spin up half the time. Seagate disks seem a little more honest about
warning you when they think they might be dying, and I haven't had a
Maxtor die on me yet so I can't comment on those.
Here's the vendor table SMART dump from my two (AFAIK; haven't run tests
recently) healthy desktop disks. The output is in original form,
unwrapped, so if your mail client doesn't soft-wrap text client-side it
should look fine.
=== START OF INFORMATION SECTION ===
Device Model: Maxtor 6Y120P0
Serial Number: Y41GMYVE
Firmware Version: YAR41BW0
User Capacity: 122,942,324,736 bytes
Device is: In smartctl database [for details use: -P show]
ATA Version is: 7
ATA Standard is: ATA/ATAPI-7 T13 1532D revision 0
Local Time is: Thu May 12 12:48:42 2005 WST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
[...]
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  3 Spin_Up_Time            0x0027   202   191   063    Pre-fail  Always       -       12632
  4 Start_Stop_Count        0x0032   253   253   000    Old_age   Always       -       760
  5 Reallocated_Sector_Ct   0x0033   253   253   063    Pre-fail  Always       -       0
  6 Read_Channel_Margin     0x0001   253   253   100    Pre-fail  Offline      -       0
  7 Seek_Error_Rate         0x000a   253   252   000    Old_age   Always       -       0
  8 Seek_Time_Performance   0x0027   252   246   187    Pre-fail  Always       -       40822
  9 Power_On_Minutes        0x0032   240   240   000    Old_age   Always       -       467h+36m
 10 Spin_Retry_Count        0x002b   253   252   157    Pre-fail  Always       -       0
 11 Calibration_Retry_Count 0x002b   253   252   223    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   251   251   000    Old_age   Always       -       1181
192 Power-Off_Retract_Count 0x0032   253   253   000    Old_age   Always       -       0
193 Load_Cycle_Count        0x0032   253   253   000    Old_age   Always       -       0
194 Temperature_Celsius     0x0032   253   253   000    Old_age   Always       -       30
195 Hardware_ECC_Recovered  0x000a   253   252   000    Old_age   Always       -       1331
196 Reallocated_Event_Count 0x0008   253   253   000    Old_age   Offline      -       0
197 Current_Pending_Sector  0x0008   253   253   000    Old_age   Offline      -       0
198 Offline_Uncorrectable   0x0008   253   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0008   199   199   000    Old_age   Offline      -       0
200 Multi_Zone_Error_Rate   0x000a   253   252   000    Old_age   Always       -       0
201 Soft_Read_Error_Rate    0x000a   253   252   000    Old_age   Always       -       1
202 TA_Increase_Count       0x000a   253   252   000    Old_age   Always       -       0
203 Run_Out_Cancel          0x000b   253   252   180    Pre-fail  Always       -       0
204 Shock_Count_Write_Opern 0x000a   253   252   000    Old_age   Always       -       0
205 Shock_Rate_Write_Opern  0x000a   253   252   000    Old_age   Always       -       0
207 Spin_High_Current       0x002a   253   252   000    Old_age   Always       -       0
208 Spin_Buzz               0x002a   253   252   000    Old_age   Always       -       0
209 Offline_Seek_Performnce 0x0024   196   195   000    Old_age   Offline      -       0
 99 Unknown_Attribute       0x0004   253   253   000    Old_age   Offline      -       0
100 Unknown_Attribute       0x0004   253   253   000    Old_age   Offline      -       0
101 Unknown_Attribute       0x0004   253   253   000    Old_age   Offline      -       0
[...]
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed without error 00% 1227 -
=== START OF INFORMATION SECTION ===
Device Model: WDC WD1200JB-75CRA0
Serial Number: WD-WMA8C2759841
Firmware Version: 16.06V16
User Capacity: 120,000,000,000 bytes
Device is: In smartctl database [for details use: -P show]
ATA Version is: 5
ATA Standard is: Exact ATA specification draft version not indicated
Local Time is: Thu May 12 12:54:30 2005 WST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
[...]
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0007   102   095   021    Pre-fail  Always       -       5658
  4 Start_Stop_Count        0x0032   099   099   040    Old_age   Always       -       1588
  5 Reallocated_Sector_Ct   0x0033   199   199   140    Pre-fail  Always       -       5
  7 Seek_Error_Rate         0x000b   100   253   051    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   086   086   000    Old_age   Always       -       10716
 10 Spin_Retry_Count        0x0013   100   100   051    Pre-fail  Always       -       0
 11 Calibration_Retry_Count 0x0013   100   100   051    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   099   099   000    Old_age   Always       -       1512
196 Reallocated_Event_Count 0x0032   196   196   000    Old_age   Always       -       4
197 Current_Pending_Sector  0x0012   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0012   200   200   000    Old_age   Always       -       0
199 UDMA_CRC_Error_Count    0x000a   200   253   000    Old_age   Always       -       4
200 Multi_Zone_Error_Rate   0x0009   200   200   051    Pre-fail  Offline      -       0
--
Craig Ringer