Offline uncorrectable sectors on hard drive
Patrick Proniewski
patpro at patpro.net
Sun Oct 4 08:45:15 UTC 2009
Hi all,
smartd on my server reported an error yesterday:
> The following warning/error was logged by the smartd daemon:
>
> Device: /dev/ad6, 1 Offline uncorrectable sectors
/var/log/messages:
> Oct 3 04:04:19 rack smartd[739]: Device: /dev/ad6, 1 Offline uncorrectable sectors
> Oct 3 04:04:19 rack smartd[739]: Device: /dev/ad6, Self-Test Log error count increased from 0 to 1
../..
> Oct 4 01:34:19 rack smartd[739]: Device: /dev/ad6, 1 Offline uncorrectable sectors
> Oct 4 02:04:19 rack smartd[739]: Device: /dev/ad6, 1 Offline uncorrectable sectors
The first error was logged on Oct 3 at 04:04:19 and the last one on
Oct 4 at 02:04:19. Since then, no further errors have appeared in the logs.
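
To double-check that window, here is a minimal Python sketch that pulls the first and last "Offline uncorrectable" timestamps out of the log lines quoted above. The function name warning_times is only illustrative, and the year is an assumption (syslog does not record it), taken from the date of this message.

```python
from datetime import datetime

# smartd lines from /var/log/messages, with the mail wrapping undone.
LOG = """\
Oct 3 04:04:19 rack smartd[739]: Device: /dev/ad6, 1 Offline uncorrectable sectors
Oct 3 04:04:19 rack smartd[739]: Device: /dev/ad6, Self-Test Log error count increased from 0 to 1
Oct 4 01:34:19 rack smartd[739]: Device: /dev/ad6, 1 Offline uncorrectable sectors
Oct 4 02:04:19 rack smartd[739]: Device: /dev/ad6, 1 Offline uncorrectable sectors
"""


def warning_times(text, needle="Offline uncorrectable", year=2009):
    """Return the timestamps of all log lines that contain `needle`.

    syslog timestamps carry no year, so it is supplied by the caller
    (2009 is an assumption based on the date of this message).
    """
    times = []
    for line in text.splitlines():
        if needle not in line:
            continue
        # split() also copes with the double space syslog uses for
        # single-digit days ("Oct  3").
        month, day, clock = line.split()[:3]
        stamp = f"{year} {month} {day} {clock}"
        times.append(datetime.strptime(stamp, "%Y %b %d %H:%M:%S"))
    return times


times = warning_times(LOG)
print("first:", min(times))
print("last: ", max(times))
print("span: ", max(times) - min(times))
```

Only the lines shown in this mail are included; the entries elided by "../.." are not reconstructed.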
So my questions are:
- Is an "Offline uncorrectable sector" in fact correctable? I've
found howtos for extfs and reiserfs that fix this kind of error by
remapping bad sectors, but nothing for UFS.
- The error appears to have disappeared. Does that mean the hard drive
(or the fs) did the remapping by itself?
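
The howtos for other filesystems start from the failing LBA. The extended self-test in the smartctl output in this mail reports LBA_of_first_error 201626851, so as a first step here is a small Python sketch that turns it into a byte offset and checks it against the reported User Capacity. The 512-byte sector size is an assumption, and the helper names are only illustrative. Mapping the offset to a slice, a partition and a UFS block is the part I have found no howto for, so it is not attempted here.

```python
SECTOR_SIZE = 512  # bytes per sector; an assumption, not reported by smartctl here


def lba_to_offset(lba, sector_size=SECTOR_SIZE):
    """Byte offset of the first byte of sector `lba` from the start of the disk."""
    return lba * sector_size


def lba_in_range(lba, capacity_bytes, sector_size=SECTOR_SIZE):
    """True if `lba` addresses a sector inside a disk of `capacity_bytes`."""
    return 0 <= lba < capacity_bytes // sector_size


failing_lba = 201626851      # LBA_of_first_error, self-test log entry # 2
capacity = 203_928_109_056   # "User Capacity" from the information section

offset = lba_to_offset(failing_lba)
print("total sectors:", capacity // SECTOR_SIZE)
print("byte offset:  ", offset)   # 103232947712
print("inside disk:  ", lba_in_range(failing_lba, capacity))
```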
Here is the smartctl -a output for this device:
> # smartctl -a /dev/ad6
> smartctl version 5.38 [i386-portbld-freebsd6.4] Copyright (C) 2002-8 Bruce Allen
> Home page is http://smartmontools.sourceforge.net/
>
> === START OF INFORMATION SECTION ===
> Model Family: Maxtor DiamondMax 10 family (ATA/133 and SATA/150)
> Device Model: Maxtor 6L200M0
> Serial Number: L404EDDH
> Firmware Version: BANC1E00
> User Capacity: 203,928,109,056 bytes
> Device is: In smartctl database [for details use: -P show]
> ATA Version is: 7
> ATA Standard is: ATA/ATAPI-7 T13 1532D revision 0
> Local Time is: Sun Oct 4 10:23:02 2009 CEST
> SMART support is: Available - device has SMART capability.
> SMART support is: Enabled
>
> Warning! SMART Attribute Thresholds Structure error: invalid SMART checksum.
> === START OF READ SMART DATA SECTION ===
> SMART overall-health self-assessment test result: PASSED
>
> General SMART Values:
> Offline data collection status:  (0x02) Offline data collection activity
>                                         was completed without error.
>                                         Auto Offline Data Collection: Disabled.
> Self-test execution status:      (   0) The previous self-test routine completed
>                                         without error or no self-test has ever
>                                         been run.
> Total time to complete Offline
> data collection:                 (1562) seconds.
> Offline data collection
> capabilities:                    (0x5b) SMART execute Offline immediate.
>                                         Auto Offline data collection on/off support.
>                                         Suspend Offline collection upon new
>                                         command.
>                                         Offline surface scan supported.
>                                         Self-test supported.
>                                         No Conveyance Self-test supported.
>                                         Selective Self-test supported.
> SMART capabilities:            (0x0003) Saves SMART data before entering
>                                         power-saving mode.
>                                         Supports SMART auto save timer.
> Error logging capability:        (0x01) Error logging supported.
>                                         General Purpose Logging supported.
> Short self-test routine
> recommended polling time:        (   2) minutes.
> Extended self-test routine
> recommended polling time:        (  81) minutes.
> SCT capabilities:              (0x0021) SCT Status supported.
>                                         SCT Data Table supported.
>
> SMART Attributes Data Structure revision number: 16
> Vendor Specific SMART Attributes with Thresholds:
> ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
>   3 Spin_Up_Time            0x0027   252   252   063    Pre-fail  Always   -           3148
>   4 Start_Stop_Count        0x0032   253   253   000    Old_age   Always   -           17
>   5 Reallocated_Sector_Ct   0x0033   253   253   063    Pre-fail  Always   -           0
>   6 Read_Channel_Margin     0x0001   253   253   100    Pre-fail  Offline  -           0
>   7 Seek_Error_Rate         0x000a   253   252   000    Old_age   Always   -           0
>   8 Seek_Time_Performance   0x0027   250   234   187    Pre-fail  Always   -           51522
>   9 Power_On_Minutes        0x0032   154   154   000    Old_age   Always   -           487h+21m
>  10 Spin_Retry_Count        0x002b   252   252   157    Pre-fail  Always   -           0
>  11 Calibration_Retry_Count 0x002b   253   252   223    Pre-fail  Always   -           0
>  12 Power_Cycle_Count       0x0032   253   253   000    Old_age   Always   -           20
> 192 Power-Off_Retract_Count 0x0032   253   253   000    Old_age   Always   -           0
> 193 Load_Cycle_Count        0x0032   253   253   000    Old_age   Always   -           0
> 194 Temperature_Celsius     0x0032   023   253   000    Old_age   Always   -           26
> 195 Hardware_ECC_Recovered  0x000a   253   252   000    Old_age   Always   -           7879
> 196 Reallocated_Event_Count 0x0008   251   251   000    Old_age   Offline  -           2
> 197 Current_Pending_Sector  0x0008   253   253   000    Old_age   Offline  -           0
> 198 Offline_Uncorrectable   0x0008   253   252   000    Old_age   Offline  -           0
> 199 UDMA_CRC_Error_Count    0x0008   199   199   000    Old_age   Offline  -           0
> 200 Multi_Zone_Error_Rate   0x000a   253   252   000    Old_age   Always   -           0
> 201 Soft_Read_Error_Rate    0x000a   253   252   000    Old_age   Always   -           0
> 202 TA_Increase_Count       0x000a   253   252   000    Old_age   Always   -           0
> 203 Run_Out_Cancel          0x000b   253   252   180    Pre-fail  Always   -           0
> 204 Shock_Count_Write_Opern 0x000a   253   252   000    Old_age   Always   -           0
> 205 Shock_Rate_Write_Opern  0x000a   253   252   000    Old_age   Always   -           0
> 207 Spin_High_Current       0x002a   252   252   000    Old_age   Always   -           0
> 208 Spin_Buzz               0x002a   252   252   000    Old_age   Always   -           0
> 209 Offline_Seek_Performnce 0x0024   239   239   000    Old_age   Offline  -           170
> 210 Unknown_Attribute       0x0032   253   252   000    Old_age   Always   -           0
> 211 Unknown_Attribute       0x0032   253   252   000    Old_age   Always   -           0
> 212 Unknown_Attribute       0x0032   253   252   000    Old_age   Always   -           0
>
> Warning! SMART ATA Error Log Structure error: invalid SMART checksum.
> SMART Error Log Version: 1
> No Errors Logged
>
> SMART Self-test log structure revision number 1
> Num  Test_Description    Status                    Remaining  LifeTime(hours)  LBA_of_first_error
> # 1  Short offline       Completed without error       00%        34339         -
> # 2  Extended offline    Completed: read failure       20%        34317         201626851
> # 3  Short offline       Completed without error       00%        34315         -
> # 4  Short offline       Completed without error       00%        34291         -
> # 5  Short offline       Completed without error       00%        34267         -
> # 6  Short offline       Completed without error       00%        34243         -
> # 7  Short offline       Completed without error       00%        34220         -
> # 8  Short offline       Completed without error       00%        34196         -
> # 9  Short offline       Completed without error       00%        34172         -
> #10  Extended offline    Completed without error       00%        34150         -
> #11  Short offline       Completed without error       00%        34148         -
> #12  Short offline       Completed without error       00%        34124         -
> #13  Short offline       Completed without error       00%        34100         -
> #14  Short offline       Completed without error       00%        34076         -
> #15  Short offline       Completed without error       00%        34053         -
> #16  Short offline       Completed without error       00%        34029         -
> #17  Short offline       Completed without error       00%        34005         -
> #18  Extended offline    Completed without error       00%        33983         -
> #19  Short offline       Completed without error       00%        33981         -
> #20  Short offline       Completed without error       00%        33957         -
> #21  Short offline       Completed without error       00%        33933         -
>
> SMART Selective self-test log data structure revision number 1
> SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
> 1 0 0 Not_testing
> 2 0 0 Not_testing
> 3 0 0 Not_testing
> 4 0 0 Not_testing
> 5 0 0 Not_testing
> Selective self-test flags (0x0):
> After scanning selected spans, do NOT read-scan remainder of disk.
> If Selective self-test is pending on power-up, resume after 0 minute delay.
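
For the second question, the counters that seem relevant are attributes 5, 196, 197 and 198. Below is a rough Python sketch that extracts them from the attribute table above, with the mail wrapping undone. The function name parse_attributes and the dictionary layout are only illustrative, and the sketch just reads the numbers; it does not decide whether a remap actually took place.

```python
# Sector-related rows copied from the attribute table above, one row per line.
ROWS = """\
  5 Reallocated_Sector_Ct   0x0033 253 253 063 Pre-fail Always  - 0
196 Reallocated_Event_Count 0x0008 251 251 000 Old_age  Offline - 2
197 Current_Pending_Sector  0x0008 253 253 000 Old_age  Offline - 0
198 Offline_Uncorrectable   0x0008 253 252 000 Old_age  Offline - 0
"""


def parse_attributes(text):
    """Parse `smartctl -a` attribute rows into a dict keyed by attribute name."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # A data row has ten columns and starts with the numeric ID;
        # anything else (header, blank line) is skipped.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        attrs[fields[1]] = {
            "id": int(fields[0]),
            "value": int(fields[3]),
            "worst": int(fields[4]),
            "thresh": int(fields[5]),
            # Raw values are kept as text: some are not plain integers
            # (e.g. "487h+21m" for Power_On_Minutes).
            "raw": fields[9],
        }
    return attrs


attrs = parse_attributes(ROWS)
for name, a in attrs.items():
    print(f"{a['id']:>3} {name:<24} value={a['value']} worst={a['worst']} raw={a['raw']}")
```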
thanks,
patpro