ZFS raidz2, errors in file?
Dennis Glatting
freebsd at pki2.com
Thu Oct 18 01:28:10 UTC 2012
On Wed, 2012-10-17 at 20:39 +0300, Heikki Suonsivu wrote:
> Summary: This could indicate that in some situations ZFS or FreeBSD
> ignore disk errors in some situations, or disks do not pass them to
> FreeBSD.
>
> There are three SCSI timeouts all on different disks reported, though it
> seems unlikely they hit all three redundancies of a single block of
> single file, so the more likely scenario would be that there are other
> errors causing this, which are not reported or are ignored. Also,
> timeouts should trigger retries.
>
> SMART data indicates problems on two other disks, but no indication of
> those are seen in logs (the disks work, but SMART information indicates
> problems). The fs is 8*3T pool on raidz2.
>
I can't say about your disks but I had nine 4TB Hitachi disks suffering
a 30% failure rate in their first year and consistently, and randomly,
being marked off-line by ZFS in a RAIDz2 pool. I eventually stripped
those disks from that role, replacing them with 3TB Seagates (the
Hitachies appear to work find in a ZFS mirror).
I haven't experienced the same failure since I replaced the Hitachies
with Seagate.
> Further data below:
>
> ------------------------------------------------------------------------
> zpool status -v
> pool: raid6
> state: ONLINE
> status: One or more devices has experienced an error resulting in data
> corruption. Applications may be affected.
> action: Restore the file in question if possible. Otherwise restore the
> entire pool from backup.
> see: http://www.sun.com/msg/ZFS-8000-8A
> scan: none requested
> config:
>
> NAME STATE READ WRITE CKSUM
> raid6 ONLINE 0 0 0
> raidz2-0 ONLINE 0 0 0
> da0 ONLINE 0 0 0
> da1 ONLINE 0 0 0
> da2 ONLINE 0 0 0
> da3 ONLINE 0 0 0
> da4 ONLINE 0 0 0
> da5 ONLINE 0 0 0
> da6 ONLINE 0 0 0
> da7 ONLINE 0 0 0
>
> errors: Permanent errors have been detected in the following files:
>
> /raid6/backup/rsnapshot/daily.1/foobar.net/users/mail/foo
> ------------------------------------------------------------------------
>
> All disks online, so this is kind of odd.
>
> One disk indeed has pending sector, not unusual and should be survivable:
>
> ------------------------------------------------------------------------
> ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE
> UPDATED WHEN_FAILED RAW_VALUE
> 1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail
> Always - 0
> 3 Spin_Up_Time 0x0027 162 154 021 Pre-fail
> Always - 8891
> 4 Start_Stop_Count 0x0032 100 100 000 Old_age
> Always - 46
> 5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail
> Always - 0
> 7 Seek_Error_Rate 0x002e 100 253 000 Old_age
> Always - 0
> 9 Power_On_Hours 0x0032 094 094 000 Old_age
> Always - 4776
> 10 Spin_Retry_Count 0x0032 100 253 000 Old_age
> Always - 0
> 11 Calibration_Retry_Count 0x0032 100 253 000 Old_age
> Always - 0
> 12 Power_Cycle_Count 0x0032 100 100 000 Old_age
> Always - 43
> 192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age Always
> - 40
> 193 Load_Cycle_Count 0x0032 196 196 000 Old_age Always
> - 13595
> 194 Temperature_Celsius 0x0022 119 106 000 Old_age Always
> - 33
> 196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always
> - 0
> 197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always
> - 1
> 198 Offline_Uncorrectable 0x0030 200 200 000 Old_age
> Offline - 1
> 199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always
> - 0
> 200 Multi_Zone_Error_Rate 0x0008 200 200 000 Old_age
> Offline - 1
>
> SMART Error Log Version: 1
> No Errors Logged
>
> SMART Self-test log structure revision number 1
> No self-tests have been logged. [To run self-tests, use: smartctl -t]
> ------------------------------------------------------------------------
>
> But there is no corresponding read error in /var/log/messages, which
> would indicate that FreeBSD never saw the error, or ignored it.
>
> In addition, there seems to be ICRC DMA errors on da0. Looks nasty, but
> only show up in SMART log, not in /var/log/messages.
>
> ------------------------------------------------------------------------
> smartctl 5.43 2012-06-30 r3573 [FreeBSD 9.0-RELEASE-p3 amd64] (local build)
> Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net
>
> === START OF INFORMATION SECTION ===
> Model Family: Hitachi Deskstar 7K3000
> Device Model: Hitachi HDS723030ALA640
> Serial Number: MK0331YHGVDK6A
> LU WWN Device Id: 5 000cca 225cc00c7
> Firmware Version: MKAOA5C0
> User Capacity: 3,000,592,982,016 bytes [3.00 TB]
> Sector Size: 512 bytes logical/physical
> Device is: In smartctl database [for details use: -P show]
> ATA Version is: 8
> ATA Standard is: ATA-8-ACS revision 4
> Local Time is: Wed Oct 17 20:03:05 2012 EEST
> SMART support is: Available - device has SMART capability.
> SMART support is: Enabled
>
> === START OF READ SMART DATA SECTION ===
> SMART overall-health self-assessment test result: PASSED
>
> General SMART Values:
> Offline data collection status: (0x84) Offline data collection activity
> was suspended by an
> interrupting command from host.
> Auto Offline Data Collection:
> Enabled.
> Self-test execution status: ( 0) The previous self-test routine
> completed
> without error or no self-test
> has ever
> been run.
> Total time to complete Offline
> data collection: (29509) seconds.
> Offline data collection
> capabilities: (0x5b) SMART execute Offline immediate.
> Auto Offline data collection
> on/off support.
> Suspend Offline collection upon new
> command.
> Offline surface scan supported.
> Self-test supported.
> No Conveyance Self-test supported.
> Selective Self-test supported.
> SMART capabilities: (0x0003) Saves SMART data before entering
> power-saving mode.
> Supports SMART auto save timer.
> Error logging capability: (0x01) Error logging supported.
> General Purpose Logging supported.
> Short self-test routine
> recommended polling time: ( 1) minutes.
> Extended self-test routine
> recommended polling time: ( 492) minutes.
> SCT capabilities: (0x003d) SCT Status supported.
> SCT Error Recovery Control
> supported.
> SCT Feature Control supported.
> SCT Data Table supported.
>
> SMART Attributes Data Structure revision number: 16
> Vendor Specific SMART Attributes with Thresholds:
> ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE
> UPDATED WHEN_FAILED RAW_VALUE
> 1 Raw_Read_Error_Rate 0x000b 100 100 016 Pre-fail
> Always - 0
> 2 Throughput_Performance 0x0005 136 136 054 Pre-fail
> Offline - 83
> 3 Spin_Up_Time 0x0007 125 125 024 Pre-fail
> Always - 617 (Average 617)
> 4 Start_Stop_Count 0x0012 100 100 000 Old_age
> Always - 62
> 5 Reallocated_Sector_Ct 0x0033 100 100 005 Pre-fail
> Always - 0
> 7 Seek_Error_Rate 0x000b 100 100 067 Pre-fail
> Always - 0
> 8 Seek_Time_Performance 0x0005 135 135 020 Pre-fail
> Offline - 26
> 9 Power_On_Hours 0x0012 100 100 000 Old_age
> Always - 4570
> 10 Spin_Retry_Count 0x0013 100 100 060 Pre-fail
> Always - 0
> 12 Power_Cycle_Count 0x0032 100 100 000 Old_age
> Always - 30
> 192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always
> - 129
> 193 Load_Cycle_Count 0x0012 100 100 000 Old_age Always
> - 129
> 194 Temperature_Celsius 0x0002 166 166 000 Old_age Always
> - 36 (Min/Max 25/50)
> 196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always
> - 0
> 197 Current_Pending_Sector 0x0022 100 100 000 Old_age Always
> - 0
> 198 Offline_Uncorrectable 0x0008 100 100 000 Old_age
> Offline - 0
> 199 UDMA_CRC_Error_Count 0x000a 200 200 000 Old_age Always
> - 112
>
> SMART Error Log Version: 1
> ATA Error Count: 112 (device log contains only the most recent five errors)
> CR = Command Register [HEX]
> FR = Features Register [HEX]
> SC = Sector Count Register [HEX]
> SN = Sector Number Register [HEX]
> CL = Cylinder Low Register [HEX]
> CH = Cylinder High Register [HEX]
> DH = Device/Head Register [HEX]
> DC = Device Command Register [HEX]
> ER = Error register [HEX]
> ST = Status register [HEX]
> Powered_Up_Time is measured from power on, and printed as
> DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
> SS=sec, and sss=millisec. It "wraps" after 49.710 days.
>
> Error 112 occurred at disk power-on lifetime: 52 hours (2 days + 4 hours)
> When the command that caused the error occurred, the device was
> active or idle.
>
> After command completion occurred, registers were:
> ER ST SC SN CL CH DH
> -- -- -- -- -- -- --
> 84 51 20 28 c5 ae 06 Error: ICRC, ABRT 32 sectors at LBA =
> 0x06aec528 = 112117032
>
> Commands leading to the command that caused the error were:
> CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
> -- -- -- -- -- -- -- -- ---------------- --------------------
> 35 03 f0 58 c4 ae 40 00 2d+04:00:39.416 WRITE DMA EXT
> 35 03 20 38 c4 ae 40 00 2d+04:00:39.409 WRITE DMA EXT
> 35 03 f0 48 c3 ae 40 00 2d+04:00:39.402 WRITE DMA EXT
> 35 03 f0 58 c2 ae 40 00 2d+04:00:39.394 WRITE DMA EXT
> 35 03 60 f8 c1 ae 40 00 2d+04:00:39.388 WRITE DMA EXT
>
> Error 111 occurred at disk power-on lifetime: 52 hours (2 days + 4 hours)
> When the command that caused the error occurred, the device was
> active or idle.
>
> After command completion occurred, registers were:
> ER ST SC SN CL CH DH
> -- -- -- -- -- -- --
> 84 51 30 08 c4 90 06 Error: ICRC, ABRT 48 sectors at LBA =
> 0x0690c408 = 110150664
>
> Commands leading to the command that caused the error were:
> CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
> -- -- -- -- -- -- -- -- ---------------- --------------------
> 35 03 f0 48 c3 90 40 00 2d+03:59:29.925 WRITE DMA EXT
> 35 03 f0 58 c2 90 40 00 2d+03:59:29.917 WRITE DMA EXT
> 35 03 60 f8 c1 90 40 00 2d+03:59:29.910 WRITE DMA EXT
> 35 03 b0 48 c1 90 40 00 2d+03:59:29.907 WRITE DMA EXT
> 35 03 f0 58 c0 90 40 00 2d+03:59:29.899 WRITE DMA EXT
>
> Error 110 occurred at disk power-on lifetime: 52 hours (2 days + 4 hours)
> When the command that caused the error occurred, the device was
> active or idle.
>
> After command completion occurred, registers were:
> ER ST SC SN CL CH DH
> -- -- -- -- -- -- --
> 84 51 20 28 47 90 05 Error: ICRC, ABRT 32 sectors at LBA =
> 0x05904728 = 93341480
>
> Commands leading to the command that caused the error were:
> CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
> -- -- -- -- -- -- -- -- ---------------- --------------------
> 35 da f0 58 46 90 40 00 2d+03:49:28.919 WRITE DMA EXT
> 35 da 20 38 46 90 40 00 2d+03:49:28.917 WRITE DMA EXT
> 35 da f0 48 45 90 40 00 2d+03:49:28.909 WRITE DMA EXT
> 35 da f0 58 44 90 40 00 2d+03:49:28.901 WRITE DMA EXT
> 35 da 20 38 44 90 40 00 2d+03:49:28.899 WRITE DMA EXT
>
> Error 109 occurred at disk power-on lifetime: 52 hours (2 days + 4 hours)
> When the command that caused the error occurred, the device was
> active or idle.
>
> After command completion occurred, registers were:
> ER ST SC SN CL CH DH
> -- -- -- -- -- -- --
> 84 51 10 38 e7 41 05 Error: ICRC, ABRT 16 sectors at LBA =
> 0x0541e738 = 88205112
>
> Commands leading to the command that caused the error were:
> CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
> -- -- -- -- -- -- -- -- ---------------- --------------------
> 35 03 f0 58 e6 41 40 00 2d+03:46:20.939 WRITE DMA EXT
> 35 03 20 38 e6 41 40 00 2d+03:46:20.933 WRITE DMA EXT
> 35 03 f0 48 e5 41 40 00 2d+03:46:20.926 WRITE DMA EXT
> 35 03 f0 58 e4 41 40 00 2d+03:46:20.921 WRITE DMA EXT
> 35 03 20 38 e4 41 40 00 2d+03:46:20.916 WRITE DMA EXT
>
> Error 108 occurred at disk power-on lifetime: 52 hours (2 days + 4 hours)
> When the command that caused the error occurred, the device was
> active or idle.
>
> After command completion occurred, registers were:
> ER ST SC SN CL CH DH
> -- -- -- -- -- -- --
> 84 51 10 28 0a 2f 05 Error: ICRC, ABRT 16 sectors at LBA =
> 0x052f0a28 = 86968872
>
> Commands leading to the command that caused the error were:
> CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
> -- -- -- -- -- -- -- -- ---------------- --------------------
> 35 03 f0 48 09 2f 40 00 2d+03:45:06.250 WRITE DMA EXT
> 35 03 f0 58 08 2f 40 00 2d+03:45:06.242 WRITE DMA EXT
> 35 03 20 38 08 2f 40 00 2d+03:45:06.238 WRITE DMA EXT
> 35 03 f0 48 07 2f 40 00 2d+03:45:06.231 WRITE DMA EXT
> 35 03 08 40 bc 25 40 00 2d+03:45:06.225 WRITE DMA EXT
>
> SMART Self-test log structure revision number 1
> No self-tests have been logged. [To run self-tests, use: smartctl -t]
>
>
> SMART Selective self-test log data structure revision number 1
> SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
> 1 0 0 Not_testing
> 2 0 0 Not_testing
> 3 0 0 Not_testing
> 4 0 0 Not_testing
> 5 0 0 Not_testing
> Selective self-test flags (0x0):
> After scanning selected spans, do NOT read-scan remainder of disk.
> If Selective self-test is pending on power-up, resume after 0 minute delay.
> ------------------------------------------------------------------------
>
> These only show up in disk's SMART log, not as errors in other logs
> (logs exists for all the history of the system). These do not seem to
> have passed to the FreeBSD, or FreeBSD has ignored the errors.
>
> From logs, I can find da3, da4, da6 all have reported scsi timeout once
> (different times, within one month in July.
>
> They all look like this, one case for each of the disks:
>
> ------------------------------------------------------------------------
> Jul 24 12:16:55 tila kernel: (da6:mps0:0:6:0): SCSI command timeout on
> device handle 0x000d SMID 699
> Jul 24 12:16:55 tila kernel: (da6:mps0:0:6:0): SCSI command timeout on
> device handle 0x000d SMID 419
> Jul 24 12:16:55 tila kernel: (da6:mps0:0:6:0): SCSI command timeout on
> device handle 0x000d SMID 359
> Jul 24 12:16:55 tila kernel: (da6:mps0:0:6:0): SCSI command timeout on
> device handle 0x000d SMID 472
> Jul 24 12:16:55 tila kernel: (da6:mps0:0:6:0): SCSI command timeout on
> device handle 0x000d SMID 124
> Jul 24 12:16:55 tila kernel: (da6:mps0:0:6:0): SCSI command timeout on
> device handle 0x000d SMID 762
> Jul 24 12:16:55 tila kernel: (da6:mps0:0:6:0): SCSI command timeout on
> device handle 0x000d SMID 243
> Jul 24 12:16:55 tila kernel: (da6:mps0:0:6:0): SCSI command timeout on
> device handle 0x000d SMID 890
> Jul 24 12:16:55 tila kernel: (da6:mps0:0:6:0): SCSI command timeout on
> device handle 0x000d SMID 204
> Jul 24 12:16:55 tila kernel: (da6:mps0:0:6:0): SCSI command timeout on
> device handle 0x000d SMID 2
> Jul 24 12:16:55 tila kernel: (da6:mps0:0:6:0): SCSI command timeout on
> device handle 0x000d SMID 291
> Jul 24 12:16:55 tila kernel: mps0: (0:6:0) terminated ioc 804b scsi 0
> state c xfer 0
> Jul 24 12:16:55 tila kernel: mps0: mpssas_abort_complete: abort request
> on handle 0x0d SMID 699 complete
> Jul 24 12:16:55 tila kernel: mps0: mpssas_complete_tm_request: sending
> deferred task management request for handle 0x0d SMID 419
> Jul 24 12:16:55 tila kernel: mps0: mpssas_abort_complete: abort request
> on handle 0x0d SMID 419 complete
> Jul 24 12:16:55 tila kernel: mps0: mpssas_complete_tm_request: sending
> deferred task management request for handle 0x0d SMID 359
> Jul 24 12:16:55 tila kernel: mps0: mpssas_abort_complete: abort request
> on handle 0x0d SMID 359 complete
> Jul 24 12:16:55 tila kernel: mps0: mpssas_complete_tm_request: sending
> deferred task management request for handle 0x0d SMID 472
> Jul 24 12:16:55 tila kernel: mps0: mpssas_abort_complete: abort request
> on handle 0x0d SMID 472 complete
> Jul 24 12:16:55 tila kernel: mps0: mpssas_complete_tm_request: sending
> deferred task management request for handle 0x0d SMID 124
> Jul 24 12:16:55 tila kernel: mps0: mpssas_abort_complete: abort request
> on handle 0x0d SMID 124 complete
> Jul 24 12:16:55 tila kernel: mps0: mpssas_complete_tm_request: sending
> deferred task management request for handle 0x0d SMID 762
> Jul 24 12:16:55 tila kernel: mps0: mpssas_abort_complete: abort request
> on handle 0x0d SMID 762 complete
> Jul 24 12:16:55 tila kernel: mps0: mpssas_complete_tm_request: sending
> deferred task management request for handle 0x0d SMID 243
> Jul 24 12:16:55 tila kernel: mps0: mpssas_abort_complete: abort request
> on handle 0x0d SMID 243 complete
> Jul 24 12:16:55 tila kernel: mps0: mpssas_complete_tm_request: sending
> deferred task management request for handle 0x0d SMID 890
> Jul 24 12:16:55 tila kernel: mps0: mpssas_abort_complete: abort request
> on handle 0x0d SMID 890 complete
> Jul 24 12:16:55 tila kernel: mps0: mpssas_complete_tm_request: sending
> deferred task management request for handle 0x0d SMID 204
> Jul 24 12:16:55 tila kernel: mps0: mpssas_abort_complete: abort request
> on handle 0x0d SMID 204 complete
> Jul 24 12:16:55 tila kernel: mps0: mpssas_complete_tm_request: sending
> deferred task management request for handle 0x0d SMID 2
> Jul 24 12:16:55 tila kernel: mps0: mpssas_abort_complete: abort request
> on handle 0x0d SMID 2 complete
> Jul 24 12:16:55 tila kernel: mps0: mpssas_complete_tm_request: sending
> deferred task management request for handle 0x0d SMID 291
> Jul 24 12:16:55 tila kernel: mps0: mpssas_abort_complete: abort request
> on handle 0x0d SMID 291 complete
> Jul 24 12:16:55 tila kernel: (da6:mps0:0:6:0): WRITE(16). CDB: 8a 0 0 0
> 0 1 4a 17 dc 5e 0 0 0 57 0 0
> Jul 24 12:16:55 tila kernel: (da6:mps0:0:6:0): CAM status: SCSI Status Error
> Jul 24 12:16:55 tila kernel: (da6:mps0:0:6:0): SCSI status: Check Condition
> Jul 24 12:16:55 tila kernel: (da6:mps0:0:6:0): SCSI sense: UNIT
> ATTENTION asc:29,0 (Power on, reset, or bus device reset occurred)
> Jul 24 12:16:56 tila kernel: (da6:mps0:0:6:0): WRITE(10). CDB: 2a 0 28 0
> 85 fb 0 0 1 0
> Jul 24 12:16:56 tila kernel: (da6:mps0:0:6:0): CAM status: SCSI Status Error
> Jul 24 12:16:56 tila kernel: (da6:mps0:0:6:0): SCSI status: Check Condition
> Jul 24 12:16:56 tila kernel: (da6:mps0:0:6:0): SCSI sense: UNIT
> ATTENTION asc:29,0 (Power on, reset, or bus device reset occurred)
> ------------------------------------------------------------------------
>
> If the da0 ICRC errors would have been seen by ZFS, it should have made
> a) note of that in some log? b) retried write? c) Something else? If
> we assume that the disk firmware is broken and does not report these to
> OS, so da0 might be corrupt. But that should still be ok with raidz2.
>
> We do have 3 random SCSI timeouts, which were seen by FreeBSD, and thus
> should have prompted ZFS do handle the errors, and one read error on
> data, which is not reported as read error in any log, other than disk's
> SMART info says so.
>
> Is it possible to get more information, which block of the file is
> corrupt, on which disks the offending blocks reside, and similar
> information?
>
>
>
>
>
> _______________________________________________
> freebsd-fs at freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe at freebsd.org"
More information about the freebsd-fs
mailing list