ZFS raidz2, errors in file?

Wed Oct 17 17:39:52 UTC 2012

Summary:  This could indicate that in some situations ZFS or FreeBSD 
ignore disk errors in some situations, or disks do not pass them to 
FreeBSD.

There are three SCSI timeouts all on different disks reported, though it 
seems unlikely they hit all three redundancies of a single block of 
single file, so the more likely scenario would be that there are other 
errors causing this, which are not reported or are ignored.  Also, 
timeouts should trigger retries.

SMART data indicates problems on two other disks, but no indication of 
those are seen in logs (the disks work, but SMART information indicates 
problems).  The fs is 8*3T pool on raidz2.

Further data below:

------------------------------------------------------------------------
zpool status -v
   pool: raid6
  state: ONLINE
status: One or more devices has experienced an error resulting in data
         corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
         entire pool from backup.
    see: http://www.sun.com/msg/ZFS-8000-8A
  scan: none requested
config:

         NAME        STATE     READ WRITE CKSUM
         raid6       ONLINE       0     0     0
           raidz2-0  ONLINE       0     0     0
             da0     ONLINE       0     0     0
             da1     ONLINE       0     0     0
             da2     ONLINE       0     0     0
             da3     ONLINE       0     0     0
             da4     ONLINE       0     0     0
             da5     ONLINE       0     0     0
             da6     ONLINE       0     0     0
             da7     ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

         /raid6/backup/rsnapshot/daily.1/foobar.net/users/mail/foo
------------------------------------------------------------------------

All disks online, so this is kind of odd.

One disk indeed has pending sector, not unusual and should be survivable:

------------------------------------------------------------------------
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE 
UPDATED  WHEN_FAILED RAW_VALUE
   1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail 
Always       -       0
   3 Spin_Up_Time            0x0027   162   154   021    Pre-fail 
Always       -       8891
   4 Start_Stop_Count        0x0032   100   100   000    Old_age 
Always       -       46
   5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail 
Always       -       0
   7 Seek_Error_Rate         0x002e   100   253   000    Old_age 
Always       -       0
   9 Power_On_Hours          0x0032   094   094   000    Old_age 
Always       -       4776
  10 Spin_Retry_Count        0x0032   100   253   000    Old_age 
Always       -       0
  11 Calibration_Retry_Count 0x0032   100   253   000    Old_age 
Always       -       0
  12 Power_Cycle_Count       0x0032   100   100   000    Old_age 
Always       -       43
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always 
       -       40
193 Load_Cycle_Count        0x0032   196   196   000    Old_age   Always 
       -       13595
194 Temperature_Celsius     0x0022   119   106   000    Old_age   Always 
       -       33
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always 
       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always 
       -       1
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age 
Offline      -       1
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always 
       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age 
Offline      -       1

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]
------------------------------------------------------------------------

But there is no corresponding read error in /var/log/messages, which 
would indicate that FreeBSD never saw the error, or ignored it.

In addition, there seems to be ICRC DMA errors on da0.  Looks nasty, but 
only show up in SMART log, not in /var/log/messages.

------------------------------------------------------------------------
smartctl 5.43 2012-06-30 r3573 [FreeBSD 9.0-RELEASE-p3 amd64] (local build)
Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family:     Hitachi Deskstar 7K3000
Device Model:     Hitachi HDS723030ALA640
Serial Number:    MK0331YHGVDK6A
LU WWN Device Id: 5 000cca 225cc00c7
Firmware Version: MKAOA5C0
User Capacity:    3,000,592,982,016 bytes [3.00 TB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   8
ATA Standard is:  ATA-8-ACS revision 4
Local Time is:    Wed Oct 17 20:03:05 2012 EEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x84) Offline data collection activity
                                         was suspended by an 
interrupting command from host.
                                         Auto Offline Data Collection: 
Enabled.
Self-test execution status:      (   0) The previous self-test routine 
completed
                                         without error or no self-test 
has ever
                                         been run.
Total time to complete Offline
data collection:                (29509) seconds.
Offline data collection
capabilities:                    (0x5b) SMART execute Offline immediate.
                                         Auto Offline data collection 
on/off support.
                                         Suspend Offline collection upon new
                                         command.
                                         Offline surface scan supported.
                                         Self-test supported.
                                         No Conveyance Self-test supported.
                                         Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                         power-saving mode.
                                         Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                         General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        ( 492) minutes.
SCT capabilities:              (0x003d) SCT Status supported.
                                         SCT Error Recovery Control 
supported.
                                         SCT Feature Control supported.
                                         SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE 
UPDATED  WHEN_FAILED RAW_VALUE
   1 Raw_Read_Error_Rate     0x000b   100   100   016    Pre-fail 
Always       -       0
   2 Throughput_Performance  0x0005   136   136   054    Pre-fail 
Offline      -       83
   3 Spin_Up_Time            0x0007   125   125   024    Pre-fail 
Always       -       617 (Average 617)
   4 Start_Stop_Count        0x0012   100   100   000    Old_age 
Always       -       62
   5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail 
Always       -       0
   7 Seek_Error_Rate         0x000b   100   100   067    Pre-fail 
Always       -       0
   8 Seek_Time_Performance   0x0005   135   135   020    Pre-fail 
Offline      -       26
   9 Power_On_Hours          0x0012   100   100   000    Old_age 
Always       -       4570
  10 Spin_Retry_Count        0x0013   100   100   060    Pre-fail 
Always       -       0
  12 Power_Cycle_Count       0x0032   100   100   000    Old_age 
Always       -       30
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always 
       -       129
193 Load_Cycle_Count        0x0012   100   100   000    Old_age   Always 
       -       129
194 Temperature_Celsius     0x0002   166   166   000    Old_age   Always 
       -       36 (Min/Max 25/50)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always 
       -       0
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always 
       -       0
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age 
Offline      -       0
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always 
       -       112

SMART Error Log Version: 1
ATA Error Count: 112 (device log contains only the most recent five errors)
         CR = Command Register [HEX]
         FR = Features Register [HEX]
         SC = Sector Count Register [HEX]
         SN = Sector Number Register [HEX]
         CL = Cylinder Low Register [HEX]
         CH = Cylinder High Register [HEX]
         DH = Device/Head Register [HEX]
         DC = Device Command Register [HEX]
         ER = Error register [HEX]
         ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 112 occurred at disk power-on lifetime: 52 hours (2 days + 4 hours)
   When the command that caused the error occurred, the device was 
active or idle.

   After command completion occurred, registers were:
   ER ST SC SN CL CH DH
   -- -- -- -- -- -- --
   84 51 20 28 c5 ae 06  Error: ICRC, ABRT 32 sectors at LBA = 
0x06aec528 = 112117032

   Commands leading to the command that caused the error were:
   CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
   -- -- -- -- -- -- -- --  ----------------  --------------------
   35 03 f0 58 c4 ae 40 00   2d+04:00:39.416  WRITE DMA EXT
   35 03 20 38 c4 ae 40 00   2d+04:00:39.409  WRITE DMA EXT
   35 03 f0 48 c3 ae 40 00   2d+04:00:39.402  WRITE DMA EXT
   35 03 f0 58 c2 ae 40 00   2d+04:00:39.394  WRITE DMA EXT
   35 03 60 f8 c1 ae 40 00   2d+04:00:39.388  WRITE DMA EXT

Error 111 occurred at disk power-on lifetime: 52 hours (2 days + 4 hours)
   When the command that caused the error occurred, the device was 
active or idle.

   After command completion occurred, registers were:
   ER ST SC SN CL CH DH
   -- -- -- -- -- -- --
   84 51 30 08 c4 90 06  Error: ICRC, ABRT 48 sectors at LBA = 
0x0690c408 = 110150664

   Commands leading to the command that caused the error were:
   CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
   -- -- -- -- -- -- -- --  ----------------  --------------------
   35 03 f0 48 c3 90 40 00   2d+03:59:29.925  WRITE DMA EXT
   35 03 f0 58 c2 90 40 00   2d+03:59:29.917  WRITE DMA EXT
   35 03 60 f8 c1 90 40 00   2d+03:59:29.910  WRITE DMA EXT
   35 03 b0 48 c1 90 40 00   2d+03:59:29.907  WRITE DMA EXT
   35 03 f0 58 c0 90 40 00   2d+03:59:29.899  WRITE DMA EXT

Error 110 occurred at disk power-on lifetime: 52 hours (2 days + 4 hours)
   When the command that caused the error occurred, the device was 
active or idle.

   After command completion occurred, registers were:
   ER ST SC SN CL CH DH
   -- -- -- -- -- -- --
   84 51 20 28 47 90 05  Error: ICRC, ABRT 32 sectors at LBA = 
0x05904728 = 93341480

   Commands leading to the command that caused the error were:
   CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
   -- -- -- -- -- -- -- --  ----------------  --------------------
   35 da f0 58 46 90 40 00   2d+03:49:28.919  WRITE DMA EXT
   35 da 20 38 46 90 40 00   2d+03:49:28.917  WRITE DMA EXT
   35 da f0 48 45 90 40 00   2d+03:49:28.909  WRITE DMA EXT
   35 da f0 58 44 90 40 00   2d+03:49:28.901  WRITE DMA EXT
   35 da 20 38 44 90 40 00   2d+03:49:28.899  WRITE DMA EXT

Error 109 occurred at disk power-on lifetime: 52 hours (2 days + 4 hours)
   When the command that caused the error occurred, the device was 
active or idle.

   After command completion occurred, registers were:
   ER ST SC SN CL CH DH
   -- -- -- -- -- -- --
   84 51 10 38 e7 41 05  Error: ICRC, ABRT 16 sectors at LBA = 
0x0541e738 = 88205112

   Commands leading to the command that caused the error were:
   CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
   -- -- -- -- -- -- -- --  ----------------  --------------------
   35 03 f0 58 e6 41 40 00   2d+03:46:20.939  WRITE DMA EXT
   35 03 20 38 e6 41 40 00   2d+03:46:20.933  WRITE DMA EXT
   35 03 f0 48 e5 41 40 00   2d+03:46:20.926  WRITE DMA EXT
   35 03 f0 58 e4 41 40 00   2d+03:46:20.921  WRITE DMA EXT
   35 03 20 38 e4 41 40 00   2d+03:46:20.916  WRITE DMA EXT

Error 108 occurred at disk power-on lifetime: 52 hours (2 days + 4 hours)
   When the command that caused the error occurred, the device was 
active or idle.

   After command completion occurred, registers were:
   ER ST SC SN CL CH DH
   -- -- -- -- -- -- --
   84 51 10 28 0a 2f 05  Error: ICRC, ABRT 16 sectors at LBA = 
0x052f0a28 = 86968872

   Commands leading to the command that caused the error were:
   CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
   -- -- -- -- -- -- -- --  ----------------  --------------------
   35 03 f0 48 09 2f 40 00   2d+03:45:06.250  WRITE DMA EXT
   35 03 f0 58 08 2f 40 00   2d+03:45:06.242  WRITE DMA EXT
   35 03 20 38 08 2f 40 00   2d+03:45:06.238  WRITE DMA EXT
   35 03 f0 48 07 2f 40 00   2d+03:45:06.231  WRITE DMA EXT
   35 03 08 40 bc 25 40 00   2d+03:45:06.225  WRITE DMA EXT

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
  SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
     1        0        0  Not_testing
     2        0        0  Not_testing
     3        0        0  Not_testing
     4        0        0  Not_testing
     5        0        0  Not_testing
Selective self-test flags (0x0):
   After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
------------------------------------------------------------------------

These only show up in disk's SMART log, not as errors in other logs 
(logs exists for all the history of the system).  These do not seem to 
have passed to the FreeBSD, or FreeBSD has ignored the errors.

 From logs, I can find da3, da4, da6 all have reported scsi timeout once 
(different times, within one month in July.

They all look like this, one case for each of the disks:

------------------------------------------------------------------------
Jul 24 12:16:55 tila kernel: (da6:mps0:0:6:0): SCSI command timeout on 
device handle 0x000d SMID 699
Jul 24 12:16:55 tila kernel: (da6:mps0:0:6:0): SCSI command timeout on 
device handle 0x000d SMID 419
Jul 24 12:16:55 tila kernel: (da6:mps0:0:6:0): SCSI command timeout on 
device handle 0x000d SMID 359
Jul 24 12:16:55 tila kernel: (da6:mps0:0:6:0): SCSI command timeout on 
device handle 0x000d SMID 472
Jul 24 12:16:55 tila kernel: (da6:mps0:0:6:0): SCSI command timeout on 
device handle 0x000d SMID 124
Jul 24 12:16:55 tila kernel: (da6:mps0:0:6:0): SCSI command timeout on 
device handle 0x000d SMID 762
Jul 24 12:16:55 tila kernel: (da6:mps0:0:6:0): SCSI command timeout on 
device handle 0x000d SMID 243
Jul 24 12:16:55 tila kernel: (da6:mps0:0:6:0): SCSI command timeout on 
device handle 0x000d SMID 890
Jul 24 12:16:55 tila kernel: (da6:mps0:0:6:0): SCSI command timeout on 
device handle 0x000d SMID 204
Jul 24 12:16:55 tila kernel: (da6:mps0:0:6:0): SCSI command timeout on 
device handle 0x000d SMID 2
Jul 24 12:16:55 tila kernel: (da6:mps0:0:6:0): SCSI command timeout on 
device handle 0x000d SMID 291
Jul 24 12:16:55 tila kernel: mps0: (0:6:0) terminated ioc 804b scsi 0 
state c xfer 0
Jul 24 12:16:55 tila kernel: mps0: mpssas_abort_complete: abort request 
on handle 0x0d SMID 699 complete
Jul 24 12:16:55 tila kernel: mps0: mpssas_complete_tm_request: sending 
deferred task management request for handle 0x0d SMID 419
Jul 24 12:16:55 tila kernel: mps0: mpssas_abort_complete: abort request 
on handle 0x0d SMID 419 complete
Jul 24 12:16:55 tila kernel: mps0: mpssas_complete_tm_request: sending 
deferred task management request for handle 0x0d SMID 359
Jul 24 12:16:55 tila kernel: mps0: mpssas_abort_complete: abort request 
on handle 0x0d SMID 359 complete
Jul 24 12:16:55 tila kernel: mps0: mpssas_complete_tm_request: sending 
deferred task management request for handle 0x0d SMID 472
Jul 24 12:16:55 tila kernel: mps0: mpssas_abort_complete: abort request 
on handle 0x0d SMID 472 complete
Jul 24 12:16:55 tila kernel: mps0: mpssas_complete_tm_request: sending 
deferred task management request for handle 0x0d SMID 124
Jul 24 12:16:55 tila kernel: mps0: mpssas_abort_complete: abort request 
on handle 0x0d SMID 124 complete
Jul 24 12:16:55 tila kernel: mps0: mpssas_complete_tm_request: sending 
deferred task management request for handle 0x0d SMID 762
Jul 24 12:16:55 tila kernel: mps0: mpssas_abort_complete: abort request 
on handle 0x0d SMID 762 complete
Jul 24 12:16:55 tila kernel: mps0: mpssas_complete_tm_request: sending 
deferred task management request for handle 0x0d SMID 243
Jul 24 12:16:55 tila kernel: mps0: mpssas_abort_complete: abort request 
on handle 0x0d SMID 243 complete
Jul 24 12:16:55 tila kernel: mps0: mpssas_complete_tm_request: sending 
deferred task management request for handle 0x0d SMID 890
Jul 24 12:16:55 tila kernel: mps0: mpssas_abort_complete: abort request 
on handle 0x0d SMID 890 complete
Jul 24 12:16:55 tila kernel: mps0: mpssas_complete_tm_request: sending 
deferred task management request for handle 0x0d SMID 204
Jul 24 12:16:55 tila kernel: mps0: mpssas_abort_complete: abort request 
on handle 0x0d SMID 204 complete
Jul 24 12:16:55 tila kernel: mps0: mpssas_complete_tm_request: sending 
deferred task management request for handle 0x0d SMID 2
Jul 24 12:16:55 tila kernel: mps0: mpssas_abort_complete: abort request 
on handle 0x0d SMID 2 complete
Jul 24 12:16:55 tila kernel: mps0: mpssas_complete_tm_request: sending 
deferred task management request for handle 0x0d SMID 291
Jul 24 12:16:55 tila kernel: mps0: mpssas_abort_complete: abort request 
on handle 0x0d SMID 291 complete
Jul 24 12:16:55 tila kernel: (da6:mps0:0:6:0): WRITE(16). CDB: 8a 0 0 0 
0 1 4a 17 dc 5e 0 0 0 57 0 0
Jul 24 12:16:55 tila kernel: (da6:mps0:0:6:0): CAM status: SCSI Status Error
Jul 24 12:16:55 tila kernel: (da6:mps0:0:6:0): SCSI status: Check Condition
Jul 24 12:16:55 tila kernel: (da6:mps0:0:6:0): SCSI sense: UNIT 
ATTENTION asc:29,0 (Power on, reset, or bus device reset occurred)
Jul 24 12:16:56 tila kernel: (da6:mps0:0:6:0): WRITE(10). CDB: 2a 0 28 0 
85 fb 0 0 1 0
Jul 24 12:16:56 tila kernel: (da6:mps0:0:6:0): CAM status: SCSI Status Error
Jul 24 12:16:56 tila kernel: (da6:mps0:0:6:0): SCSI status: Check Condition
Jul 24 12:16:56 tila kernel: (da6:mps0:0:6:0): SCSI sense: UNIT 
ATTENTION asc:29,0 (Power on, reset, or bus device reset occurred)
------------------------------------------------------------------------

If the da0 ICRC errors would have been seen by ZFS, it should have made 
a) note of that in some log?  b) retried write?  c) Something else?  If 
we assume that the disk firmware is broken and does not report these to 
OS, so da0 might be corrupt.  But that should still be ok with raidz2.

We do have 3 random SCSI timeouts, which were seen by FreeBSD, and thus 
should have prompted ZFS do handle the errors, and one read error on 
data, which is not reported as read error in any log, other than disk's 
SMART info says so.

Is it possible to get more information, which block of the file is 
corrupt, on which disks the offending blocks reside, and similar 
information?