ZFS RAID 0+1 Throwing Checksum Errors

Tim Gustafson tjg at ucsc.edu
Mon Nov 9 19:08:58 UTC 2015


I have a FreeBSD 10.1 server configured as root-on-zfs with the
following pool configuration:

NAME            STATE     READ WRITE CKSUM
tank           ONLINE       0     0     0
 mirror-0      ONLINE       0     0     0
   gpt/zfs0    ONLINE       0     0     0
   gpt/zfs1    ONLINE       0     0     0
 mirror-1      ONLINE       0     0     0
   gpt/zfs2    ONLINE       0     0     0
   gpt/zfs3    ONLINE       0     0     0

The disks are each 1TB Samsung 850 EVO SSDs, connected via a Dell PERC
RAID controller (mrsas driver) configured in "RAID Disabled" mode.

I run a "zpool scrub" every weekend, and every weekend the scrub finds
a handful of checksum errors (usually between 1 and 10) per disk.  The
scrub repairs the errors, I clear the counters, and everything seems
fine.  As far as I know, I do not have any corrupt or missing data.
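
For reference, the weekly routine amounts to roughly the following.
This is a sketch rather than the exact script: it assumes the pool
name "tank" from the listing above and the standard zpool subcommands.

```shell
# Start a scrub of the pool (returns immediately; the scrub runs in the background)
zpool scrub tank

# Once the scrub has finished, check the per-device READ/WRITE/CKSUM counters
zpool status -v tank

# Reset the error counters after reviewing them
zpool clear tank
```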

The server is a fairly busy web and database server, handling about 5
million hits per day.

I'm wondering if the problem is that the scrub calculates the checksum
for the data on gpt/zfs0, and while that's happening, some data is
updated by Apache or MySQL, and then the checksum for the data on
gpt/zfs1 is calculated, which now doesn't match, and so the scrub
reports an error.  Is that possible?

If that's not it, could this be a bug?  Or should I be worried about
my SSDs?  What additional data would be helpful for me to share to
diagnose this?

-- 

Tim Gustafson
Technical Lead, Baskin School of Engineering
tjg at ucsc.edu
831-459-5354
Baskin Engineering, Room 313A


More information about the freebsd-fs mailing list