Forcing full file read in ZFS even when checksum error encountered
Joe Peterson
joe at skyrush.com
Wed Feb 6 16:42:53 UTC 2008
Bakul Shah wrote:
> It could also be a memory error of some sort. Does your
> system have ECC memory?
Yes, I always insist on ECC.
> Also note that standalone tests do
> not seem to catch all sorts of errors that heavy use of Unix
> can sometimes trigger on a marginal system.
I do plan to do a few more HW checks (cables, etc.), just to make sure.
I had been avoiding touching my HW config to preserve the current state
of this issue. However, given the coincidental experience Jeremy talked
about and the fact that I see DMA errors using ZFS on FreeBSD that I do
not see using ZFS-Fuse on the same disk/pool in Linux, I have a gut
feeling something funny is going on.
> But I agree with you that it would be useful to have a debug
> mode where you can get at the data even if it is bad (and a
> test mode where you can write bad data on purpose:-). [A
> long rant on writing testable code deleted]
Yes, the danger of course is if someone forgets that the debug mode is
engaged, but I think care could be taken to make sure this cannot easily
be done accidentally, or massive warnings could be issued to make sure
the user knows.
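
To make the idea concrete, here is a minimal sketch of such a guarded
debug mode. It is not ZFS code: read_block, ChecksumError, and the
allow_bad flag are hypothetical names, and SHA-256 stands in for
whatever checksum the pool actually uses. The point is only that the
override must be requested explicitly and warns on every bad read.

```python
import hashlib
import warnings


class ChecksumError(IOError):
    """Raised when a block's checksum does not match the stored value."""


def read_block(data, expected_digest, allow_bad=False):
    """Return data if its SHA-256 hex digest equals expected_digest.

    allow_bad is the hypothetical debug switch: when True, a block that
    fails verification is still returned, but a warning is emitted on
    every such read so the override cannot go unnoticed.
    """
    actual = hashlib.sha256(data).hexdigest()
    if actual == expected_digest:
        return data
    if not allow_bad:
        raise ChecksumError(
            f"checksum mismatch: expected {expected_digest}, got {actual}"
        )
    warnings.warn(
        "DEBUG MODE: returning data that failed its checksum",
        RuntimeWarning,
    )
    return data
```

Defaulting allow_bad to False means the unsafe path has to be chosen
per call, which is one way to keep it from being left on by accident.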
> You have access to the zfs sources! At the very least you can
> add code to report the bad checksum & offset and see if it
> matches the checksum of the same block(s) in your known good
> copy.
Yep, this is my next planned step.
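
For the user-space side of that comparison, a rough sketch follows. It
walks two readable copies block by block and reports the offsets where
they differ. The names block_digests and mismatched_offsets are made up
for illustration, the 128 KiB block size is an assumption (it is the
usual ZFS default recordsize, but the dataset may differ), and SHA-256
is a stand-in that will not necessarily equal the checksum ZFS stores
on disk.

```python
import hashlib
from itertools import zip_longest

BLOCK_SIZE = 128 * 1024  # assumption: same as the dataset's recordsize


def block_digests(path, block_size=BLOCK_SIZE):
    """Yield (offset, SHA-256 hex digest) for each block of a file."""
    with open(path, "rb") as f:
        offset = 0
        while True:
            block = f.read(block_size)
            if not block:
                break
            yield offset, hashlib.sha256(block).hexdigest()
            offset += len(block)


def mismatched_offsets(suspect_path, good_path, block_size=BLOCK_SIZE):
    """Return byte offsets of blocks that differ between the two files.

    A block present in only one file (different lengths) also counts
    as a mismatch.
    """
    bad = []
    pairs = zip_longest(
        block_digests(suspect_path, block_size),
        block_digests(good_path, block_size),
    )
    for suspect, good in pairs:
        if suspect is None or good is None or suspect[1] != good[1]:
            bad.append((suspect or good)[0])
    return bad
```

The offsets it returns can then be lined up against whatever offset the
instrumented ZFS code reports for the failing block.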
Thanks, Joe