Forcing full file read in ZFS even when checksum error encountered

Joe Peterson joe at skyrush.com
Wed Feb 6 16:42:53 UTC 2008


Bakul Shah wrote:
> It could also be a memory error of some sort.  Does your
> system have ECC memory?

Yes, I always insist on ECC.

> Also note that standalone tests do
> not seem to catch all sorts of errors that heavy use of Unix
> can sometimes trigger on a marginal system.

I do plan to do a few more HW checks (cables, etc.), just to make sure.
I had been avoiding touching my HW config to preserve the current state
of this issue.  However, given the coincidental experience Jeremy talked
about, and the fact that I see DMA errors using ZFS on FreeBSD that I do
not see using ZFS-Fuse on the same disk/pool in Linux, I have a gut
feeling something funny is going on.

> But I agree with you that it would be useful to have a debug
> mode where you can get at the data even if it is bad (and a
> test mode where you can write bad data on purpose:-). [A
> long rant on writing testable code deleted]

Yes, the danger of course is if someone forgets that the debug mode is
engaged, but I think care could be taken to make sure this cannot easily
be done accidentally, or massive warnings could be issued to make sure
the user knows.

> You have access to the zfs sources! At the very least you can
> add code to report the bad checksum & offset and see if it
> matches the checksum of the same block(s) in your known good
> copy.

Yep, this is my next planned step.
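
Before touching the zfs sources, a rough userland version of that
comparison might look like the sketch below.  It is only an illustration,
not the in-kernel change suggested above: it walks the suspect file and
the known good copy block by block and reports the offsets that differ
or cannot be read.  The function name compare_blocks, the 128 KiB block
size (the default ZFS recordsize), and the use of SHA-256 instead of
whatever checksum the pool is configured with are all assumptions made
for the example.

```python
import hashlib
import os

# Assumption: blocks are compared in 128 KiB units, the default ZFS recordsize.
BLOCK_SIZE = 128 * 1024


def compare_blocks(suspect_path, good_path, block_size=BLOCK_SIZE):
    """Return (offset, status) pairs for blocks that differ or fail to read.

    Hypothetical helper: SHA-256 stands in for the pool's real checksum.
    """
    problems = []
    size = os.path.getsize(good_path)
    fd_suspect = os.open(suspect_path, os.O_RDONLY)
    fd_good = os.open(good_path, os.O_RDONLY)
    try:
        for offset in range(0, size, block_size):
            good = os.pread(fd_good, block_size, offset)
            try:
                # pread lets us carry on with the next block after a failure.
                suspect = os.pread(fd_suspect, block_size, offset)
            except OSError as exc:
                problems.append((offset, "read error: %s" % exc))
                continue
            if hashlib.sha256(suspect).digest() != hashlib.sha256(good).digest():
                problems.append((offset, "checksum mismatch"))
    finally:
        os.close(fd_suspect)
        os.close(fd_good)
    return problems
```

Blocks that come back with a read error rather than a mismatch would
presumably be the ones where the checksum failed, so the list of offsets
at least narrows down where to add reporting in the zfs code.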

			Thanks, Joe




More information about the freebsd-fs mailing list