Analysis of disk file block with ZFS checksum error
Mark Day
mday at apple.com
Fri Feb 8 23:07:22 UTC 2008
On Feb 8, 2008, at 2:29 PM, Joe Peterson wrote:
> For one thing (as I mentioned), only 65536 bytes are bad (and it's
> exactly this many, with a few "good" bytes thrown in, but not far from
> what matches random chance would produce. Also, all bad bytes have a
> zero in the high bit - interesting? Also, near the end of the block,
> the bad bytes all go to zero, strangely coincident with the first "good"
> zero in that bad block - not sure if that's coincidence or not. Also, I
> calculated the number of "Bits same" (matching bits) in the good vs. bad
> bytes, and it appears fairly random, so it appears that the bad bytes
> are very random in nature and not correlated much at all with the good
> bytes.
>
> So except for the fact that the 2nd half (65536 bytes) of the ZFS block
> are good, the bad block seems to consist of random data, except for the
> string of zero bytes near the end and the zero high-bit. It's not as if
> one bit on the disk flipped - it affects the whole (1/2) block. Does
> this seem like a disk error, controller error/bug, cable problem (I
> recently put a new cable on, so I doubt this). It seems to me something
> more systemic rather than a random bit error - opinions are more than
> welcome.
Based on the subset of data you posted, the bad data looks like ASCII text.
The bad data from offset a0000 to a000f is:
${138AFE{@
@$$}1
The bad data from offset af6c1 to af6c8 is:
392A9}@
I don't recognize the content beyond that, but I'd guess that somehow the
contents of some other file managed to overwrite that portion of the bad
file. As for how that happened, I don't know. But if someone recognizes
where the bad content came from, that might be a clue.
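
One way to test the ASCII guess against the whole bad region, rather than
just the posted subset, is to count how many bytes have the high bit set and
how many are printable. 7-bit ASCII never sets the high bit, which would fit
the zero high bit noted above. The following is only a rough Python sketch:
the function name ascii_stats is made up for illustration, and the sample is
just the visible characters quoted above, not the full block.

```python
import string

# Byte values that count as printable ASCII (letters, digits,
# punctuation, and whitespace such as space, tab, and newline).
PRINTABLE = set(string.printable.encode("ascii"))


def ascii_stats(data: bytes) -> dict:
    """Summarize how text-like a run of bytes is (hypothetical helper)."""
    total = len(data)
    # Bytes with the top bit set cannot be 7-bit ASCII.
    high_bit_set = sum(1 for b in data if b & 0x80)
    printable = sum(1 for b in data if b in PRINTABLE)
    return {
        "total": total,
        "high_bit_set": high_bit_set,
        "printable": printable,
        "printable_fraction": printable / total if total else 0.0,
    }


if __name__ == "__main__":
    # Only the visible characters from the first posted line of bad data.
    sample = b"${138AFE{@"
    print(ascii_stats(sample))
```

Truly random bytes would have the high bit set about half the time and would
be printable well under half the time, so a block with no high bits set and a
large printable fraction points toward text from some other file rather than
noise. To run it on the real data, read the bad bytes from a saved copy of the
file and pass them to ascii_stats.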
-Mark
More information about the freebsd-stable mailing list