Data corruption with checksum offloading enabled
Arno J. Klaassen
arno at heho.snv.jussieu.fr
Tue Jan 27 04:46:07 PST 2009
Hello,
Dmitry Marakasov <amdmi3 at amdmi3.ru> writes:
> For now I have two cases of corruption - in both cases it is single
> difference of one 128 byte block with file offsets 0x65F872 and
> 0x61A072.
I had a similar problem last April on a 7-stable box, which I
reported in the 'nfs-server silent data corruption' thread.
I found :
- in all failing cases just *one* byte is corrupted, 4 or all 8 bits
set to zero *and* the original value is one out of the limited
subset {1, 8, 9} ....
here is the output of `cmp -x $i/BIG $i/BIG2` for some failing
cases I saved :
03869a48 09 00
05209d88 09 00
01777148 09 00
00f10f88 09 00
01f4c4c8 11 00
06c3d6c8 11 00
0725ca48 18 00
01608008 09 00
00f3b888 18 00
07aa45c8 29 20
Does your corruption fulfill these characterisations as well?
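To check a mismatch against this pattern, the two byte values can be compared bit by bit. The following is only a rough sh sketch: the function name `classify` and its category strings are made up for illustration, and it assumes the two-digit hex values printed in the second and third columns of `cmp -x`.

```shell
#!/bin/sh
# classify ORIG BAD
# ORIG and BAD are two-digit hex byte values as printed by `cmp -x`.
# The function name and the category strings are illustrative only.
classify() {
    orig=$((0x$1))
    bad=$((0x$2))
    raised=$(( bad & ~orig & 0xff ))    # bits that went 0 -> 1
    if [ "$orig" -eq "$bad" ]; then
        echo "identical"
    elif [ "$raised" -ne 0 ]; then
        echo "bits were set, not only cleared"
    elif [ "$bad" -eq 0 ]; then
        echo "whole byte zeroed"
    elif [ $(( bad & 0x0f )) -eq 0 ] && [ $(( orig & 0xf0 )) -eq $(( bad & 0xf0 )) ]; then
        echo "low nibble zeroed"
    elif [ $(( bad & 0xf0 )) -eq 0 ] && [ $(( orig & 0x0f )) -eq $(( bad & 0x0f )) ]; then
        echo "high nibble zeroed"
    else
        echo "some bits cleared"
    fi
}

classify 09 00    # prints "whole byte zeroed"
classify 29 20    # prints "low nibble zeroed"
```

With the function defined, the saved differences could be fed through it with something like `cmp -x $i/BIG $i/BIG2 | while read -r off a b; do echo "$off $(classify "$a" "$b")"; done`.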
> I was suggested by Andrzej Tobola to try disabling txcsum on a
> network interface. I've disabled both rxcsum and txcsum, and that
> solved a problem.
>
> Judging from that this helped Andrzej with sk(4) and me with ale(4)
> driver, that's not a single driver problem. Does this mean that we
> have global problems with checksum offloading?
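For reference, the workaround described in the quoted text corresponds to clearing the `rxcsum` and `txcsum` capabilities with ifconfig(8). A minimal sketch, assuming an interface named `ale0` (substitute your own); the helper `csum_offload` is hypothetical and only prints the command unless `RUN=1` is set, since changing interface flags needs root:

```shell
#!/bin/sh
# csum_offload IFACE on|off
# Print (or, with RUN=1 in the environment, execute) the ifconfig(8)
# command that turns RX/TX checksum offloading on or off.
# The helper name and the interface name "ale0" are examples only.
csum_offload() {
    ifname=$1
    case $2 in
        off) flags="-rxcsum -txcsum" ;;
        on)  flags="rxcsum txcsum" ;;
        *)   echo "usage: csum_offload IFACE on|off" >&2; return 1 ;;
    esac
    if [ "${RUN:-0}" = 1 ]; then
        # $flags is intentionally unquoted so it splits into two arguments
        ifconfig "$ifname" $flags
    else
        echo "ifconfig $ifname $flags"
    fi
}

csum_offload ale0 off    # prints "ifconfig ale0 -rxcsum -txcsum"
```

A change made this way does not survive a reboot; whether a given driver honours these flags depends on the driver.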
I could reproduce it with nfe(4) and re(4) ...
interestingly enough, I could *not* reproduce it when disabling
cpu frequency control ...
for what it's worth
Best, Arno
More information about the freebsd-current
mailing list