nfs-server silent data corruption
Arno J. Klaassen
arno at heho.snv.jussieu.fr
Mon Apr 21 14:54:35 UTC 2008
Kris Kennaway <kris at FreeBSD.ORG> writes:
> On Mon, Apr 21, 2008 at 01:02:33AM +0200, Arno J. Klaassen wrote:
>
> > I didn't stress-test this MB for a while, but last time I did was
> > with 7-PRERELEASE/RC?/CANTremember-exactly-but-close-to-release
> > and all worked great
> >
> > I did add 2G ECC to the 2nd CPU since, though I doubt that interferes
> > with NFS.
>
> Uh, you're getting server-side data corruption, it could definitely be
> because of the memory you added.
Yes, though I'm still not convinced the memory is bad (it is the very
same Kingston ECC as the 2*1G that has been in use for about half a
year already). I added it directly to the 2nd CPU (see the diagram on
page 9 of http://www.tyan.com/manuals/m_s2895_101.pdf), and the problem
seems to be the interaction between nfe0 and powerd:
- if I stop powerd, the problems go away
- if I let powerd run but turn off txcsum and tso4 on the interface,
  the problem is a lot harder to reproduce (in case this gives
  a hint to anyone)
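
For anyone who wants to repeat this kind of comparison under each of the
configurations above, here is a minimal sketch of a file check. It is not
the test actually used here (this mail does not describe one). It assumes
a known-good local file and a copy of it read back through the NFS mount.
The helper name first_mismatch and the 64 KiB chunk size are illustrative
choices only.

```python
CHUNK = 64 * 1024  # illustrative read size, not tuned for NFS


def first_mismatch(path_a, path_b, chunk=CHUNK):
    """Return the byte offset of the first difference between two files,
    or None if they are identical (hypothetical helper)."""
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk)
            b = fb.read(chunk)
            if a != b:
                # Locate the exact differing byte inside this chunk.
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        return offset + i
                # One file is a prefix of the other, so they differ
                # at the end of the shorter one.
                return offset + min(len(a), len(b))
            if not a:
                # Both files reached EOF without any difference.
                return None
            offset += len(a)
```

Calling first_mismatch("/local/source", "/mnt/nfs/copy") (both paths are
placeholders) returns None when the copy is intact. An offset, if one is
returned, shows where in the file the damage starts.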
The device is:
nfe0@pci0:0:10:0: class=0x068000 card=0x289510f1 chip=0x005710de rev=0xa3 hdr=0x00
    vendor     = 'Nvidia Corp'
    device     = 'nForce4 Ultra NVidia Network Bus Enumerator'
    class      = bridge
    cap 01[44] = powerspec 2 supports D0 D1 D2 D3 current D0
(this is with the default BIOS setting "LAN Bridge Enabled"; disabling
that setting makes pciconf say "class = network" but does not influence
my problem)
I will now rerun my tests with all 4G populated on CPU1 only and
report whether that matters.
Best, Arno