possible NFS lockups
krad
kraduk at googlemail.com
Tue Jul 27 15:29:22 UTC 2010
I have a production mail system with an NFS backend. Every now and again we
see NFS die on a particular head end. However, it doesn't die across all
the nodes. This suggests to me there isn't an issue with the filer itself, and
the stats from the filer concur with that.
The symptoms are lines like this appearing in dmesg:
nfs server 10.44.17.138:/vol/vol1/mail: not responding
nfs server 10.44.17.138:/vol/vol1/mail: is alive again
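To compare how often each head end logs these messages, the dmesg output could be tallied with something like the following. This is only a rough sketch: the function name count_nfs_events is made up for illustration, and the pattern assumes the messages always have exactly the form shown above.

```python
import re
from collections import Counter

# Matches the two kernel messages quoted above. The exact wording is
# assumed to be stable; adjust the pattern if your dmesg output differs.
EVENT_RE = re.compile(r"nfs server (\S+): (not responding|is alive again)")


def count_nfs_events(lines):
    """Count 'not responding' / 'is alive again' events per NFS export."""
    counts = Counter()
    for line in lines:
        match = EVENT_RE.search(line)
        if match:
            export, event = match.group(1), match.group(2)
            counts[(export, event)] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Example: dmesg | python count_nfs_events.py
    for (export, event), n in sorted(count_nfs_events(sys.stdin).items()):
        print(f"{export}: {event}: {n}")
```

Running this on every node and comparing the counts would show whether the problem really is confined to one head end.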
Trussing df, it seems to hang on getfsstat. This is presumably when it tries
the NFS mounts, e.g.:
__sysctl(0xbfbfe224,0x2,0xbfbfe22c,0xbfbfe230,0x0,0x0) = 0 (0x0)
mmap(0x0,1048576,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 1746583552 (0x681ac000)
mmap(0x682ac000,344064,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANON,-1,0x0) = 1747632128 (0x682ac000)
munmap(0x681ac000,344064) = 0 (0x0)
getfsstat(0x68201000,0x1270,0x2,0xbfbfe960,0xbfbfe95c,0x1) = 9 (0x9)
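Since df itself blocks when a mount is stuck, a check that cannot hang the caller may be more useful for catching the problem as it happens. The following is a minimal sketch, not a tested monitoring tool: probe_mount is a hypothetical helper, and the 5 second timeout is an arbitrary assumption. It runs statvfs in a child process so that only the child blocks if the mount is unresponsive.

```python
import subprocess
import sys


def probe_mount(path, timeout=5.0):
    """Return True if statvfs(path) completes within `timeout` seconds.

    The call runs in a child process so the caller never blocks on a
    hung NFS mount. Returns False on timeout or if statvfs fails.
    """
    child = subprocess.Popen(
        [sys.executable, "-c",
         "import os, sys; os.statvfs(sys.argv[1])", path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    try:
        return child.wait(timeout=timeout) == 0
    except subprocess.TimeoutExpired:
        # Best effort only: a process blocked on a hard NFS mount may
        # not die until the server responds, so we do not wait for it.
        child.kill()
        return False


if __name__ == "__main__":
    # /mail/0 is the mount point from the fstab entry in this post.
    path = sys.argv[1] if len(sys.argv) > 1 else "/mail/0"
    print(path, "ok" if probe_mount(path) else "NOT RESPONDING")
```

Run from cron on each head end, this would give a timestamped record of when a mount stops answering, which could then be lined up against the filer stats.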
I have played with mount options a fair bit, but they don't make much
difference. This is what they are set to at present:
10.44.17.138:/vol/vol1/mail /mail/0 nfs rw,noatime,tcp,acdirmax=320,acdirmin=180,acregmax=320,acregmin=180 0 0
When this locking is occurring, I find that if I run showmount, or mount
10.44.17.138:/vol/vol1/mail again under another mount point, I can access it
fine.
One thing I have just noticed is that lockd and statd always seem to have
died when this happens. Restarting does not help.
I find all this a bit perplexing. Can anyone offer any insight into why this
might be happening? I have DTrace compiled into the kernel, if that could
help with debugging.