Why is NFSv4 so slow?
Rick C. Petty
rick-freebsd2009 at kiwi-computer.com
Mon Jun 28 14:00:56 UTC 2010
On Mon, Jun 28, 2010 at 12:30:30AM -0400, Rick Macklem wrote:
>
> I can't explain the corruption, beyond the fact that "soft,intr" can
> cause all sorts of grief. If mounts without "soft,intr" still show
> corruption problems, try disabling delegations (either kill off the
> nfscbd daemons on the client or set vfs.newnfs.issue_delegations=0
> on the server). It is disabled by default because it is the "greenest"
> part of the subsystem.
I tried without soft,intr and "make buildworld" failed with what looks like
file corruption again. I'm trying without delegations now.
> Make sure you don't have multiple entries for the same uid, such as "root"
> and "toor" both for uid 0 in your /etc/passwd. (ie. get rid of one of
> them, if you have both)
Hmm, that's a strange requirement, since FreeBSD by default comes with
both. That should probably be documented in the nfsv4 man page.
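
For anyone who wants to check a machine for this, here is a minimal sketch that lists uids shared by more than one login name. It assumes the usual colon-separated passwd(5) layout (name, password, uid, ...). The function name duplicate_uids is made up for illustration and is not part of any NFS tooling.

```python
from collections import defaultdict


def duplicate_uids(passwd_text):
    """Return {uid: [login names]} for every uid used by more than one name.

    Assumes passwd(5)-style lines: name:password:uid:gid:...
    """
    names_by_uid = defaultdict(list)
    for line in passwd_text.splitlines():
        line = line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        # Skip malformed lines that lack a numeric uid field.
        if len(fields) < 3 or not fields[2].isdigit():
            continue
        names_by_uid[int(fields[2])].append(fields[0])
    # Keep only the uids that map to two or more login names.
    return {uid: names for uid, names in names_by_uid.items() if len(names) > 1}
```

Calling duplicate_uids(open("/etc/passwd").read()) on a stock install would be expected to report uid 0 for both "root" and "toor", which is the case described above.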
> When you specify "nfs" for an NFSv3 mount, you get the regular client.
> When you specify "newnfs" for an NFSv3 mount, you get the experimental
> client. When you specify "nfsv4" you always get the experimental NFS
> client, and it doesn't matter which FStype you've specified.
Ok. So my comparison was with the regular and experimental clients.
> If you are using UFS/FFS on the server, this should work and I don't know
> why the empty directories under /vol on the client confused it. If your
> server is using ZFS, everything from / including /vol need to be exported.
Nope, UFS2 only (on both clients and server).
> > kernel: nfsv4 client/server protocol prob err=10020
>
> This error indicates that there wasn't a valid FH for the server. I
> suspect that the mount failed. (It does a loop of Lookups from "/" in
> the kernel during the mount and it somehow got confused part way through.)
If the mount failed, why would it allow me to "ls /vol/a" and see both "b"
and "c" directories as well as other files/directories on /vol/ ?
> I don't know why these empty dirs would confuse it. I'll try a test
> here, but I suspect the real problem was that the mount failed and
> then happened to succeed after you deleted the empty dirs.
It doesn't seem likely. I spent an hour mounting and unmounting and each
mount looked successful in that there were files and directories besides
the two I was trying to descend into.
> It still smells like some sort of transport/net interface/... issue
> is at the bottom of this. (see response to your next post)
It's possible. I just had another NFSv4 client (with the same server) lock
up:
load: 0.00 cmd: ls 17410 [nfsv4lck] 641.87r 0.00u 0.00s 0% 1512k
and:
load: 0.00 cmd: make 87546 [wait] 37095.09r 0.01u 0.01s 0% 844k
That make has been hung for hours, and the ls(1) was executed during that
lockup. I wish there was a way I could unhang these processes and unmount
the NFS mount without panicking the kernel, but alas even this fails:
# umount -f /sw
load: 0.00 cmd: umount 17479 [nfsclumnt] 1.27r 0.00u 0.04s 0% 788k
A "shutdown -p now" resulted in a panic with the speaker beeping
constantly and no console output.
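
The "load: ..." lines quoted above look like the status line the terminal prints on SIGINFO (usually Ctrl-T). When collecting several of them from hung clients, a small parser makes the wait channel and elapsed time easier to compare. This is only a sketch: the field layout is inferred from the three lines in this message, and the names STATUS_RE and parse_status are hypothetical.

```python
import re

# Layout inferred from the lines quoted in this message:
#   load: <loadavg> cmd: <name> <pid> [<wait channel>] <real>r <user>u <sys>s <cpu>% <rss>k
STATUS_RE = re.compile(
    r"load: (?P<load>[\d.]+)\s+cmd: (?P<cmd>\S+) (?P<pid>\d+) "
    r"\[(?P<state>[^\]]+)\] (?P<real>[\d.]+)r (?P<user>[\d.]+)u "
    r"(?P<sys>[\d.]+)s (?P<cpu>\d+)% (?P<rss>\d+)k"
)


def parse_status(line):
    """Return a dict of the interesting fields, or None if the line does not match."""
    m = STATUS_RE.search(line)
    if m is None:
        return None
    return {
        "cmd": m.group("cmd"),
        "pid": int(m.group("pid")),
        "state": m.group("state"),  # wait channel, e.g. nfsv4lck
        "real_seconds": float(m.group("real")),
    }
```

Applied to the make line above, this gives the wait channel "wait" and 37095.09 seconds of real time, a little over ten hours.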
It's possible the NICs are all suspect, but all of this worked fine a
couple of days ago when I was only using NFSv3.
-- Rick C. Petty