Re: FreeBSD 12.3/13.1 NFS client hang

From: Rick Macklem <rmacklem@uoguelph.ca>
Date: Fri, 27 May 2022 23:56:32 UTC
Andreas Kempe <kempe@lysator.liu.se> wrote:
> On Fri, May 27, 2022 at 08:59:57PM +0000, Rick Macklem wrote:
> > Andreas Kempe <kempe@lysator.liu.se> wrote:
> > > Hello everyone!
> > >
> > > I'm having issues with the NFS clients on FreeBSD 12.3 and 13.1
> > > systems hanging when using a CentOS 7 server.
> > First, make sure you are using hard mounts. "soft" or "intr" mounts won't
> > work and will mess up the session sooner or later. (A messed-up session can
> > result in no free slots on the session, and that will wedge threads in
> > nfsv4_sequencelookup() as you describe.)
> > (This is briefly described in the BUGS section of "man mount_nfs".)
> >
>
> I had totally missed that soft and interruptible mounts have these
> issues. I switched the FreeBSD machines to soft and intr on purpose,
> so that hung mounts could be fixed without having to restart the
> machine on an NFS hang. Since they are shared machines, it is an
> inconvenience for other users if one user causes a hang.
Usually, a "umount -N <mnt_path>" should dismount a hung mount
point.  It can take a couple of minutes to complete.
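For example, assuming the hung mount is at /mnt/nfs and is served from
srv:/export (both names are just placeholders for illustration):
# umount -N /mnt/nfs
# mount -t nfs -o nfsv4 srv:/export /mnt/nfs
Leaving "soft" and "intr" out of the options gives you a hard mount, since
hard is the default.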

> Switching our test machine back to hard mounts did prevent recursive
> grep from immediately causing the slot type hang again.
>
> > Do a:
> > # nfsstat -m
> > on the clients and look for "hard".
> >
> > Next, is there anything logged on the console for the 13.1 client(s)?
> > (13.1 has some diagnostics for things like a server replying with the
> >  wrong session slot#.)
> >
>
> The one thing we have seen logged are messages along the lines of:
> kernel: newnfs: server 'mail' error: fileid changed. fsid 4240eca6003a052a:0: expected fileid 0x22, got 0x2. (BROKEN NFS SERVER OR MIDDLEWARE)
It means that the server returned a different fileid number for the same file,
even though a file's fileid should never change.
There's a description in a comment in sys/fs/nfsclient/nfs_clport.c.
I doubt the broken middleware is still in use anywhere. I never knew the
details, since the guy who told me about it was under NDA to the
company that sold it. It cached Getattr replies and would sometimes return
the wrong cached entry. I think it only worked for NFSv3, anyhow.

However, it does indicate something is seriously wrong, probably on the server end.
(If you can capture packets when it gets logged, we could look at them in wireshark.)
--> I'm not sure if a soft mount could somehow cause this?

The diagnostics I was referring to would be things like "Wrong session" or "freeing free slot".
It was these that identified the Amazon EFS bug I mention later.

> > Also, maybe I'm old fashioned, but I find "ps axHl" useful, since it shows
> > where all the processes are sleeping.
> > And "procstat -kk" covers all of the locks.
> >
> 
> I don't know if it is a matter of being old fashioned as much as one
> of taste. :) In future dumps, I can provide both ps axHl and procstat -kk.
Ok. Let's see how things go with hard mounts.
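To double-check after remounting, "nfsstat -m" on the client should list the
mount with "hard" and without "soft" or "intr" in its option string. Roughly
(the server, path and exact options below are only illustrative):
# nfsstat -m
srv:/export on /mnt/nfs
nfsv4,minorversion=1,tcp,resvport,hard,cto,sec=sys,...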

> > > Below are procstat kstack $PID invocations showing where the processes
> > > have hung. In the nfsv4_sequencelookup it seems hung waiting for
> > > nfsess_slots to have an available slot. In the second nfs_lock case,
> > > it seems the processes are stuck waiting on vnode locks.
> > >
> > > These issues seem to appear seemingly at random, but also if
> > > operations that open a lot of files or create a lot of file locks are
> > > used. An example that can often provoke a hang is performing a
> > > recursive grep through a large file hierarchy like the FreeBSD
> > > codebase.
> > >
> > > The NFS code is large and complicated so any advice is appreciated!
> > Yea. I'm the author and I don't know exactly what it all does;-)
> >
> > > Cordially,
> > > Andreas Kempe
> > >
> >
> > [...]
> >
> > Not very useful unless you have all the processes and their locks to try and figure out what is holding
> > the vnode locks.
> >
> 
> Yes, I sent this mostly in the hope that it might be something that
> someone has seen before. I understand that more verbose information is
> needed to track down the lock contention.
There is PR#260011, which is similar; its reporter was also using soft mounts, although he is now
trying hard mounts. Also, we now know that the Amazon EFS server has a serious
bug where it sometimes replies with the wrong slotid.

> I'll switch our machines back to using hard mounts and try to get as
> much diagnostic information as possible when the next lockup happens.
>
> Do you have any good suggestions for tracking down the issue? I've
> been contemplating enabling WITNESS or building with debug information
> to be able to hook in the kernel debugger.
I don't think WITNESS or the kernel debugger will help.
Beyond what you get from "ps axHl", whatever went wrong has usually already
happened by the time you see the hang.
If you can reproduce it with a hard mount, you could capture packets via:
# tcpdump -s 0 -w out.pcap host <nfs-server>
Tcpdump is useless at decoding NFS, but wireshark can decode the out.pcap
quite nicely. I can look at the out.pcap or, if you look at it yourself, start by
looking for NFSv4-specific errors.
--> The client will usually log if it gets one of these. It will be an error # > 10000.
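If it helps, a wireshark display filter along the lines of:
    nfs.status != 0
should show just the NFS replies with a non-OK status (I'm going from memory,
so the exact field name may differ between wireshark versions; treat it as a
starting point). The NFSv4-only errors are the ones with numeric values above
10000 (the NFS4ERR_* codes), so any reply with a status in that range is worth
a closer look.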

Good luck with it, rick

> Thank you very much for your reply!
> Cordially,
> Andreas Kempe
>
> > rick
> >
> >