Re: FreeBSD 12.3/13.1 NFS client hang

From: Rick Macklem <rmacklem_at_uoguelph.ca>
Date: Sat, 28 May 2022 16:00:07 UTC
Andreas Kempe <kempe@lysator.liu.se> wrote:
> On Fri, May 27, 2022 at 08:59:57PM +0000, Rick Macklem wrote:
> > Andreas Kempe <kempe@lysator.liu.se> wrote:
> > > Hello everyone!
> > >
> > > I'm having issues with the NFS clients on FreeBSD 12.3 and 13.1
> > > systems hanging when using a CentOS 7 server.
Here are a few other things to consider:
Delegations - They are complex and seldom improve performance.
       I think I finally have them implemented reliably, but???
       They are disabled by default in the FreeBSD server and can be
       avoided by not running the nfscbd(8) daemon when mounting
       non-FreeBSD NFS servers.
       # nfsstat -E -c
       - If it shows non-zero "Delegs", consider disabling them.

TSO - Some net chips/drivers don't get this quite right. NFS is very
      good at finding the flaws, since it generates all kinds of small and
      oddly sized TSO/TCP segments.
      - Consider disabling TSO if intermittent hangs persist.
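
A rough sketch of the TSO suggestion above, assuming a FreeBSD client
where ifconfig(8) accepts "-tso" for the interface in question. The
helper name tso_off and the interface name em0 are placeholders, not
something from the original message.

```sh
# Hypothetical helper: turn off TSO on one interface (run as root).
# "-tso" is the ifconfig(8) flag that clears TSO; the change does not
# survive a reboot unless it is also added to rc.conf.
tso_off() {
    ifconfig "$1" -tso
}
```

For example, "tso_off em0" would be run once per interface that carries
NFS traffic, followed by a retry of the workload that provokes the hang.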

Jumbo mbuf clusters - Some net interfaces use jumbo mbuf clusters
      when jumbo frames are in use.  These can fragment the memory
      pool that mbuf clusters are being allocated from.
      # vmstat -z | fgrep mbuf_jumbo
      - Check whether the third number on each line is non-zero.
      Reducing the mtu may be a performance hit, but if the memory
      pool that clusters are allocated from becomes too fragmented,
      NFS will come to a grinding halt.
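
The check above can be scripted roughly as follows. This is only a
sketch: it assumes each zone line printed by "vmstat -z" looks like
"name: n1, n2, n3, ...", and the helper name jumbo_in_use is made up
for illustration.

```sh
# Hypothetical helper: reads `vmstat -z` output on stdin and prints the
# names of the mbuf_jumbo zones whose third number is non-zero.
# Assumed line layout: "zone_name: n1, n2, n3, ..."
jumbo_in_use() {
    awk -F: '/^mbuf_jumbo/ {
        # $2 holds the comma-separated numbers after the zone name
        n = split($2, col, ",")
        if (n >= 3 && col[3] + 0 != 0)
            print $1
    }'
}
```

It would be used as "vmstat -z | jumbo_in_use"; no output means none of
the jumbo zones show a non-zero third number.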

An NFSv4 server that does not reply to an RPC - This is a badly broken
server. NFSv4 servers are supposed to reply NFSERR_DELAY if they cannot
perform an RPC at the time requested. They are not supposed to throw away
the request without replying.
Hopefully, such servers do not exist. If they do, the mount will hang.
About the only way to detect this would be a packet capture when it
happens.
About the only fix is a different NFS server or using NFSv3 mounts, which
are stateless and might work better in this case.
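
One possible way to take the packet capture mentioned above is
tcpdump(1), filtering on the server address and the standard NFS port
2049. The helper name capture_nfs and its three arguments (interface,
output file, server address) are placeholders for illustration.

```sh
# Hypothetical helper: capture_nfs IFACE FILE SERVER
# Writes full-size packets to or from SERVER on port 2049 into FILE
# until interrupted with Ctrl-C. Run as root.
capture_nfs() {
    tcpdump -i "$1" -s 0 -w "$2" host "$3" and port 2049
}
```

The capture file should go on a local file system, not on the NFS mount
being debugged, and can be examined afterwards for requests that never
get a reply.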

rick


> First, make sure you are using hard mounts. "soft" or "intr" mounts won't
> work and will mess up the session sooner or later. (A messed up session could
> result in no free slots on the session and that will wedge threads in
> nfsv4_sequencelookup() as you describe.)
> (This is briefly described in the BUGS section of "man mount_nfs".)
>

I had totally missed that soft and interruptible mounts have these
issues. I switched the FreeBSD machines to soft and intr on purpose
to be able to fix hung mounts without having to restart the machine on
NFS hangs. Since they are shared machines, it is an inconvenience for
other users if one user causes a hang.

Switching our test machine back to hard mounts did prevent recursive
grep from immediately causing the slot type hang again.

> Do a:
> # nfsstat -m
> on the clients and look for "hard".
>
> Next, is there anything logged on the console for the 13.1 client(s)?
> (13.1 has some diagnostics for things like a server replying with the
>  wrong session slot#.)
>
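
The "nfsstat -m" check quoted above could be automated along these
lines. This is a sketch under an assumption about the output layout: a
"server:/path on /mountpoint" line followed by one line of
comma-separated mount options. The helper name non_hard_mounts is made
up.

```sh
# Hypothetical helper: reads `nfsstat -m` output on stdin and prints the
# header line of every mount whose option list does not contain "hard".
# Assumed layout: "server:/path on /mnt" followed by one options line.
non_hard_mounts() {
    awk '
        / on / { mount = $0; next }      # remember the mount header
        mount != "" {
            opts = $0
            gsub(/[ \t]/, "", opts)      # drop stray whitespace
            if (("," opts ",") !~ /,hard,/)
                print mount
            mount = ""
        }'
}
```

It would be used as "nfsstat -m | non_hard_mounts"; any line printed is
a mount that should be remounted without soft or intr.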

The only thing we have seen logged is messages along the lines of:
kernel: newnfs: server 'mail' error: fileid changed. fsid 4240eca6003a052a:0: expected fileid 0x22, got 0x2. (BROKEN NFS SERVER OR MIDDLEWARE)

> Also, maybe I'm old fashioned, but I find "ps axHl" useful, since it shows
> where all the processes are sleeping.
> And "procstat -kk" covers all of the locks.
>

I don't know if it is a matter of being old fashioned as much as one
of taste. :) In future dumps, I can provide both ps axHl and procstat -kk.
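
A small sketch of how both views could be saved the next time a hang
occurs. The helper name hang_snapshot and the output file names are
made up; "-a" is added to procstat so that all processes are covered.

```sh
# Hypothetical helper: hang_snapshot DIR
# Saves the thread/wait-channel view and the kernel stacks into DIR.
hang_snapshot() {
    dir=$1
    mkdir -p "$dir" || return 1
    ps axHl > "$dir/ps-axHl.txt"
    # -a selects all processes; -kk prints kernel stacks with offsets.
    procstat -kk -a > "$dir/procstat-kk.txt"
}
```

DIR should be on a local file system, since writing to the hung NFS
mount would only add another stuck process.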

> > Below are procstat kstack $PID invocations showing where the processes
> > have hung. In the nfsv4_sequencelookup it seems hung waiting for
> > nfsess_slots to have an available slot. In the second nfs_lock case,
> > it seems the processes are stuck waiting on vnode locks.
> >
> > These issues appear seemingly at random, but also if
> > operations that open a lot of files or create a lot of file locks are
> > used. An example that can often provoke a hang is performing a
> > recursive grep through a large file hierarchy like the FreeBSD
> > codebase.
> >
> > The NFS code is large and complicated so any advice is appreciated!
> Yea. I'm the author and I don't know exactly what it all does;-)
>
> > Cordially,
> > Andreas Kempe
> >
>
> [...]
>
> Not very useful unless you have all the processes and their locks to try
> and figure out what is holding the vnode locks.
>

Yes, I sent this mostly in the hope that it might be something that
someone has seen before. I understand that more verbose information is
needed to track down the lock contention.

I'll switch our machines back to using hard mounts and try to get as
much diagnostic information as possible when the next lockup happens.

Do you have any good suggestions for tracking down the issue? I've
been contemplating enabling WITNESS or building with debug information
to be able to hook in the kernel debugger.

Thank you very much for your reply!
Cordially,
Andreas Kempe

> rick
>
>