Re: NFSv4 hangs on 13.3

From: Rick Macklem <rick.macklem_at_gmail.com>
Date: Fri, 31 May 2024 21:10:34 UTC

On Fri, May 31, 2024 at 1:11 PM Rick Macklem <rick.macklem@gmail.com> wrote:
>
> On Fri, May 31, 2024 at 12:39 PM J David <j.david.lists@gmail.com> wrote:
> >
> > In an attempt to narrow down the problems we have with NFS v4.2, we've
> > set up a FreeBSD 13.3 server that mirrors some read-only data from our
> > Linux NFS servers.
> >
> > We are still observing occasional hangs on the FreeBSD 13.3 client machines.
> >
> > When these hangs occur, anything that touches the mountpoint
> > (including "umount -N") hangs indefinitely. There is nothing in dmesg.
> >
> > The only even slightly informative data I could gather is an "ls
> > /mount/point" kstack:
> >
> > $ sudo procstat -kk 94676
> >   PID    TID COMM                TDNAME              KSTACK
> > 94676 976876 ls                  -                   mi_switch+0xbf
> > sleeplk+0xea lockmgr_slock_hard+0x3a5 nfs_lock+0x29 vop_sigdefer+0x2a
> > _vn_lock+0x47 vfs_cache_root+0x9d vfs_root_sigdefer+0x35 lookup+0x88a
> > namei+0x24a kern_statat+0xf8 sys_fstatat+0x27 amd64_syscall+0x110
> > fast_syscall_common+0xf8
> >
> > During this time, the NFS server continues serving other clients, and
> > the affected client can access other NFS servers without hanging.
> >
> > The "nfsstat -m" for this mountpoint looks like this:
> >
> > nfsv4,minorversion=2,oneopenown,tcp,resvport,nconnect=1,hard,cto,nolockd,sec=sys,acdirmin=3,acdirmax=60,acregmin=5,acregmax=60,nametimeo=60,negnametimeo=60,rsize=65536,wsize=65536,readdirsize=65536,readahead=1,wcommitsize=16777216,timeout=120,retrans=2147483647
> >
> > with these mount flags in /etc/fstab:
> >
> > ro,nfsv4,minorversion=2,tcp,nosuid,noatime,nolockd,oneopenown
> >
> > Is there anything more we can do to help identify this issue?
> Collect the output of:
> # ps axHl
> and
> # procstat -kk -a
Oh, and
# netstat -a
to see what the TCP connections are up to.
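
If it helps, the three commands can be captured in one pass the next time a hang occurs. The following is only a rough sketch, not something from the thread: the function name collect_nfs_diag and the output file names are made up for illustration. It assumes it is run as root (as the "#" prompts suggest) and that the output directory is on a local filesystem rather than on the hung NFS mount.

```shell
#!/bin/sh
# Sketch: save the requested diagnostics into one directory.
# collect_nfs_diag and the file names below are hypothetical,
# chosen for this example only.
collect_nfs_diag() {
    outdir="$1"

    # The directory must NOT live on the hung NFS mount, or this
    # script would block just like everything else that touches it.
    mkdir -p "$outdir" || return 1

    # Each command is allowed to fail (or be missing) so that the
    # remaining ones still run; any error text lands in the file.

    # Process list with one line per thread.
    ps axHl > "$outdir/ps_axHl.txt" 2>&1 || true

    # Kernel stacks for all threads of all processes.
    procstat -kk -a > "$outdir/procstat_kk_a.txt" 2>&1 || true

    # State of all sockets, including the NFS TCP connections.
    netstat -a > "$outdir/netstat_a.txt" 2>&1 || true

    return 0
}
```

For example, collect_nfs_diag /var/tmp/nfs-hang would leave three text files in /var/tmp/nfs-hang (a path picked only for illustration) that can be attached to a reply.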

rick

>
> What you show above is just a thread waiting for a lock on an NFS vnode.
> What we need to figure out is what thread is holding the lock on that vnode
> while waiting for something else.
>
> rick
>
> >
> > Thanks for any advice!
> >