Re: FreeBSD 13.2 NFS client mount hangs

From: Rick Macklem <rick.macklem_at_gmail.com>
Date: Sat, 30 Sep 2023 22:06:35 UTC
On Fri, Sep 29, 2023 at 5:50 PM J David <j.david.lists@gmail.com> wrote:
>
> I have noticed a new (to me) hang on FreeBSD NFS client machines
> running 13.2-RELEASE-p2.
>
> It's happened twice this week to Apache processes.  It's the root EUID
> process and it appears to happen while the process is starting up or
> reconfiguring.  I.e., while it's reading the configs.
>
> The configs are not on NFS storage.  But the vhost document roots are.
>
> The process ps looks like this:
>
>     0 19557 19548  3  25  5  25248 12036 nfstry   DN    -      0:12.85
> /usr/local/apache/2.4/bin/httpd -D FOREGROUND -f
> /usr/local/apache/2.4/conf/httpd.conf
>
> The procstat -kk looks like:
>
>   PID    TID COMM                TDNAME              KSTACK
> 19557 100341 httpd               -                   mi_switch+0xc2
> sleepq_timedwait+0x2f _sleep+0x1ce clnt_vc_call+0x866
> clnt_reconnect_call+0x626 newnfs_request+0xc36 nfscl_request+0x5a
> nfsrpc_getattr+0xbb nfs_close+0x489 vop_sigdefer+0x2b
> VOP_CLOSE_APV+0x1c vn_close1+0x16a vn_closefile+0x3d _fdrop+0x11
> closef+0x24b closefp_impl+0x69 amd64_syscall+0x10c
> fast_syscall_common+0xf8
This is just waiting for a reply for the Close RPC.

>
> The process slowly gains CPU time (a few hundredths per minute) but is
> immune to kill -9 so it doesn't seem to be coming out of the kernel at
> any point.
>
> I tried running procstat -kk every few seconds to see if I would get
> anything different to show what it's doing. Most are the same as
> above, but I also got this:
>
> 19557 100341 httpd               -                   mi_switch+0xc2
> sleepq_timedwait+0x2f _sleep+0x1ce nfs_catnap+0x47
> newnfs_request+0x14b3 nfscl_request+0x5a nfsrpc_getattr+0xbb
> nfs_close+0x489 vop_sigdefer+0x2b VOP_CLOSE_APV+0x1c vn_close1+0x16a
> vn_closefile+0x3d _fdrop+0x11 closef+0x24b closefp_impl+0x69
> amd64_syscall+0x10c fast_syscall_common+0xf8
This one is sleeping for a short time before retrying an RPC. Although I
cannot be 100% sure, it is probably one that received an NFS4ERR_DELAY
reply from the server.

Fairly recent versions of the Linux server hand out delegations.  Imho
delegations are pretty useless.  I have seen reports on the
linux-nfs@vger.kernel.org mailing list of Close and Delegation Recall
resulting in repeated NFS4ERR_DELAY replies.
--> I'd suggest you try disabling delegations.  I do not know how to do
     this on the Linux server, but not running the nfscbd(8) daemon on the
     client should stop them from being issued.  (No nfscbd(8) implies no
     callbacks, and no callbacks should imply no delegations being issued.
     If the Linux server still issues delegations when nfscbd(8) is not
     running (and was not running when the mount was done), it is broken.)
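
As a rough sketch of the check above (not a tested procedure), the following reports whether nfscbd(8) is running on the client. The function name nfscbd_running is made up for illustration, and the commands it prints assume the stock rc.d script and the nfscbd_enable rc.conf knob. Since the daemon must also not be running when the mount is done, the filesystem would need to be unmounted and remounted after stopping it.

```shell
#!/bin/sh
# Hypothetical helper: exits 0 if a process named exactly "nfscbd" exists.
nfscbd_running() {
    pgrep -x nfscbd >/dev/null 2>&1
}

if nfscbd_running; then
    echo "nfscbd is running; the server may issue delegations"
    # Only print the suggested commands; nothing is changed by this script.
    echo "stop it with:            service nfscbd stop"
    echo "keep it off at boot with: sysrc nfscbd_enable=NO"
else
    echo "nfscbd is not running"
fi
```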

The FreeBSD client currently does not accept NFS4ERR_DELAY for Close.  If the
Linux server is replying NFS4ERR_DELAY for Close, all bets are off.

>
> (This differs starting at the newnfs_request after nfscl_request+0x5a.)
>
> I started unmounting NFS filesystems until I hit one where umount
> hung.  An ls on that filesystem also hung. However, an ls of that
> filesystem from another client machine worked fine, so it does appear
> to be a client-side issue rather than a server problem.  umount -f
> also hung.  umount -N did unmount it very quickly and that caused all
> the hanging umounts and the
> httpd process to exit immediately.
Yes, "umount -N <mnt_path>" is the way (and the only way) to get rid of
hung NFS mounts.

>
> I didn't find anything good in the syslog or dmesg. The only thing
> related to nfs are a handful of "nfsv4 err=10068" that look like they
> were way back near when the system booted (about 5 days ago).
Hmm, interesting. 10068 is NFS4ERR_RETRY_UNCACHED_REP.
I have never seen (and do not recall anyone else reporting) this error
return.
- The RFC says it can be returned when a retry of the same RPC with the
  same session slot/seqid is received and the reply is not cached.
To be honest, I do not know how the FreeBSD NFSv4.1/4.2 client will
handle this.  I will have to look at the code to see if this can happen after
a new TCP connection is established and outstanding RPCs are retried.
--> If this can happen, the client code needs to be patched to retry the RPC
      with a different session slot, or the same session slot but an advanced
      seqid.
Basically, I suspect the FreeBSD client is broken for handling this case,
which I have never seen before.
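
For picking these errors out of the logs, here is a small sketch that names the "nfsv4 err=NNNNN" codes it finds. The function names are made up for illustration, the table only covers a few session-related codes from the NFSv4.1 RFC plus NFS4ERR_DELAY, and the match pattern assumes the message format quoted above.

```shell
#!/bin/sh
# Map a few NFSv4 status codes to their RFC names (deliberately incomplete).
nfs4_errname() {
    case "$1" in
        10008) echo "NFS4ERR_DELAY" ;;
        10052) echo "NFS4ERR_BADSESSION" ;;
        10063) echo "NFS4ERR_SEQ_MISORDERED" ;;
        10068) echo "NFS4ERR_RETRY_UNCACHED_REP" ;;
        *)     echo "unknown ($1)" ;;
    esac
}

# Read log text on stdin, extract each distinct "nfsv4 err=NNNNN" code,
# and print the code followed by its name.
nfs4_errs_in_log() {
    grep -o 'nfsv4 err=[0-9]*' | sed 's/.*=//' | sort -u |
    while read -r code; do
        echo "$code $(nfs4_errname "$code")"
    done
}
```

Typical use would be something like "dmesg | nfs4_errs_in_log" or "nfs4_errs_in_log < /var/log/messages".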

>
> The mount flags are:
>
> nfsv4,minorversion=2,oneopenown,tcp,resvport,nconnect=1,hard,cto,sec=sys,acdirmin=3,acdirmax=60,acregmin=5,acregmax=60,nametimeo=60,negnametimeo=60,rsize=65536,wsize=65536,readdirsize=65536,readahead=1,wcommitsize=16777216,timeout=120,retrans=2147483647
>
> Is there any other information I could provide or try to catch next
> time that would help debug this?
I'd suggest you first check network connectivity. Both NFS client and NFS
server should be able to ping each other.
If that is the case, then I'd suggest you capture packets. On the FreeBSD
end:
# tcpdump -s 0 -w out.pcap host <nfs-server-name>
Let this run for a while and then pull out.pcap into wireshark and see what
traffic is going between the NFS client and server.
(Unlike tcpdump, wireshark does know how to decode NFS properly.)
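
If the capture gets too large on a busy client, one option is to restrict it to the NFS port. This is only a sketch: the helper name and the example hostname are made up, and it assumes the server is reached on the standard TCP port 2049 (the mount above uses tcp). It prints the command rather than running it, so it can be reviewed first.

```shell
#!/bin/sh
# Hypothetical helper: print a tcpdump command that captures full packets
# for NFS traffic (TCP port 2049) to or from the given server.
# $1 = server name or address, $2 = output file (defaults to out.pcap)
nfs_capture_cmd() {
    server=$1
    outfile=${2:-out.pcap}
    echo "tcpdump -s 0 -w $outfile host $server and tcp port 2049"
}

# Example with a placeholder hostname.
nfs_capture_cmd nfs-server.example.org
```

In wireshark, the "nfs" display filter then narrows the view to the decoded NFS RPCs.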

rick

>
> Thanks!
>