Re: FreeBSD 12.3/13.1 NFS client hang

From: Rick Macklem <rmacklem_at_uoguelph.ca>
Date: Mon, 30 May 2022 14:51:32 UTC
Andreas Kempe <kempe@lysator.liu.se> wrote:
[lots of stuff snipped]
>
> I guess this means you think the error is at a protocol handling level
> and the issues aren't caused by locking issues in the code? I was
> wondering whether the hangs that were not slot related could possibly
> be due to some race condition when locking since it happens so
> seemingly randomly.
Anything is possible, but the locking is pretty straightforward and no one
has found a bug in it for ages. (You can certainly run a kernel with
WITNESS, DEBUG_VFS_LOCKS, etc., but there will be a performance
penalty.)

My experience is that most hangs (other than the business with sessions
for soft or intr mounts) are caused by network fabric issues.
A couple of examples:
As I noted, TSO can fail for some specific segment. Then each retransmit of
the segment fails again, and again...

In 13.0, there was a bug in TCP that
caused the receive socket upcall to not happen under certain circumstances
and that could cause a hang. The bug is not in 12.n or 13.1 and the hang
was normally observed when a Linux client had a FreeBSD server mounted,
not vice versa.

After a network partitioning healed, Linux and FreeBSD would get into
what I might call an "RST storm". Every time one end would try to
establish a new TCP connection, the other end would RST it.
(Sorry, but it has been a while and I cannot remember exactly how to cause it
 or if it even got resolved?)

> > If you can reproduce it for a hard mount, you could capture packets via:
> > # tcpdump -s 0 -w out.pcap host <nfs-server>
> > Tcpdump is useless at decoding NFS, but wireshark can decode the out.pcap
> > quite nicely. I can look at the out.pcap or, if you look at it yourself, you can start by looking for
> > NFSv4 specific errors.
> > --> The client will usually log if it gets one of these. It will be an error # > 10000.
> >
> 
> With us not knowing the NFSv4 protocol, we were holding off on even
> trying to get Wireshark dumps since we wouldn't know what to look for
> and would have to learn the protocol first. You having a look would be
> greatly appreciated! As I wrote above, I'll try to get dumps if we can
> find a reproducer.
I certainly don't mind looking, but you might be surprised at how good
wireshark is at this stuff.
It not only decodes the RPCs for you, it flags anything that looks "sketchy"
in yellow and anything obviously broken in red.
It was wireshark that spotted and flagged the RSTs I mentioned above.
Beyond that, you just try and get to the place where things broke (a hang
might be at the end of the capture, for example) and then work backwards.
It is true that you need to know the protocol to spot things other than
server error returns that are not going as planned.
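
For the server error returns, a rough sketch of scanning saved log text for
NFSv4 error numbers (the ones > 10000) might look like the following. The
function name and the pattern are made up for illustration, the table lists only
a handful of codes from the NFSv4 RFCs, and it assumes the error shows up as a
bare decimal number in the line, which may not match what your client logs.

```python
import re

# A handful of NFSv4 status codes from the protocol RFCs; not a complete list.
NFS4_ERRORS = {
    10008: "NFS4ERR_DELAY",
    10011: "NFS4ERR_EXPIRED",
    10013: "NFS4ERR_GRACE",
    10022: "NFS4ERR_STALE_CLIENTID",
    10023: "NFS4ERR_STALE_STATEID",
    10025: "NFS4ERR_BAD_STATEID",
    10052: "NFS4ERR_BADSESSION",
    10053: "NFS4ERR_BADSLOT",
    10063: "NFS4ERR_SEQ_MISORDERED",
}

# Assumption: the error appears as a bare decimal number in the 100xx range.
_CODE = re.compile(r"\b(100\d\d)\b")


def find_nfs4_errors(lines):
    """Return (line number, code, name) for every number > 10000 found."""
    hits = []
    for lineno, line in enumerate(lines, start=1):
        for match in _CODE.finditer(line):
            code = int(match.group(1))
            if code > 10000:
                hits.append((lineno, code, NFS4_ERRORS.get(code, "unknown")))
    return hits
```

Anything it reports is only a hint about where in the capture to start looking.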

The big challenge is getting a packet capture that is less than petabytes
in size. Although starting a packet capture after a hang has occurred can
be useful, it is usually too late, since the breakage has already happened.
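
One way to keep the size bounded is to let tcpdump rotate through a fixed set
of files, using its -C (file size, in millions of bytes) and -W (number of
files) options, and stop it soon after a hang shows up. A minimal sketch of
building that command line follows; the helper names, the default sizes and the
"out.pcap" prefix are only assumptions for illustration, so pick values that
cover the time it takes you to notice a hang.

```python
import subprocess


def ring_capture_argv(server, file_mb=100, file_count=20, prefix="out.pcap"):
    """Build a tcpdump command line that writes a rotating set of capture files."""
    return [
        "tcpdump",
        "-s", "0",               # capture whole packets
        "-C", str(file_mb),      # start a new file after about file_mb million bytes
        "-W", str(file_count),   # keep at most file_count files, overwriting the oldest
        "-w", prefix,            # tcpdump appends a file number to this name
        "host", server,
    ]


def run_capture(server):
    """Run the capture until interrupted (needs root privileges)."""
    return subprocess.run(ring_capture_argv(server), check=False)
```

With these defaults the capture never grows past roughly 2 Gbytes on disk, at
the cost of losing the oldest packets once the ring wraps around.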

rick

> Good luck with it, rick
> > rick
> >

Cordially,
Andreas Kempe