"nfs server not responding, timed out" with Linux client and 13.0-RELEASE server
Date: Tue, 01 Jun 2021 21:20:05 UTC
I have several Linux NFS clients (Debian, with various 4.9 kernels) mounting NFS 4.1 from both Solaris and FreeBSD (mostly 12.2-RELEASE) servers. I frequently see kernel messages on the clients saying "nfs: server <HOSTNAME> not responding, timed out" for both the FreeBSD and Solaris servers. The mount usually recovers after a few seconds. However, on the sole 13.0-RELEASE server, the hangs last for minutes to hours.

tcpdump/wireshark shows the NFS client sending a "V4 Call SEQUENCE" packet every 5-10 seconds, all with the same sessionid. The server responds with a TCP ACK, but sends no NFS-level reply. I've seen this with both NFS 4.1 and 4.2.

NFS delegations are disabled, and nfsdumpstate shows that the client has callbacks enabled, with 2-4 opens, openowners, locks, and lockowners.

Can anybody suggest how to fix or further debug this problem?

-Alan
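
One way to put numbers on "a few seconds" versus "minutes to hours" is to group the client's "not responding" kernel messages into episodes per server. The following is only a minimal Python sketch under stated assumptions: each log line starts with an ISO-8601 timestamp (as some syslog configurations produce), and messages less than max_gap_s apart belong to the same hang. The function name hang_episodes, the max_gap_s threshold, and the hostnames in the usage note are hypothetical and do not come from the setup described above.

```python
import re
from datetime import datetime

# Assumed line shape: "<ISO-8601 timestamp> ... nfs: server <host> not responding, ..."
LINE_RE = re.compile(r"^(?P<ts>\S+)\s.*nfs: server (?P<host>\S+) not responding")


def hang_episodes(lines, max_gap_s=120.0):
    """Group 'not responding' messages into per-server episodes.

    Returns {host: [(first_timestamp, duration_seconds), ...]}.
    Messages for the same host that are at most max_gap_s apart are
    treated as one episode (an assumption, not a kernel guarantee).
    """
    spans_by_host = {}  # host -> list of [first_ts, last_ts]
    for line in lines:
        m = LINE_RE.match(line)
        if m is None:
            continue  # unrelated log line
        ts = datetime.fromisoformat(m.group("ts"))
        spans = spans_by_host.setdefault(m.group("host"), [])
        if spans and (ts - spans[-1][1]).total_seconds() <= max_gap_s:
            spans[-1][1] = ts        # extend the current episode
        else:
            spans.append([ts, ts])   # start a new episode
    return {
        host: [(first, (last - first).total_seconds()) for first, last in spans]
        for host, spans in spans_by_host.items()
    }
```

Feeding it the client's kernel log (for example, open("/var/log/kern.log") where that file exists and uses ISO timestamps) gives one list of episodes per server, so the 13.0-RELEASE server can be compared against the others. An episode with duration 0.0 means a single isolated message.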