Re: nfs stalls client: nfsrv_cache_session: no session
Date: Sat, 16 Jul 2022 13:43:11 UTC
Peter <pmc@citylink.dinoex.sub.org> wrote:
> Hija,
> I have a problem with NFSv4:
>
> The configuration:
> Server Rel. 13.1-RC2
> nfs_server_enable="YES"
> nfs_server_flags="-u -t --minthreads 2 --maxthreads 20 -h ..."
Allowing it to go down to 2 threads is very low. I've never even tried
to run a server with fewer than 4 threads. Since kernel threads don't
generate much overhead, I'd suggest replacing the minthreads/maxthreads
with "-n 32" for a very small server.
(I didn't write the code that allows the number of threads to vary and
never use that either.)

> mountd_enable="YES"
> mountd_flags="-S -p 803 -h ..."
> rpc_lockd_enable="YES"
> rpc_lockd_flags="-h ..."
> rpc_statd_enable="YES"
> rpc_statd_flags="-h ..."
> rpcbind_enable="YES"
> rpcbind_flags="-h ..."
> nfsv4_server_enable="YES"
> sysctl vfs.nfs.enable_uidtostring=1
> sysctl vfs.nfsd.enable_stringtouid=1
>
> Client bhyve Rel. 13.1-RELEASE on the same system
> nfs_client_enable="YES"
> nfs_access_cache="600"
> nfs_bufpackets="32"
> nfscbd_enable="YES"
>
> Mount-options: nfsv4,readahead=1,rw,async
I would expect the behaviour you are seeing for "intr" and/or "soft"
mounts, but since you are not using those, I don't know how you broke
the session. (10052 is NFSERR_BADSESSION.)
You might want to do "nfsstat -m" on the client to see what options
were actually negotiated for the mount and then check that neither
"soft" nor "intr" is there.

I suspect that the recovery thread in the client (called "nfscl") is
somehow wedged and cannot do the recovery from the bad session, as
well. A "ps axHl" on the client would be useful to see what the
processes/threads are up to on the client when it is hung.

If increasing the number of nfsd threads in the server doesn't resolve
the problem, I'd guess it is some network weirdness caused by how the
bhyve instance is networked to its host. (I always use bridging for
bhyve instances and do NFS mounts, but I don't work those mounts hard.)
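
The suggestions above could be sketched roughly as follows. This is
only an illustration, not a tested procedure: the helper name
has_soft_or_intr is hypothetical, it assumes that the option list
reported for a mount is a single comma-separated line, and the
"-h ..." address stays elided as in the original configuration.

```shell
#!/bin/sh
# Server side, /etc/rc.conf: a fixed thread count instead of the
# min/max pair (the "-h ..." part is elided in the original post):
#   nfs_server_flags="-u -t -n 32 -h ..."

# Client side: hypothetical helper. Returns 0 if a comma-separated
# mount option list contains "soft" or "intr" as a whole option,
# and 1 otherwise ("nointr" or "software=1" do not match).
has_soft_or_intr() {
    case ",$1," in
        *,soft,*|*,intr,*) return 0 ;;
        *) return 1 ;;
    esac
}

# Possible use on the client, assuming one option list per line:
#   nfsstat -m | while read -r line; do
#       has_soft_or_intr "$line" && echo "soft/intr found: $line"
#   done
# Thread states while the mount is hung:
#   ps axHl
```

Wrapping the list in commas before matching keeps the check to whole
options, so a substring inside another option name does not count.
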
Btw, "umount -N <mnt_path>" on the client will normally get rid of a
hung mount, although it can take a couple of minutes to complete.

rick

> Access to the share suddenly stalled. Server reports this in
> messages, every second:
> nfsrv_cache_session: no session IPaddr=192.168...
>
> Restarting nfsd and mountd didn't help, only now the client started
> to also report in messages, every second:
> nfs server 192.168...:/var/sysup/mnt/tmp.6.56160: is alive again
>
> Mounting the same share anew to a different place works fine.
>
> The network babble is this, every second:
> NFS request xid 1678997001 212 getattr fh 0,6/2
> NFS reply xid 1678997001 reply ok 52 getattr ERROR: unk 10052
>
> Forensics: I tried to build openoffice on that share, a couple of
> times. So there was a bit of traffic, and some things may have
> overflown.
>
> There seems to be no way to recover, only crashing the client.
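
For reading the trace quoted above: "unk 10052" is an NFSv4 status
value that the packet decoder did not translate. A minimal sketch of a
lookup for a few session-related codes follows. The function name
nfs4_status_name is hypothetical, and the numeric values are taken from
the NFSv4.1 specification, where the code referred to in this thread as
NFSERR_BADSESSION is listed as NFS4ERR_BADSESSION.

```shell
#!/bin/sh
# Hypothetical helper: map a few NFSv4.1 session-related status
# numbers to their names as given in the NFSv4.1 specification.
# Anything not in this short table is reported as unknown.
nfs4_status_name() {
    case "$1" in
        10052) echo "NFS4ERR_BADSESSION" ;;
        10053) echo "NFS4ERR_BADSLOT" ;;
        10063) echo "NFS4ERR_SEQ_MISORDERED" ;;
        10078) echo "NFS4ERR_DEADSESSION" ;;
        *)     echo "unknown ($1)" ;;
    esac
}

# Example: decode the value seen in the getattr reply.
nfs4_status_name 10052
```
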