[Bug 275905] nfs client: mount becomes unresponsive

From: <bugzilla-noreply_at_freebsd.org>
Date: Sun, 24 Dec 2023 03:30:54 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=275905

            Bug ID: 275905
           Summary: nfs client: mount becomes unresponsive
           Product: Base System
           Version: 14.0-RELEASE
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Only Me
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: lexi.freebsd@le-fay.org

FreeBSD ilythia.eden.le-fay.org 14.0-RELEASE-p3 FreeBSD 14.0-RELEASE-p3 #0: Mon
Dec 11 04:56:01 UTC 2023    
root@amd64-builder.daemonology.net:/usr/obj/usr/src/amd64.amd64/sys/GENERIC
amd64

This system has an NFSv4.2 mount from a FreeBSD 14.0 server using sec=krb5p.
After a few hours of activity (mostly low-volume random reads at ~5-10 Mbps
throughput), the NFS mount hangs and cannot be accessed; 'df' and 'nfsstat
-m' also hang.  There is no NFS network traffic, and the logs on neither
system indicate a problem.  The Kerberos ticket has not expired, and renewing
the ticket anyway made no difference.

According to kgdb, the nfscl renew thread seems to be stuck in
nfsv4_sequencelookup:

(kgdb) where
#0  sched_switch (td=td@entry=0xfffffe017c43a740, flags=flags@entry=259) at
/usr/src/sys/kern/sched_ule.c:2297
#1  0xffffffff80b5028b in mi_switch (flags=flags@entry=259) at
/usr/src/sys/kern/kern_synch.c:548
#2  0xffffffff80ba001b in sleepq_switch (wchan=wchan@entry=0xfffff800754bd9c0,
pri=pri@entry=68) at /usr/src/sys/kern/subr_sleepqueue.c:607
#3  0xffffffff80ba05ef in sleepq_timedwait
(wchan=wchan@entry=0xfffff800754bd9c0, pri=68) at
/usr/src/sys/kern/subr_sleepqueue.c:689
#4  0xffffffff80b4f9e0 in _sleep (ident=ident@entry=0xfffff800754bd9c0,
lock=lock@entry=0xfffff800754bd810, priority=priority@entry=68,
wmesg=0xffffffff81195590 "nfsclseq", sbt=4294967000, pr=pr@entry=0, flags=256)
at /usr/src/sys/kern/kern_synch.c:219
#5  0xffffffff809ed270 in nfsv4_sequencelookup
(nmp=nmp@entry=0xfffff802f5506000, sep=sep@entry=0xfffff800754bd810,
slotposp=slotposp@entry=0xfffffe017b087b38,
maxslotp=maxslotp@entry=0xfffffe017b087b34,
slotseqp=slotseqp@entry=0xfffffe017b087b3c,
    sessionid=sessionid@entry=0xfffffe017b087b40
"k\214\020\201\377\377\377\377\377\242\352\200\377\377\377\377\n\340h
D\257\353\v ", fnd_init=false) at /usr/src/sys/fs/nfs/nfs_commonsubs.c:4989
#6  0xffffffff809e1c39 in nfsv4_setsequence (nmp=0xfffff802f5506000,
nd=0xfffffe017b087c38, sep=0xfffff800754bd810, dont_replycache=0,
cred=0xfffff800087f3c00) at /usr/src/sys/fs/nfs/nfs_commonsubs.c:4892
#7  0xffffffff809e12f1 in nfscl_reqstart (nd=0xfffffe017b087c38, procnum=32,
nmp=0xfffff802f5506000, nfhp=0x0, fhlen=0, opcntpp=<optimized out>,
sep=<optimized out>, vers=0, minorvers=0, cred=0xfffff800087f3c00) at
/usr/src/sys/fs/nfs/nfs_commonsubs.c:433
#8  0xffffffff80a11e77 in nfsrpc_renew (clp=<optimized out>, dsp=dsp@entry=0x0,
cred=cred@entry=0xfffff800087f3c00, p=p@entry=0xfffffe017c43a740) at
/usr/src/sys/fs/nfsclient/nfs_clrpcops.c:4874
#9  0xffffffff809f7ae5 in nfscl_renewthread (clp=clp@entry=0xfffffe0180c75000,
p=0xfffffe017c43a740) at /usr/src/sys/fs/nfsclient/nfs_clstate.c:2769
#10 0xffffffff80a2bde4 in start_nfscl (arg=<unavailable>,
arg@entry=0xfffffe0180c75000) at /usr/src/sys/fs/nfsclient/nfs_clport.c:782
#11 0xffffffff80afdb7f in fork_exit (callout=0xffffffff80a2bdd0 <start_nfscl>,
arg=0xfffffe0180c75000, frame=0xfffffe017b087f40) at
/usr/src/sys/kern/kern_fork.c:1160
#12 <signal handler called>
#13 0x00000328911e38ca in ?? ()
Backtrace stopped: Cannot access memory at address 0x3288e984798
(kgdb) frame 5
#5  0xffffffff809ed270 in nfsv4_sequencelookup
(nmp=nmp@entry=0xfffff802f5506000, sep=sep@entry=0xfffff800754bd810,
slotposp=slotposp@entry=0xfffffe017b087b38,
maxslotp=maxslotp@entry=0xfffffe017b087b34,
slotseqp=slotseqp@entry=0xfffffe017b087b3c,
    sessionid=sessionid@entry=0xfffffe017b087b40
"k\214\020\201\377\377\377\377\377\242\352\200\377\377\377\377\n\340h
D\257\353\v ", fnd_init=false) at /usr/src/sys/fs/nfs/nfs_commonsubs.c:4989
warning: Source file is more recent than executable.
4989                                    mtx_sleep(&sep->nfsess_slots, &sep->nfsess_mtx,
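
For readers unfamiliar with NFSv4.1+ sessions: frame 5 sleeps on
&sep->nfsess_slots with wmesg "nfsclseq", which suggests the renew thread is
waiting for a free session slot.  The following is a simplified user-space
sketch of that kind of slot-table lookup, not the FreeBSD source.  The names
struct session, slot_acquire and slot_release are hypothetical, and the real
kernel code holds a mutex and sleeps instead of returning -1.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a session slot table; not the kernel's structure. */
struct session {
	uint64_t slots;   /* bit i set => slot i has a request in flight */
	int      nslots;  /* number of usable slots, at most 64 */
};

/*
 * Claim the lowest free slot and return its index.
 * Return -1 if every slot is busy; this is the point where a kernel
 * thread would sleep until another request releases its slot.
 */
static int
slot_acquire(struct session *s)
{
	for (int i = 0; i < s->nslots; i++) {
		uint64_t bit = (uint64_t)1 << i;

		if ((s->slots & bit) == 0) {
			s->slots |= bit;
			return (i);
		}
	}
	return (-1);
}

/* Mark slot i free again; a kernel would also wake any waiters here. */
static void
slot_release(struct session *s, int i)
{
	assert(i >= 0 && i < s->nslots);
	s->slots &= ~((uint64_t)1 << i);
}
```

In this model, if every slot is marked busy and nothing ever calls
slot_release(), each later slot_acquire() fails indefinitely.  That would be
consistent with the hang described above, but whether that is what happens
here is only a guess.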
