Re: FreeBSD panics possibly caused by nfs clients
- In reply to: Zaphod Beeblebrox : "Re: FreeBSD panics possibly caused by nfs clients"
Date: Fri, 09 Feb 2024 22:20:28 UTC
On Fri, Feb 9, 2024 at 2:04 PM Zaphod Beeblebrox <zbeeble@gmail.com> wrote:
>
> Just in case it's relevant, I'm carrying around this patch on my fairly busy little RISC-V machine.
>
> diff --git a/sys/fs/nfsclient/nfs_clvnops.c b/sys/fs/nfsclient/nfs_clvnops.c
> index 0b8c587a542c..85c0ebd7a10f 100644
> --- a/sys/fs/nfsclient/nfs_clvnops.c
> +++ b/sys/fs/nfsclient/nfs_clvnops.c
> @@ -2459,6 +2459,16 @@ nfs_readdir(struct vop_readdir_args *ap)
>                 return (EINVAL);
>         uio->uio_resid -= left;
>
> +       /*
> +        * For readdirplus, if starting to read the directory,
> +        * purge the name cache, since it will be reloaded by
> +        * this directory read.
> +        * This removes potentially stale name cache entries.
> +        */
> +       if (uio->uio_offset == 0 &&
> +           (VFSTONFS(vp->v_mount)->nm_flag & NFSMNT_RDIRPLUS) != 0)
> +               cache_purge(vp);
> +
>         /*
>          * Call ncl_bioread() to do the real work.
>          */
>
> ... without it, I can panic.
This is not of interest to Matthew, since he is using Linux clients against a FreeBSD server.
However, it is of interest to me. This is the first time I've seen this (unless I just forgot;-) and since readdirplus is not a default, I suspect few test/use it.
I will take a look at this, since it sounds reasonable.

Thanks for posting it, rick

>
>
> On Fri, Feb 9, 2024 at 4:18 PM Mark Johnston <markj@freebsd.org> wrote:
>>
>> On Fri, Feb 09, 2024 at 06:23:08PM +0000, Matthew L. Dailey wrote:
>> > I had my first kernel panic with a KASAN kernel after only 01:27. This
>> > first panic was a "double fault," which isn't anything we've seen
>> > previously - usually we've seen trap 9 or trap 12, but sometimes others.
>> > Based on the backtrace, it definitely looks like KASAN caught something,
>> > but I don't have the expertise to know if this points to anything
>> > specific. From the backtrace, it looks like this might have originated
>> > in ipfw code.
>>
>> A double fault is rather unexpected. I presume you're running
>> releng/14.0? Is it at all possible to test with FreeBSD-CURRENT?
>>
>> Did you add INVARIANTS etc. to the kernel configuration used here, or
>> just KASAN?
>>
>> > Please let me know what other info I can provide or what I can do to dig
>> > deeper.
>>
>> If you could repeat the test several times, I'd be interested in seeing
>> if you always get the same result. If you're willing to share the
>> vmcore (or several), I'd be willing to take a look at it.
>>
>> > Thanks!!
>> >
>> > Panic message:
>> > [5674] Fatal double fault
>> > [5674] rip 0xffffffff812f6e32 rsp 0xfffffe014677afe0 rbp 0xfffffe014677b430
>> > [5674] rax 0x1fffffc028cef620 rdx 0xf2f2f2f8f2f2f2f2 rbx 0x1
>> > [5674] rcx 0xdffff7c000000000 rsi 0xfffffe004086a4a0 rdi 0xf8f8f8f8f2f2f2f8
>> > [5674] r8 0xf8f8f8f8f8f8f8f8 r9 0x162a r10 0x835003002d3a64e1
>> > [5674] r11 0 r12 0xfffff78028cef620 r13 0xfffffe004086a440
>> > [5674] r14 0xfffffe01488c0560 r15 0x26f40 rflags 0x10006
>> > [5674] cs 0x20 ss 0x28 ds 0x3b es 0x3b fs 0x13 gs 0x1b
>> > [5674] fsbase 0x95d1d81a130 gsbase 0xffffffff84a14000 kgsbase 0
>> > [5674] cpuid = 4; apic id = 08
>> > [5674] panic: double fault
>> > [5674] cpuid = 4
>> > [5674] time = 1707498420
>> > [5674] KDB: stack backtrace:
>> > [5674] Uptime: 1h34m34s
>> >
>> > Backtrace:
>> > #0 __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:57
>> > #1 doadump (textdump=<optimized out>) at
>> > /usr/src/sys/kern/kern_shutdown.c:405
>> > #2 0xffffffff8128b7dc in kern_reboot (howto=howto@entry=260)
>> >     at /usr/src/sys/kern/kern_shutdown.c:526
>> > #3 0xffffffff8128c000 in vpanic (
>> >     fmt=fmt@entry=0xffffffff82589a00 <str> "double fault",
>> >     ap=ap@entry=0xfffffe0040866de0) at
>> > /usr/src/sys/kern/kern_shutdown.c:970
>> > #4 0xffffffff8128bd75 in panic (fmt=0xffffffff82589a00 <str> "double
>> > fault")
>> >     at /usr/src/sys/kern/kern_shutdown.c:894
>> > #5 0xffffffff81c4b335 in dblfault_handler (frame=<optimized out>)
>> >     at /usr/src/sys/amd64/amd64/trap.c:1012
>> > #6 <signal handler called>
>> > #7 0xffffffff812f6e32 in sched_clock (td=td@entry=0xfffffe01488c0560,
>> >     cnt=cnt@entry=1) at /usr/src/sys/kern/sched_ule.c:2601
>> > #8 0xffffffff8119e2a7 in statclock (cnt=cnt@entry=1,
>> >     usermode=usermode@entry=0) at /usr/src/sys/kern/kern_clock.c:760
>> > #9 0xffffffff8119fb67 in handleevents (now=now@entry=24371855699832,
>> >     fake=fake@entry=0) at /usr/src/sys/kern/kern_clocksource.c:195
>> > #10 0xffffffff811a10cc in timercb (et=<optimized out>, arg=<optimized out>)
>> >     at /usr/src/sys/kern/kern_clocksource.c:353
>> > #11 0xffffffff81dcd280 in lapic_handle_timer (frame=0xfffffe014677b750)
>> >     at /usr/src/sys/x86/x86/local_apic.c:1343
>> > #12 <signal handler called>
>> > #13 __asan_load8_noabort (addr=18446741880219689232)
>> >     at /usr/src/sys/kern/subr_asan.c:1113
>> > #14 0xffffffff851488b8 in ?? () from /boot/thayer/ipfw.ko
>> > #15 0xfffffe0100000000 in ?? ()
>> > #16 0xffffffff8134dcd5 in pcpu_find (cpuid=1238425856)
>> >     at /usr/src/sys/kern/subr_pcpu.c:286
>> > #17 0xffffffff85151f6f in ?? () from /boot/thayer/ipfw.ko
>> > #18 0x0000000000000000 in ?? ()
>>