Re: NFSv4 crash of CURRENT

From: Rick Macklem <rick.macklem_at_gmail.com>
Date: Mon, 15 Jan 2024 20:29:14 UTC
On Mon, Jan 15, 2024 at 11:03 AM FreeBSD User <freebsd@walstatt-de.de> wrote:
>
> On Mon, 15 Jan 2024 11:53:31 +0100
> Peter Blok <pblok@bsd4all.org> wrote:
>
> > Hi,
> >
> > Forgot to mention I’m on 13-stable. The fix that is causing the crash with automounted NFS
> > is:
> >
> > commit cc5cda1dbaa907ce52074f47264cc45b5a7d6c8b
> > Author: Konstantin Belousov <kib@FreeBSD.org>
> > Date:   Tue Jan 2 00:22:44 2024 +0200
> >
> >     nfsclient: limit situations when we do unlocked read-ahead by nfsiod
> >
> >     (cherry picked from commit 70dc6b2ce314a0f32755005ad02802fca7ed186e)
> >
> > When I remove the fix, the problem is gone. Add it back and the crash happens.
> >
> > Peter
> >
> > > On 15 Jan 2024, at 09:31, Peter Blok <pblok@bsd4all.org> wrote:
> > >
> > > Hi,
> > >
> > > I have a crash on an NFS client running today's stable
> > > (4c4633fdffbe8e4b6d328c2bc9bb3edacc9ab50a). It is also autofs related. Maybe it is the
> > > same problem.
> > >
> > > I have ports automounted on /am/ports. When I do cd /am/ports/sys and press Tab to
> > > autocomplete, it crashes with the stack trace below. If I mount ports directly on
> > > /usr/ports and do the same, everything works. I am using NFSv3.
> > >
> > > Peter
> > >
> > >
> > >
> > >
> > > Fatal trap 12: page fault while in kernel mode
> > > cpuid = 2; apic id = 04
> > > fault virtual address       = 0x89
> > > fault code          = supervisor read data, page not present
> > > instruction pointer = 0x20:0xffffffff809645d4
> > > stack pointer               = 0x28:0xfffffe00acadb830
> > > frame pointer               = 0x28:0xfffffe00acadb830
> > > code segment                = base 0x0, limit 0xfffff, type 0x1b
> > >                     = DPL 0, pres 1, long 1, def32 0, gran 1
> > > processor eflags    = interrupt enabled, resume, IOPL = 0
> > > current process             = 6869 (csh)
> > > trap number         = 12
> > > panic: page fault
> > > cpuid = 2
> > > time = 1705306940
> > > KDB: stack backtrace:
> > > #0 0xffffffff806232f5 at kdb_backtrace+0x65
> > > #1 0xffffffff805d7a02 at vpanic+0x152
> > > #2 0xffffffff805d78a3 at panic+0x43
> > > #3 0xffffffff809d58ad at trap_fatal+0x38d
> > > #4 0xffffffff809d58ff at trap_pfault+0x4f
> > > #5 0xffffffff809af048 at calltrap+0x8
> > > #6 0xffffffff804c7a7e at ncl_bioread+0xb7e
> > > #7 0xffffffff804b9d90 at nfs_readdir+0x1f0
> > > #8 0xffffffff8069c61a at vop_sigdefer+0x2a
> > > #9 0xffffffff809f8ae0 at VOP_READDIR_APV+0x20
> > > #10 0xffffffff81ce75de at autofs_readdir+0x2ce
> > > #11 0xffffffff809f8ae0 at VOP_READDIR_APV+0x20
> > > #12 0xffffffff806c3002 at kern_getdirentries+0x222
> > > #13 0xffffffff806c33a9 at sys_getdirentries+0x29
> > > #14 0xffffffff809d6180 at amd64_syscall+0x110
> > > #15 0xffffffff809af95b at fast_syscall_common+0xf8
> > >
> > >
> > >
> > >> On 15 Jan 2024, at 06:46, FreeBSD User <freebsd@walstatt-de.de> wrote:
> > >>
> > >> On Sun, 14 Jan 2024 20:34:12 -0800
> > >> Cy Schubert <Cy.Schubert@cschubert.com> wrote:
> > >>
> > >>> In message <CAM5tNy5aat8vUn2fsX9jV=D9yGZdnO20Q0Ea7qtszx+zSES2bw@mail.gmail.com>,
> > >>> Rick Macklem writes:
> > >>>> On Sat, Jan 13, 2024 at 12:39 PM Ronald Klop <ronald-lists@klop.ws> wrote:
> > >>>>>
> > >>>>>
> > >>>>> From: FreeBSD User <freebsd@walstatt-de.de>
> > >>>>> Date: 13 January 2024 19:34
> > >>>>> To: FreeBSD CURRENT <freebsd-current@freebsd.org>
> > >>>>> Subject: NFSv4 crash of CURRENT
> > >>>>>
> > >>>>> Hello,
> > >>>>>
> > >>>>> I am running a CURRENT client (FreeBSD 15.0-CURRENT #4 main-n267556-69748e62e82a:
> > >>>>> Sat Jan 13 18:08:32 CET 2024 amd64). One NFSv4 server is the same OS revision as
> > >>>>> the mentioned client, the other is FreeBSD 13.2-RELEASE-p8. Both offer NFSv4
> > >>>>> filesystems, non-kerberized.
> > >>>>>
> > >>>>> I can crash the client reproducibly by accessing either NFSv4 FS (a simple ls -la).
> > >>>>> The NFSv4 FS is backed by ZFS (if this matters). I do not have physical access to
> > >>>>> the client host; luckily the box recovers.
> > >>>> Did you rebuild both the nfscommon and nfscl modules from the same sources?
> > >>>> I did a commit to main that changes the interface between these two modules
> > >>>> and did bump the __FreeBSD_version to 1500010, which should cause both to be
> > >>>> rebuilt. (If you have "options NFSCL" in your kernel config, both should have
> > >>>> been rebuilt as a part of the kernel build.)
> > >>>>
> > >>>
> > >>> Is anyone by chance seeing autofs in the backtrace too?
> > >>>
> > >>>
> > >>
> > >> Hello Cy Schubert,
> > >>
> > >> I forgot to mention that those crashes occur with autofs mounted filesystems. Good
> > >> question, by the way, I will check whether crashes also happen when mounting the
> > >> traditional way.
> > >>
> > >> Kind regards,
> > >>
> > >> oh
> > >>
> > >> --
> > >> O. Hartmann
> > >
> >
>
> good catch!
Don't thank me, thank Kostik. He's already committed the patch.
Btw, I didn't look at fixing this because I knew Kostik would fix it
before I had it figured out;-)

Thanks everyone for reporting it, rick

>
> --
> O. Hartmann