Re: NFSv4 crash of CURRENT
- Reply: Peter Blok : "Re: NFSv4 crash of CURRENT"
- In reply to: Rick Macklem : "Re: NFSv4 crash of CURRENT"
Date: Mon, 15 Jan 2024 15:13:49 UTC
I can give it a shot on one of my clients.

> On 15 Jan 2024, at 16:04, Rick Macklem <rick.macklem@gmail.com> wrote:
>
> On Mon, Jan 15, 2024 at 2:53 AM Peter Blok <pblok@bsd4all.org> wrote:
>>
>> Hi,
>>
>> Forgot to mention I’m on 13-stable. The fix that is causing the crash with automounted NFS is:
>>
>> commit cc5cda1dbaa907ce52074f47264cc45b5a7d6c8b
>> Author: Konstantin Belousov <kib@FreeBSD.org>
>> Date: Tue Jan 2 00:22:44 2024 +0200
>>
>> nfsclient: limit situations when we do unlocked read-ahead by nfsiod
>>
>> (cherry picked from commit 70dc6b2ce314a0f32755005ad02802fca7ed186e)
>>
>> When I remove the fix, the problem is gone. Add it back and the crash happens.
> Kostik has already come up with a probable fix. If you want it right
> away, here it is,
> but he'll probably commit it soon anyhow:
> diff --git a/sys/fs/nfsclient/nfs_clbio.c b/sys/fs/nfsclient/nfs_clbio.c
> index c027d7d7c3fd..1cf45bb0c924 100644
> --- a/sys/fs/nfsclient/nfs_clbio.c
> +++ b/sys/fs/nfsclient/nfs_clbio.c
> @@ -414,6 +414,18 @@ nfs_bioread_check_cons(struct vnode *vp, struct thread *td, struct ucred *cred)
>  	return (error);
>  }
>
> +static bool
> +ncl_bioread_dora(struct vnode *vp)
> +{
> +	vm_object_t obj;
> +
> +	obj = vp->v_object;
> +	if (obj == NULL)
> +		return (true);
> +	return (!vm_object_mightbedirty(vp->v_object) &&
> +	    vp->v_object->un_pager.vnp.writemappings == 0);
> +}
> +
>  /*
>   * Vnode op for read using bio
>   */
> @@ -486,9 +498,7 @@ ncl_bioread(struct vnode *vp, struct uio *uio, int ioflag, struct ucred *cred)
>  		 * unlocked read by nfsiod could obliterate changes
>  		 * done by userspace.
>  		 */
> -		if (nmp->nm_readahead > 0 &&
> -		    !vm_object_mightbedirty(vp->v_object) &&
> -		    vp->v_object->un_pager.vnp.writemappings == 0) {
> +		if (nmp->nm_readahead > 0 && ncl_bioread_dora(vp)) {
>  			for (nra = 0; nra < nmp->nm_readahead && nra < seqcount &&
>  			    (off_t)(lbn + 1 + nra) * biosize < nsize; nra++) {
>  				rabn = lbn + 1 + nra;
> @@ -675,9 +685,7 @@ ncl_bioread(struct vnode *vp, struct uio *uio, int ioflag, struct ucred *cred)
>  		 * directory offset cookie of the next block.)
>  		 */
>  		NFSLOCKNODE(np);
> -		if (nmp->nm_readahead > 0 &&
> -		    !vm_object_mightbedirty(vp->v_object) &&
> -		    vp->v_object->un_pager.vnp.writemappings == 0 &&
> +		if (nmp->nm_readahead > 0 && ncl_bioread_dora(vp) &&
>  		    (bp->b_flags & B_INVAL) == 0 &&
>  		    (np->n_direofoffset == 0 ||
>  		    (lbn + 1) * NFS_DIRBLKSIZ < np->n_direofoffset) &&
>
> rick
> ps: It appears that autofs causes the directory to be read before it
> is open'd for
> some reason. I've never looked at autofs.
>
>>
>> Peter
>>
>> On 15 Jan 2024, at 09:31, Peter Blok <pblok@bsd4all.org> wrote:
>>
>> Hi,
>>
>> I do have a crash on a NFS client with stable of today (4c4633fdffbe8e4b6d328c2bc9bb3edacc9ab50a). It is also autofs related. Maybe it is the same problem.
>>
>> I have ports automounted on /am/ports. When I do cd /am/ports/sys and type tab to autocomplete it crashes with the below stack trace. If I plainly mount ports on /usr/ports and do the same everything works.
>> I am using NFSv3.
>>
>> Peter
>>
>>
>> Fatal trap 12: page fault while in kernel mode
>> cpuid = 2; apic id = 04
>> fault virtual address = 0x89
>> fault code = supervisor read data, page not present
>> instruction pointer = 0x20:0xffffffff809645d4
>> stack pointer = 0x28:0xfffffe00acadb830
>> frame pointer = 0x28:0xfffffe00acadb830
>> code segment = base 0x0, limit 0xfffff, type 0x1b
>> = DPL 0, pres 1, long 1, def32 0, gran 1
>> processor eflags = interrupt enabled, resume, IOPL = 0
>> current process = 6869 (csh)
>> trap number = 12
>> panic: page fault
>> cpuid = 2
>> time = 1705306940
>> KDB: stack backtrace:
>> #0 0xffffffff806232f5 at kdb_backtrace+0x65
>> #1 0xffffffff805d7a02 at vpanic+0x152
>> #2 0xffffffff805d78a3 at panic+0x43
>> #3 0xffffffff809d58ad at trap_fatal+0x38d
>> #4 0xffffffff809d58ff at trap_pfault+0x4f
>> #5 0xffffffff809af048 at calltrap+0x8
>> #6 0xffffffff804c7a7e at ncl_bioread+0xb7e
>> #7 0xffffffff804b9d90 at nfs_readdir+0x1f0
>> #8 0xffffffff8069c61a at vop_sigdefer+0x2a
>> #9 0xffffffff809f8ae0 at VOP_READDIR_APV+0x20
>> #10 0xffffffff81ce75de at autofs_readdir+0x2ce
>> #11 0xffffffff809f8ae0 at VOP_READDIR_APV+0x20
>> #12 0xffffffff806c3002 at kern_getdirentries+0x222
>> #13 0xffffffff806c33a9 at sys_getdirentries+0x29
>> #14 0xffffffff809d6180 at amd64_syscall+0x110
>> #15 0xffffffff809af95b at fast_syscall_common+0xf8
>>
>>
>> On 15 Jan 2024, at 06:46, FreeBSD User <freebsd@walstatt-de.de> wrote:
>>
>> On Sun, 14 Jan 2024 20:34:12 -0800,
>> Cy Schubert <Cy.Schubert@cschubert.com> wrote:
>>
>> In message <CAM5tNy5aat8vUn2fsX9jV=D9yGZdnO20Q0Ea7qtszx+zSES2bw@mail.gmail.com>,
>> Rick Macklem writes:
>>
>> On Sat, Jan 13, 2024 at 12:39 PM Ronald Klop <ronald-lists@klop.ws> wrote:
>>
>> From: FreeBSD User <freebsd@walstatt-de.de>
>> Date: 13 January 2024 19:34
>> To: FreeBSD CURRENT <freebsd-current@freebsd.org>
>> Subject: NFSv4 crash of CURRENT
>>
>> Hello,
>>
>> running CURRENT client (FreeBSD 15.0-CURRENT #4 main-n267556-69748e62e82a: Sat Jan 13 18:08:32
>> CET 2024 amd64). One NFSv4 server is the same OS revision as the mentioned client, the other is FreeBSD
>> 13.2-RELEASE-p8. Both offer NFSv4 filesystems, non-kerberized.
>>
>> I can crash the client reproducibly by accessing one or the other NFSv4 FS (a simple ls -la).
>> The NFSv4 FS is backed by ZFS (if this matters). I do not have physical access to the client
>> host; luckily the box recovers.
>>
>> Did you rebuild both the nfscommon and nfscl modules from the same sources?
>> I did a commit to main that changes the interface between these two
>> modules and did bump the
>> __FreeBSD_version to 1500010, which should cause both to be rebuilt.
>> (If you have "options NFSCL" in your kernel config, both should have
>> been rebuilt as a part of
>> the kernel build.)
>>
>>
>> Is anyone by chance seeing autofs in the backtrace too?
>>
>>
>> Hello Cy Schubert,
>>
>> I forgot to mention that those crashes occur with autofs mounted filesystems. Good question,
>> by the way, I will check whether crashes also happen when mounting the traditional way.
>>
>> Kind regards,
>>
>> oh
>>
>> --
>> O. Hartmann
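
Note on the quoted patch: the helper ncl_bioread_dora() in Rick's diff checks vp->v_object for NULL before looking at the object's dirty state and write mappings, whereas the code it replaces dereferenced vp->v_object unconditionally. Below is a minimal userspace sketch of that guard. The names mock_object, mock_vnode and mock_bioread_dora are hypothetical stand-ins invented for illustration; they are not the kernel's struct vnode or vm_object, and the two fields only approximate what vm_object_mightbedirty() and un_pager.vnp.writemappings report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the kernel's VM object (assumption, not the
 * real layout).  The two fields model the two conditions tested in the
 * quoted diff.
 */
struct mock_object {
	bool	mightbedirty;	/* models vm_object_mightbedirty() */
	int	writemappings;	/* models un_pager.vnp.writemappings */
};

/* Hypothetical stand-in for struct vnode; v_object may be NULL. */
struct mock_vnode {
	struct mock_object *v_object;
};

/*
 * Same decision shape as ncl_bioread_dora() in the quoted diff:
 * with no backing object there is nothing to inspect, so the helper
 * returns true; otherwise read-ahead is allowed only when the object
 * is clean and has no writable mappings.
 */
static bool
mock_bioread_dora(const struct mock_vnode *vp)
{
	const struct mock_object *obj;

	assert(vp != NULL);
	obj = vp->v_object;
	if (obj == NULL)
		return (true);	/* guard: never dereference a NULL object */
	return (!obj->mightbedirty && obj->writemappings == 0);
}
```

Calling mock_bioread_dora() on a mock_vnode whose v_object is NULL returns true without touching the object, which corresponds to the case the unpatched condition could not handle. Whether v_object is NULL in the crashing path because autofs reads the directory before it is opened is Rick's hypothesis in the message above, not something this sketch demonstrates.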