From: Rick Macklem
Date: Mon, 15 Jan 2024 07:04:44 -0800
Subject: Re: NFSv4 crash of CURRENT
To: Peter Blok
Cc: FreeBSD User, Cy Schubert, Ronald Klop, FreeBSD CURRENT
List-Id: Discussions about the use of FreeBSD-current
List-Archive: https://lists.freebsd.org/archives/freebsd-current
In-Reply-To: <683EF50F-6665-4664-A7CE-1EFE50076FB0@bsd4all.org>
References: <20240113193324.3fd54295@thor.intern.walstatt.dynvpn.de>
 <1369645989.13766.1705178331205@localhost>
 <20240115043412.B6998C8@slippy.cwsent.com>
 <20240115064704.611fe0c4@thor.intern.walstatt.dynvpn.de>
 <683EF50F-6665-4664-A7CE-1EFE50076FB0@bsd4all.org>

On Mon, Jan 15, 2024 at 2:53 AM Peter Blok wrote:
>
> Hi,
>
> Forgot to
> mention I'm on 13-stable. The fix that is causing the crash with
> automounted NFS is:
>
> commit cc5cda1dbaa907ce52074f47264cc45b5a7d6c8b
> Author: Konstantin Belousov
> Date:   Tue Jan 2 00:22:44 2024 +0200
>
>     nfsclient: limit situations when we do unlocked read-ahead by nfsiod
>
>     (cherry picked from commit 70dc6b2ce314a0f32755005ad02802fca7ed186e)
>
> When I remove the fix, the problem is gone. Add it back and the crash
> happens.

Kostik has already come up with a probable fix. If you want it right away,
here it is, but he'll probably commit it soon anyhow:

diff --git a/sys/fs/nfsclient/nfs_clbio.c b/sys/fs/nfsclient/nfs_clbio.c
index c027d7d7c3fd..1cf45bb0c924 100644
--- a/sys/fs/nfsclient/nfs_clbio.c
+++ b/sys/fs/nfsclient/nfs_clbio.c
@@ -414,6 +414,18 @@ nfs_bioread_check_cons(struct vnode *vp, struct thread *td, struct ucred *cred)
 	return (error);
 }
 
+static bool
+ncl_bioread_dora(struct vnode *vp)
+{
+	vm_object_t obj;
+
+	obj = vp->v_object;
+	if (obj == NULL)
+		return (true);
+	return (!vm_object_mightbedirty(vp->v_object) &&
+	    vp->v_object->un_pager.vnp.writemappings == 0);
+}
+
 /*
  * Vnode op for read using bio
  */
@@ -486,9 +498,7 @@ ncl_bioread(struct vnode *vp, struct uio *uio, int ioflag, struct ucred *cred)
 		 * unlocked read by nfsiod could obliterate changes
 		 * done by userspace.
 		 */
-		if (nmp->nm_readahead > 0 &&
-		    !vm_object_mightbedirty(vp->v_object) &&
-		    vp->v_object->un_pager.vnp.writemappings == 0) {
+		if (nmp->nm_readahead > 0 && ncl_bioread_dora(vp)) {
 		    for (nra = 0; nra < nmp->nm_readahead && nra < seqcount &&
 			(off_t)(lbn + 1 + nra) * biosize < nsize; nra++) {
 			rabn = lbn + 1 + nra;
@@ -675,9 +685,7 @@ ncl_bioread(struct vnode *vp, struct uio *uio, int ioflag, struct ucred *cred)
 		 * directory offset cookie of the next block.)
 		 */
 		NFSLOCKNODE(np);
-		if (nmp->nm_readahead > 0 &&
-		    !vm_object_mightbedirty(vp->v_object) &&
-		    vp->v_object->un_pager.vnp.writemappings == 0 &&
+		if (nmp->nm_readahead > 0 && ncl_bioread_dora(vp) &&
 		    (bp->b_flags & B_INVAL) == 0 &&
 		    (np->n_direofoffset == 0 ||
 		    (lbn + 1) * NFS_DIRBLKSIZ < np->n_direofoffset) &&

rick
ps: It appears that autofs causes the directory to be read before it
    is open'd for some reason. I've never looked at autofs.

>
> Peter
>
> On 15 Jan 2024, at 09:31, Peter Blok wrote:
>
> Hi,
>
> I do have a crash on an NFS client with stable of today
> (4c4633fdffbe8e4b6d328c2bc9bb3edacc9ab50a). It is also autofs related.
> Maybe it is the same problem.
>
> I have ports automounted on /am/ports. When I do cd /am/ports/sys and
> type tab to autocomplete, it crashes with the stack trace below. If I
> plainly mount ports on /usr/ports and do the same, everything works.
> I am using NFSv3.
>
> Peter
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 2; apic id = 04
> fault virtual address   = 0x89
> fault code              = supervisor read data, page not present
> instruction pointer     = 0x20:0xffffffff809645d4
> stack pointer           = 0x28:0xfffffe00acadb830
> frame pointer           = 0x28:0xfffffe00acadb830
> code segment            = base 0x0, limit 0xfffff, type 0x1b
>                         = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags        = interrupt enabled, resume, IOPL = 0
> current process         = 6869 (csh)
> trap number             = 12
> panic: page fault
> cpuid = 2
> time = 1705306940
> KDB: stack backtrace:
> #0 0xffffffff806232f5 at kdb_backtrace+0x65
> #1 0xffffffff805d7a02 at vpanic+0x152
> #2 0xffffffff805d78a3 at panic+0x43
> #3 0xffffffff809d58ad at trap_fatal+0x38d
> #4 0xffffffff809d58ff at trap_pfault+0x4f
> #5 0xffffffff809af048 at calltrap+0x8
> #6 0xffffffff804c7a7e at ncl_bioread+0xb7e
> #7 0xffffffff804b9d90 at nfs_readdir+0x1f0
> #8 0xffffffff8069c61a at vop_sigdefer+0x2a
> #9 0xffffffff809f8ae0 at VOP_READDIR_APV+0x20
> #10 0xffffffff81ce75de at autofs_readdir+0x2ce
> #11 0xffffffff809f8ae0 at VOP_READDIR_APV+0x20
> #12 0xffffffff806c3002 at kern_getdirentries+0x222
> #13 0xffffffff806c33a9 at sys_getdirentries+0x29
> #14 0xffffffff809d6180 at amd64_syscall+0x110
> #15 0xffffffff809af95b at fast_syscall_common+0xf8
>
> On 15 Jan 2024, at 06:46, FreeBSD User wrote:
>
> On Sun, 14 Jan 2024 20:34:12 -0800
> Cy Schubert wrote:
>
> In message <...om>, Rick Macklem writes:
>
> On Sat, Jan 13, 2024 at 12:39 PM Ronald Klop wrote:
>
> From: FreeBSD User
> Date: 13 January 2024 19:34
> To: FreeBSD CURRENT
> Subject: NFSv4 crash of CURRENT
>
> Hello,
>
> running CURRENT client (FreeBSD 15.0-CURRENT #4
> main-n267556-69748e62e82a: Sat Jan 13 18:08:32 CET 2024 amd64). One
> NFSv4 server is the same OS revision as the mentioned client, the
> other is FreeBSD 13.2-RELEASE-p8. Both offer NFSv4 filesystems,
> non-kerberized.
>
> I can crash the client reproducibly by accessing one or the other
> NFSv4 FS (a simple ls -la).
>
> The NFSv4 FS is backed by ZFS (if this matters). I do not have
> physical access to the client host; luckily the box recovers.
>
> Did you rebuild both the nfscommon and nfscl modules from the same
> sources? I did a commit to main that changes the interface between
> these two modules and did bump the __FreeBSD_version to 1500010,
> which should cause both to be rebuilt.
> (If you have "options NFSCL" in your kernel config, both should have
> been rebuilt as a part of the kernel build.)
>
> Is anyone by chance seeing autofs in the backtrace too?
>
> Hello Cy Schubert,
>
> I forgot to mention that those crashes occur with autofs mounted
> filesystems. Good question, by the way, I will check whether crashes
> also happen when mounting the traditional way.
>
> Kind regards,
>
> oh
>
> --
> O. Hartmann
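
The patch quoted above adds a NULL check on vp->v_object before the read-ahead test. A minimal user-space sketch of that guard follows, assuming simplified stand-in types: mock_object, mock_vnode and mock_bioread_dora are hypothetical names for illustration only, not the kernel definitions, and the two fields merely stand in for vm_object_mightbedirty() and un_pager.vnp.writemappings.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's vm_object; not the real layout. */
struct mock_object {
	bool	might_be_dirty;	/* stands in for vm_object_mightbedirty() */
	int	writemappings;	/* stands in for un_pager.vnp.writemappings */
};

/* Hypothetical stand-in for struct vnode; v_object may be NULL. */
struct mock_vnode {
	struct mock_object *v_object;
};

/*
 * Same shape as ncl_bioread_dora() in the patch: if no object is
 * attached, allow read-ahead without dereferencing anything; otherwise
 * allow it only when the object is clean and has no writable mappings.
 */
static bool
mock_bioread_dora(const struct mock_vnode *vp)
{
	const struct mock_object *obj;

	obj = vp->v_object;
	if (obj == NULL)
		return (true);
	return (!obj->might_be_dirty && obj->writemappings == 0);
}
```

With no object attached the helper returns true without touching the object, which corresponds to the obj == NULL branch of the patch; the pre-patch condition would have dereferenced the NULL pointer at that point.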