Date: Tue, 03 May 2022 12:36:07 +0000
From: qroxana
To: Kristof Provost
Cc: freebsd-current@freebsd.org, freebsd-arm@freebsd.org
Subject: Re: Kernel panic on armv7 when PF is enabled
List-Id: Porting FreeBSD to ARM processors
List-Archive: https://lists.freebsd.org/archives/freebsd-arm

On Monday, May 2nd, 2022 at 9:02 AM, Kristof Provost wrote:

> On 1 May 2022, at 5:13, qroxana wrote:
>
> > After git bisecting the panic started since this commit.
> >
> > commit 78bc3d5e1712bc1649aa5574d2b8d153f9665113
> > Author: Kristof Provost <kp@FreeBSD.org>
> > Date: Mon Feb 14 20:09:54 2022 +0100
> >
> > vlan: allow net.link.vlan.mtag_pcp to be set per vnet
> >
> > The primary reason for this change is to facilitate testing.
> >
> > MFC after: 1 week
> >
> > sys/net/if_ethersubr.c | 9 +++++----
> > sys/net/if_vlan.c      | 5 +++--
> > 2 files changed, 8 insertions(+), 6 deletions(-)
> >
> > The armv7 board boots from a NFS root,
> > it can boot without any problem if PF is disabled.
> >
> > Any helps?
> >
> > add host ::1: gateway lo0 fib 0: route already in table
> > add net fe80::: gateway ::1
> > add net ff02::: gateway ::1
> > add net ::ffff:0.0.0.0: gateway ::1
> > add net ::0.0.0.0: gateway ::1
> > Enabling pf.
> > Kernel page fault with the following non-sleepable locks held:
> > shared rm pf rulesets (pf rulesets) r = 0 (0xe3099430) locked @ /usr/src/sys/netpfil/pf/pf.c:6493
> > exclusive rw tcpinp (tcpinp) r = 0 (0xdb748d88) locked @ /usr/src/sys/netinet/tcp_usrreq.c:1008
> > stack backtrace:
> > #0 0xc0355cac at witness_debugger+0x7c
> > #1 0xc0356ef0 at witness_warn+0x3fc
> > #2 0xc05ec048 at abort_handler+0x1d8
> > #3 0xc05cb5ac at exception_exit+0
> > #4 0xe3083c10 at pf_syncookie_validate+0x60
> > #5 0xe30496a8 at pf_test+0x518
> > #6 0xe306d768 at pf_check_out+0x30
> > #7 0xc0415b44 at pfil_run_hooks+0xbc
> > #8 0xc0445cfc at ip_output+0xce8
> > #9 0xc045bc9c at tcp_default_output+0x20ac
> > #10 0xc0471eb4 at tcp_usr_send+0x1ac
> > #11 0xc0389464 at sosend_generic+0x490
> > #12 0xc0389790 at sosend+0x64
> > #13 0xc0502888 at clnt_vc_call+0x560
> > #14 0xc05009d8 at clnt_reconnect_call+0x170
> > #15 0xc01e7b14 at newnfs_request+0xb20
> > #16 0xc0230218 at nfscl_request+0x60
> > #17 0xc020d9bc at nfsrpc_getattr+0xb0
> > Fatal kernel mode data abort: 'Alignment Fault' on read
> > trapframe: 0xdf1f1c90
> > FSR=00000001, FAR=d7840264, spsr=40000013
> > r0 =6a228eda, r1 =dac0d785, r2 =d7840264, r3 =db5527c0
> > r4 =df1f1e00, r5 =dac0d75f, r6 =00000018, r7 =d9422c00
> > r8 =c093e5e4, r9 =00000001, r10=df1f1f5c, r11=df1f1d38
> > r12=e3098dd0, ssp=df1f1d20, slr=e3083bdc, pc =e3083c10
>
> The commit you point at is entirely unrelated to the code where the panic occurred, so I’m pretty sure something went wrong in your bisect.
>
> The backtrace would suggest the issue occurs in the pf_syncookie_validate() function, and likely in the line `if (atomic_load_64(&V_pf_status.syncookies_inflight[cookie.flags.oddeven]) == 0)`
>
> The obvious way for that to panic would be to call it without the curvnet context set, but pf_test() uses it earlier, so that’s going to be fine.
>
> Given that this is unique to armv7 I’d recommend talking to the armv7 maintainer about 64 bit atomic operations.
>
> You can probably avoid the atomic load with this patch (and not enabling syncookie support):
>
> diff --git a/sys/netpfil/pf/pf_syncookies.c b/sys/netpfil/pf/pf_syncookies.c
> index 5230502be30c..c86d469d3cef 100644
> --- a/sys/netpfil/pf/pf_syncookies.c
> +++ b/sys/netpfil/pf/pf_syncookies.c
> @@ -313,6 +313,9 @@ pf_syncookie_validate(struct pf_pdesc *pd)
>         ack = ntohl(pd->hdr.tcp.th_ack) - 1;
>         cookie.cookie = (ack & 0xff) ^ (ack >> 24);
>
> +       if (V_pf_status.syncookies_mode == PF_SYNCOOKIES_NEVER)
> +               return (0);
> +
>         /* we don't know oddeven before setting the cookie (union) */
>         if (atomic_load_64(&V_pf_status.syncookies_inflight[cookie.flags.oddeven])
>             == 0)
>
>
> That shouldn’t be required though.
>
> Br,
> Kristof

Thank you, sir. You were right.
I tested the patch with the latest kernel.
It boots successfully with the patch, and it still panics without the patch.
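
For reference, the fault address in the trap frame (FAR and r2 are both d7840264) is 4-byte aligned but not 8-byte aligned. That would fit the 64-bit atomic load Kristof points at, assuming it is implemented with a doubleword exclusive load, which on ARMv7 needs an 8-byte-aligned address. Below is a minimal sketch of that arithmetic in C. The two structs are hypothetical layouts invented for illustration, not the real pf status structure, and `__attribute__((packed))` is a GCC/Clang extension.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Return 1 if addr is a multiple of align (align must be a power of two). */
static int
is_aligned(uintptr_t addr, uintptr_t align)
{
        return ((addr & (align - 1)) == 0);
}

/*
 * Hypothetical layout (NOT the real pf structure): packing removes the
 * padding after the 32-bit member, so the 64-bit counters start at
 * offset 4 and are only 4-byte aligned.
 */
struct demo_packed {
        uint32_t mode;
        uint64_t inflight[2];
} __attribute__((packed));

/*
 * Same hypothetical fields, with the counters explicitly forced onto an
 * 8-byte boundary so a doubleword access is always naturally aligned.
 */
struct demo_aligned {
        uint32_t mode;
        alignas(8) uint64_t inflight[2];
};

/* Compile-time checks of the two layouts. */
static_assert(offsetof(struct demo_packed, inflight) == 4,
    "packed layout puts the counters at offset 4");
static_assert(offsetof(struct demo_aligned, inflight) == 8,
    "alignas(8) puts the counters at offset 8");
```

With these definitions, `is_aligned(0xd7840264u, 4)` is 1 and `is_aligned(0xd7840264u, 8)` is 0, so the address from the trap frame would be acceptable for a 32-bit access but not for a naturally aligned 64-bit one. Whether the real structure is laid out like `demo_packed` is an assumption that would need checking against the pf headers.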