From: Kristof Provost <kp@FreeBSD.org>
To: qroxana
Cc: freebsd-current@freebsd.org, freebsd-arm@freebsd.org
Subject: Re: Kernel panic on armv7 when PF is enabled
Date: Mon, 02 May 2022 11:02:53 +0200
List-Id: Discussions about the use of FreeBSD-current
List-Archive: https://lists.freebsd.org/archives/freebsd-current

On 1 May 2022, at 5:13, qroxana wrote:

> After git bisecting, the panic started with this commit:
>
> commit 78bc3d5e1712bc1649aa5574d2b8d153f9665113
> Author: Kristof Provost <kp@FreeBSD.org>
> Date: Mon Feb 14 20:09:54 2022 +0100
>
> vlan: allow net.link.vlan.mtag_pcp to be set per vnet
>
> The primary reason for this change is to facilitate testing.
>
> MFC after: 1 week
>
> sys/net/if_ethersubr.c | 9 +++++----
> sys/net/if_vlan.c | 5 +++--
> 2 files changed, 8 insertions(+), 6 deletions(-)
>
> The armv7 board boots from an NFS root;
> it boots without any problem if PF is disabled.
>
> Any help?
>
> add host ::1: gateway lo0 fib 0: route already in table
> add net fe80::: gateway ::1
> add net ff02::: gateway ::1
> add net ::ffff:0.0.0.0: gateway ::1
> add net ::0.0.0.0: gateway ::1
> Enabling pf.
> Kernel page fault with the following non-sleepable locks held:
> shared rm pf rulesets (pf rulesets) r = 0 (0xe3099430) locked @ /usr/src/sys/netpfil/pf/pf.c:6493
> exclusive rw tcpinp (tcpinp) r = 0 (0xdb748d88) locked @ /usr/src/sys/netinet/tcp_usrreq.c:1008
> stack backtrace:
> #0 0xc0355cac at witness_debugger+0x7c
> #1 0xc0356ef0 at witness_warn+0x3fc
> #2 0xc05ec048 at abort_handler+0x1d8
> #3 0xc05cb5ac at exception_exit+0
> #4 0xe3083c10 at pf_syncookie_validate+0x60
> #5 0xe30496a8 at pf_test+0x518
> #6 0xe306d768 at pf_check_out+0x30
> #7 0xc0415b44 at pfil_run_hooks+0xbc
> #8 0xc0445cfc at ip_output+0xce8
> #9 0xc045bc9c at tcp_default_output+0x20ac
> #10 0xc0471eb4 at tcp_usr_send+0x1ac
> #11 0xc0389464 at sosend_generic+0x490
> #12 0xc0389790 at sosend+0x64
> #13 0xc0502888 at clnt_vc_call+0x560
> #14 0xc05009d8 at clnt_reconnect_call+0x170
> #15 0xc01e7b14 at newnfs_request+0xb20
> #16 0xc0230218 at nfscl_request+0x60
> #17 0xc020d9bc at nfsrpc_getattr+0xb0
> Fatal kernel mode data abort: 'Alignment Fault' on read
> trapframe: 0xdf1f1c90
> FSR=00000001, FAR=d7840264, spsr=40000013
> r0 =6a228eda, r1 =dac0d785, r2 =d7840264, r3 =db5527c0
> r4 =df1f1e00, r5 =dac0d75f, r6 =00000018, r7 =d9422c00
> r8 =c093e5e4, r9 =00000001, r10=df1f1f5c, r11=df1f1d38
> r12=e3098dd0, ssp=df1f1d20, slr=e3083bdc, pc =e3083c10


The commit you point at is entirely unrelated to the code where the panic occurred, so I’m pretty sure something went wrong in your bisect.

The backtrace would suggest the issue occurs in the pf_syncookie_validate() function, and likely in the line `if (atomic_load_64(&V_pf_status.syncookies_inflight[cookie.flags.oddeven]) == 0)`

The obvious way for that to panic would be to call it without the curvnet context set, but pf_test() uses it earlier, so that’s going to be fine.

Given that this is unique to armv7 I’d recommend talking to the armv7 maintainer about 64-bit atomic operations.
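
To make the alignment angle concrete, here is a small C sketch that decodes the FSR and FAR values from the trap frame quoted above. It assumes the ARMv7 short-descriptor fault status layout (FS[3:0] in bits 3:0, FS[4] in bit 10, WnR in bit 11) and that the faulting access is the 64-bit load; the helper names are made up for illustration. The point it tries to show is that FAR=d7840264 is 4-byte aligned but not 8-byte aligned, and the exclusive doubleword load (LDREXD) that a 64-bit atomic load would typically be built on faults on addresses that are not 8-byte aligned.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: ARMv7 short-descriptor data fault status register. */
#define FSR_FS_LOW(fsr)		((fsr) & 0xfu)		/* FS[3:0] */
#define FSR_FS_HIGH(fsr)	(((fsr) >> 10) & 0x1u)	/* FS[4] */
#define FSR_WNR(fsr)		(((fsr) >> 11) & 0x1u)	/* 1 = write */
#define FS_ALIGNMENT		0x01u			/* alignment fault */

/* Hypothetical helper: reassemble the 5-bit fault status code. */
static uint32_t
fsr_status(uint32_t fsr)
{
	return ((FSR_FS_HIGH(fsr) << 4) | FSR_FS_LOW(fsr));
}

/* Hypothetical helper: true if the faulting access was a write. */
static bool
fsr_is_write(uint32_t fsr)
{
	return (FSR_WNR(fsr) != 0);
}

/* Hypothetical helper: true if addr is a multiple of align. */
static bool
is_aligned(uint32_t addr, uint32_t align)
{
	/* align must be a non-zero power of two. */
	assert(align != 0 && (align & (align - 1)) == 0);
	return ((addr & (align - 1)) == 0);
}
```

Fed the values from the report, `fsr_status(0x00000001)` yields `FS_ALIGNMENT` and `fsr_is_write()` is false, matching the "'Alignment Fault' on read" message, while `is_aligned(0xd7840264, 4)` holds and `is_aligned(0xd7840264, 8)` does not. That is consistent with a 64-bit access to a field that is only 32-bit aligned, though it does not by itself say why the field ended up there.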

You can probably avoid the atomic load with this patch (and not enabling syncookie support):

diff --git a/sys/netpfil/pf/pf_syncookies.c b/sys/netpfil/pf/pf_syncookies.c
index 5230502be30c..c86d469d3cef 100644
--- a/sys/netpfil/pf/pf_syncookies.c
+++ b/sys/netpfil/pf/pf_syncookies.c
@@ -313,6 +313,9 @@ pf_syncookie_validate(struct pf_pdesc *pd)
        ack = ntohl(pd->hdr.tcp.th_ack) - 1;
        cookie.cookie = (ack & 0xff) ^ (ack >> 24);

+       if (V_pf_status.syncookies_mode == PF_SYNCOOKIES_NEVER)
+               return (0);
+
        /* we don't know oddeven before setting the cookie (union) */
        if (atomic_load_64(&V_pf_status.syncookies_inflight[cookie.flags.oddeven])
            == 0)

That shouldn’t be required though.
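
For anyone who wants to see what the guard in that patch changes, here is a minimal userland sketch in C11. Everything in it (`struct status`, `validate()`, the `load_count` counter, and picking the slot from the low bit of `ack`) is a hypothetical stand-in rather than the pf code. It only illustrates that with the mode set to "never" the function returns before the 64-bit atomic load is reached.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the pf definitions; not the kernel's types. */
enum syncookies_mode { SYNCOOKIES_NEVER, SYNCOOKIES_ALWAYS };

struct status {
	enum syncookies_mode	mode;
	_Atomic uint64_t	inflight[2];
};

/* Counts how often the 64-bit atomic load is actually executed. */
static unsigned load_count;

/*
 * Returns 1 if a cookie could be in flight for this ACK, 0 otherwise.
 * Assumption: the slot index is the low bit of ack, for illustration only.
 */
static int
validate(struct status *st, uint32_t ack)
{
	assert(st != NULL);

	/* The guard from the patch: return before touching inflight[]. */
	if (st->mode == SYNCOOKIES_NEVER)
		return (0);

	load_count++;
	if (atomic_load(&st->inflight[ack & 1]) == 0)
		return (0);
	return (1);
}
```

With `mode` set to `SYNCOOKIES_NEVER`, `validate()` returns 0 and `load_count` stays at zero, so the load that faults in the report would never run. It hides the problem rather than explaining it, which is why the patch is only a workaround.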

Br,
Kristof
