Re: Kernel panic on armv7 when PF is enabled

From: qroxana <qroxana_at_protonmail.com>
Date: Tue, 03 May 2022 12:36:07 UTC
On Monday, May 2nd, 2022 at 9:02 AM, Kristof Provost <kp@FreeBSD.org> wrote:


> On 1 May 2022, at 5:13, qroxana wrote:
>
> > After git bisecting the panic started since this commit.
> >
> > commit 78bc3d5e1712bc1649aa5574d2b8d153f9665113
> >
> > Author: Kristof Provost <
> > kp@FreeBSD.org
> >
> > Date: Mon Feb 14 20:09:54 2022 +0100
> >
> > vlan: allow net.link.vlan.mtag_pcp to be set per vnet
> >
> > The primary reason for this change is to facilitate testing.
> >
> > MFC after: 1 week
> >
> > sys/net/if_ethersubr.c | 9 +++++----
> >
> > sys/net/if_vlan.c | 5 +++--
> >
> > 2 files changed, 8 insertions(+), 6 deletions(-)
> >
> > The armv7 board boots from a NFS root,
> >
> > it can boot without any problem if PF is disabled.
> >
> > Any helps?
> >
> > add host ::1: gateway lo0 fib 0: route already in table
> > add net fe80::: gateway ::1
> > add net ff02::: gateway ::1
> > add net ::ffff:0.0.0.0: gateway ::1
> > add net ::0.0.0.0: gateway ::1
> > Enabling pf.
> > Kernel page fault with the following non-sleepable locks held:
> > shared rm pf rulesets (pf rulesets) r = 0 (0xe3099430) locked @ /usr/src/sys/netpfil/pf/pf.c:6493
> > exclusive rw tcpinp (tcpinp) r = 0 (0xdb748d88) locked @ /usr/src/sys/netinet/tcp_usrreq.c:1008
> > stack backtrace:
> > #0 0xc0355cac at witness_debugger+0x7c
> > #1 0xc0356ef0 at witness_warn+0x3fc
> > #2 0xc05ec048 at abort_handler+0x1d8
> > #3 0xc05cb5ac at exception_exit+0
> > #4 0xe3083c10 at pf_syncookie_validate+0x60
> > #5 0xe30496a8 at pf_test+0x518
> > #6 0xe306d768 at pf_check_out+0x30
> > #7 0xc0415b44 at pfil_run_hooks+0xbc
> > #8 0xc0445cfc at ip_output+0xce8
> > #9 0xc045bc9c at tcp_default_output+0x20ac
> > #10 0xc0471eb4 at tcp_usr_send+0x1ac
> > #11 0xc0389464 at sosend_generic+0x490
> > #12 0xc0389790 at sosend+0x64
> > #13 0xc0502888 at clnt_vc_call+0x560
> > #14 0xc05009d8 at clnt_reconnect_call+0x170
> > #15 0xc01e7b14 at newnfs_request+0xb20
> > #16 0xc0230218 at nfscl_request+0x60
> > #17 0xc020d9bc at nfsrpc_getattr+0xb0
> > Fatal kernel mode data abort: 'Alignment Fault' on read
> > trapframe: 0xdf1f1c90
> > FSR=00000001, FAR=d7840264, spsr=40000013
> > r0 =6a228eda, r1 =dac0d785, r2 =d7840264, r3 =db5527c0
> > r4 =df1f1e00, r5 =dac0d75f, r6 =00000018, r7 =d9422c00
> > r8 =c093e5e4, r9 =00000001, r10=df1f1f5c, r11=df1f1d38
> > r12=e3098dd0, ssp=df1f1d20, slr=e3083bdc, pc =e3083c10
>
> The commit you point at is entirely unrelated to the code where the panic occurred, so I’m pretty sure something went wrong in your bisect.
>
> The backtrace would suggest the issue occurs in the pf_syncookie_validate() function, and likely in the line `if (atomic_load_64(&V_pf_status.syncookies_inflight[cookie.flags.oddeven]) == 0)`
>
> The obvious way for that to panic would be to call it without the curvnet context set, but pf_test() uses it earlier, so that’s going to be fine.
>
> Given that this is unique to armv7 I’d recommend talking to the armv7 maintainer about 64 bit atomic operations.
>
> You can probably avoid the atomic load with this patch (and not enabling syncookie support):
>
>     diff --git a/sys/netpfil/pf/pf_syncookies.c b/sys/netpfil/pf/pf_syncookies.c
>     index 5230502be30c..c86d469d3cef 100644
>     --- a/sys/netpfil/pf/pf_syncookies.c
>     +++ b/sys/netpfil/pf/pf_syncookies.c
>     @@ -313,6 +313,9 @@ pf_syncookie_validate(struct pf_pdesc *pd)
>             ack = ntohl(pd->hdr.tcp.th_ack) - 1;
>             cookie.cookie = (ack & 0xff) ^ (ack >> 24);
>
>     +       if (V_pf_status.syncookies_mode == PF_SYNCOOKIES_NEVER)
>     +               return (0);
>     +
>             /* we don't know oddeven before setting the cookie (union) */
>              if (atomic_load_64(&V_pf_status.syncookies_inflight[cookie.flags.oddeven])
>                 == 0)
>
>
> That shouldn’t be required though.
>
> Br,
> Kristof

Thank you sir. You were right.
I tested patch with the latest kernel.
It can boot successfully with the patch,
and still got kernel panic without the patch.