Re: A native armv7 panic during kyua runs: sys/netinet6/exthdr:exthdr -> Fatal kernel mode data abort: 'Alignment Fault' on read

From: Michal Meloun <meloun.michal_at_gmail.com>
Date: Sat, 05 Aug 2023 18:27:34 UTC
Hi Mark,
can you please try this patch?
https://github.com/strejda/tegra/commit/bd4390c5f6a8b66b2fc83966d4fadb945a19dc23

I'm sorry, but I don't have the time or energy to fully test it... I 
only hope the actual patch is much easier than the one listed in PR271759.


Michal



On 05.08.2023 8:11, Mark Millard wrote:
> On Aug 4, 2023, at 20:58, Warner Losh <imp@bsdimp.com> wrote:
> 
>> It might make sense to work up a patch that skips this test on armv7 after filing a bug (the usual way)....
>>
>> Warner
> 
> Actually, looking at the backtrace, it seems I've previously
> listed the same sort of backtrace structure in:
> 
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=271759
> 
> comment 12. Hans Petter Selasky had been working on that
> bugzilla entry. I'll add a note that this time I got it
> with the built-in Ethernet instead of the dongle used
> previously, and that sys/netinet6/exthdr:exthdr is a
> way of producing the panic. [Done.]
> 
> In /usr/main-src/tests/sys/netinet6/exthdr.sh , commenting
> out one line would disable the specific test (leading
> whitespace might not be preserved below):
> 
> atf_init_test_cases()
> {
> 
> #        atf_add_test_case "exthdr"
> }
> 
> [FYI: All my kyua activity has been for FreeBSD main,
> generally targeting contexts with some armv7 code
> involved. It is associated with my having been a
> tester of early lib32 drafts.]
> 
> I already have another commented out line for an armv7
> panic (leading whitespace might not be preserved):
> 
> # git -C /usr/main-src/ diff tests/sys/net/
> diff --git a/tests/sys/net/if_bridge_test.sh b/tests/sys/net/if_bridge_test.sh
> index eb3a792df449..dcdac75103cd 100755
> --- a/tests/sys/net/if_bridge_test.sh
> +++ b/tests/sys/net/if_bridge_test.sh
> @@ -675,7 +675,7 @@ atf_init_test_cases()
>          atf_add_test_case "delete_with_members"
>          atf_add_test_case "mac_conflict"
>          atf_add_test_case "stp_validation"
> -       atf_add_test_case "gif"
> +#      atf_add_test_case "gif"
>          atf_add_test_case "mtu"
>          atf_add_test_case "vlan"
>   }
> 
> In the original discovery, having if_bridge.ko already loaded was
> important to getting the "gif" panic.
> 
> But I've not yet put effort into isolating a cleaner/simpler test
> than the one I got the failure with. Nor have I done a range of
> comparisons of differing contexts yet.
> 
> There are other armv7 related issues, one in particular
> being:
> 
> A) All the long timeouts [300s+] are for *.py style tests. (Lots of
>     these.)
> 
> B) All the *.py style tests that do not have a long timeout have one of:
> 
>   ->  skipped: comment me to run the test
>   ->  skipped: Current architecture 'armv7' not supported
> __test_cases_list__  ->  broken: Test program did not exit cleanly
> __test_cases_list__  ->  broken: Test case list wrote to stderr
> 
> There are about 10 of the "comment me" ones and 1 each of the other
> (B) ones, if I remember right.
> 
> In other words, basically all the *.py based tests are broken or
> skipped as kyua classifies things.
> 
> I've no clue yet if (A) is tied to the ports':
> 
> cryptography/hazmat/bindings/_openssl.abi3.so
> 
> openssl 3 incompatibility or not. But I've only seen the
> issue in armv7 contexts so far.
> 
> I've spent time today on this issue but have made no progress
> on identifying what leads to the kdump/truss-output being as
> it is.
> 
> If the *.py tests were working, I'd not be surprised to then
> find more armv7 panics than is now possible via the kyua tests.
> 
>> On Fri, Aug 4, 2023 at 12:59 AM Mark Millard <marklmi@yahoo.com> wrote:
>> While discovered via an attempted overall kyua run, the following is
>> sufficient to get the crash in my native armv7 context:
>>
>> # /usr/bin/kyua test -k /usr/tests/Kyuafile sys/netinet6/exthdr:exthdr
>> sys/netinet6/exthdr:exthdr  ->  Fatal kernel mode data abort: 'Alignment Fault' on read
>> trapframe: 0xdfb97aa0
>> FSR=00000001, FAR=db43ab76, spsr=60000013
>> r0 =dfedd000, r1 =dfb97b34, r2 =00000000, r3 =00000000
>> r4 =00000000, r5 =00000000, r6 =db43ab76, r7 =db43ab66
>> r8 =c096383c, r9 =00000000, r10=db132400, r11=dfb97b60
>> r12=00000000, ssp=dfb97b30, slr=c0b4e2c0, pc =c04e6b70
>>
>> panic: Fatal abort
>> cpuid = 0
>> time = 1691131498
>> KDB: stack backtrace:
>> db_trace_self() at db_trace_self
>>           pc = 0xc065f414  lr = 0xc007db80 (db_trace_self_wrapper+0x30)
>>           sp = 0xdfb97858  fp = 0xdfb97970
>> db_trace_self_wrapper() at db_trace_self_wrapper+0x30
>>           pc = 0xc007db80  lr = 0xc031a834 (vpanic+0x140)
>>           sp = 0xdfb97978  fp = 0xdfb97998
>>           r4 = 0x00000100  r5 = 0x00000000
>>           r6 = 0xc07c369a  r7 = 0xc0b32e58
>> vpanic() at vpanic+0x140
>>           pc = 0xc031a834  lr = 0xc031a6f4 (vpanic)
>>           sp = 0xdfb979a0  fp = 0xdfb979a4
>>           r4 = 0xdfb97aa0  r5 = 0x00000013
>>           r6 = 0xdb43ab76  r7 = 0x00000001
>>           r8 = 0x00000001  r9 = 0xdfedd000
>>          r10 = 0xdb43ab76
>> vpanic() at vpanic
>>           pc = 0xc031a6f4  lr = 0xc06849dc (abort_align)
>>           sp = 0xdfb979ac  fp = 0xdfb979d8
>>           r4 = 0x00000001  r5 = 0x00000001
>>           r6 = 0xdfedd000  r7 = 0xdb43ab76
>>           r8 = 0xdfb979a4  r9 = 0xc031a6f4
>>          r10 = 0xdfb979ac
>> abort_align() at abort_align
>>           pc = 0xc06849dc  lr = 0xc0684a50 (abort_align+0x74)
>>           sp = 0xdfb979e0  fp = 0xdfb979f8
>>           r4 = 0x00000013 r10 = 0xdb43ab76
>> abort_align() at abort_align+0x74
>>           pc = 0xc0684a50  lr = 0xc06846a8 (abort_handler+0x45c)
>>           sp = 0xdfb97a00  fp = 0xdfb97a98
>>           r4 = 0x00000000 r10 = 0xdb43ab76
>> abort_handler() at abort_handler+0x45c
>>           pc = 0xc06846a8  lr = 0xc0661cc8 (exception_exit)
>>           sp = 0xdfb97aa0  fp = 0xdfb97b60
>>           r4 = 0x00000000  r5 = 0x00000000
>>           r6 = 0xdb43ab76  r7 = 0xdb43ab66
>>           r8 = 0xc096383c  r9 = 0x00000000
>>          r10 = 0xdb132400
>> exception_exit() at exception_exit
>>           pc = 0xc0661cc8  lr = 0xc0b4e2c0 (__pcpu)
>>           sp = 0xdfb97b30  fp = 0xdfb97b60
>>           r0 = 0xdfedd000  r1 = 0xdfb97b34
>>           r2 = 0x00000000  r3 = 0x00000000
>>           r4 = 0x00000000  r5 = 0x00000000
>>           r6 = 0xdb43ab76  r7 = 0xdb43ab66
>>           r8 = 0xc096383c  r9 = 0x00000000
>>          r10 = 0xdb132400 r12 = 0x00000000
>> in6ifa_ifwithaddr() at in6ifa_ifwithaddr+0x30
>>           pc = 0xc04e6b70  lr = 0xc04f9030 (ip6_input+0xd38)
>>           sp = 0xdfb97b68  fp = 0xdfb97c28
>>           r4 = 0xdb43ab76  r5 = 0xdb43ab5e
>>           r6 = 0x00000000  r7 = 0xdb43ab66
>> ip6_input() at ip6_input+0xd38
>>           pc = 0xc04f9030  lr = 0xc046d66c (netisr_dispatch_src+0xf8)
>>           sp = 0xdfb97c30  fp = 0xdfb97c58
>>           r4 = 0xdb43ab00  r5 = 0x00000006
>>           r6 = 0x00000007  r7 = 0xc0b49d50
>>           r8 = 0xdafea0c0  r9 = 0xdb43ab00
>>          r10 = 0x00000086
>> netisr_dispatch_src() at netisr_dispatch_src+0xf8
>>           pc = 0xc046d66c  lr = 0xc04641b0 (ether_demux+0x18c)
>>           sp = 0xdfb97c60  fp = 0xdfb97c78
>>           r4 = 0x00000006  r5 = 0x00001201
>>           r6 = 0xdb132400  r7 = 0x000000ff
>>           r8 = 0xdafea0c0  r9 = 0xdb43ab00
>>          r10 = 0x00000086
>> ether_demux() at ether_demux+0x18c
>>           pc = 0xc04641b0  lr = 0xc0465880 (ether_nh_input+0x490)
>>           sp = 0xdfb97c80  fp = 0xdfb97ce0
>>           r4 = 0xdb132400  r5 = 0xdb43ab00
>>           r6 = 0xdb43ab50 r10 = 0x00000086
>> ether_nh_input() at ether_nh_input+0x490
>>           pc = 0xc0465880  lr = 0xc046d66c (netisr_dispatch_src+0xf8)
>>           sp = 0xdfb97ce8  fp = 0xdfb97d10
>>           r4 = 0xdb43ab00  r5 = 0x00000005
>>           r6 = 0x0000000c  r7 = 0xc0b49d30
>>           r8 = 0xdafea0c0  r9 = 0xdb43ab00
>>          r10 = 0xc098d18f
>> netisr_dispatch_src() at netisr_dispatch_src+0xf8
>>           pc = 0xc046d66c  lr = 0xc04645c4 (ether_input+0x50)
>>           sp = 0xdfb97d18  fp = 0xdfb97d48
>>           r4 = 0xdb43ab00  r5 = 0x00000000
>>           r6 = 0x00008803  r7 = 0x00000000
>>           r8 = 0xdafea0c0  r9 = 0xdb43ab00
>>          r10 = 0xc098d18f
>> ether_input() at ether_input+0x50
>>           pc = 0xc04645c4  lr = 0xdffb3f08 ($a.10+0x108)
>>           sp = 0xdfb97d50  fp = 0xdfb97d78
>>           r4 = 0xdb132400  r5 = 0xdaff8b00
>>           r6 = 0xdaff8b10  r7 = 0x00000000
>>           r8 = 0x00000000 r10 = 0xc098d18f
>> $a.10() at $a.10+0x108
>>           pc = 0xdffb3f08  lr = 0xc038cb2c (taskqueue_run_locked+0x1c4)
>>           sp = 0xdfb97d80  fp = 0xdfb97dd8
>>           r4 = 0xe0145100  r5 = 0xdaff8b2c
>>           r6 = 0xe0145150  r7 = 0x00000001
>>           r8 = 0x00000000  r9 = 0xdfb97d90
>>          r10 = 0x00000001
>> taskqueue_run_locked() at taskqueue_run_locked+0x1c4
>>           pc = 0xc038cb2c  lr = 0xc038e4e4 (taskqueue_thread_loop+0x1b0)
>>           sp = 0xdfb97de0  fp = 0xdfb97e10
>>           r4 = 0xe0145100  r5 = 0xe0145140
>>           r6 = 0xc07af4c4  r7 = 0x00000000
>>           r8 = 0xc098d18f  r9 = 0x00000100
>>          r10 = 0xc0b228a0
>> taskqueue_thread_loop() at taskqueue_thread_loop+0x1b0
>>           pc = 0xc038e4e4  lr = 0xc02cdf0c (fork_exit+0xc0)
>>           sp = 0xdfb97e18  fp = 0xdfb97e38
>>           r4 = 0xdfedd000  r5 = 0xc0b224e0
>>           r6 = 0xc038e334  r7 = 0xdffc4f54
>>           r8 = 0xdfb97e40  r9 = 0xc098d191
>> fork_exit() at fork_exit+0xc0
>>           pc = 0xc02cdf0c  lr = 0xc0661c5c (swi_exit)
>>           sp = 0xdfb97e40  fp = 0x00000000
>>           r4 = 0xc038e334  r5 = 0xdffc4f54
>>           r6 = 0xc0b45d84  r7 = 0xd73bcba0
>>           r8 = 0x00000001 r10 = 0xc0b228a0
>> swi_exit() at swi_exit
>>           pc = 0xc0661c5c  lr = 0xc0661c5c (swi_exit)
>>           sp = 0xdfb97e40  fp = 0x00000000
>> KDB: enter: panic
>> [ thread pid 0 tid 100230 ]
>>
>> For reference:
>>
>> # uname -apKU
>> FreeBSD OPiP2E-RPi2v1p1 14.0-CURRENT FreeBSD 14.0-CURRENT armv7 1400093 #6 main-n264334-215bab7924f6-dirty: Tue Jul 25 23:11:39 PDT 2023     root@CA72-16Gp-ZFS:/usr/obj/BUILDs/main-CA7-nodbg-clang/usr/main-src/arm.armv7/sys/GENERIC-NODBG-CA7 arm armv7 1400093 1400093
>>
>> The OrangePi+ 2Ed was the type of system booted and tested.
>>
> 
> 
> ===
> Mark Millard
> marklmi at yahoo.com
> 
>
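
For reference, a rough arithmetic check on the addresses in the quoted
trap frame and backtrace. It is only a sketch: the values are copied
from the report, but reading r5 in the in6ifa_ifwithaddr() frame as the
start of the IPv6 header, and r6 in the ether_demux() frame as the start
of the Ethernet frame, is an assumption based on the numbers and not
something confirmed from the source. The helper names misalign and delta
are made up for this illustration.

```sh
#!/bin/sh
# Values copied from the quoted panic output.
far=0xdb43ab76     # FAR: faulting address (also r6 in the trap frame)
r7=0xdb43ab66      # r7 in the trap frame
r5=0xdb43ab5e      # r5 in the in6ifa_ifwithaddr() frame (ASSUMED: IPv6 header start)
ether=0xdb43ab50   # r6 in the ether_demux() frame (ASSUMED: Ethernet frame start)

# Print an address's offset within a 4-byte word (0 means 4-byte aligned).
misalign() {
    echo $(( $1 & 3 ))
}

# Print the distance in bytes from the second address to the first.
delta() {
    echo $(( $1 - $2 ))
}

echo "FAR   offset in word: $(misalign "$far")"
echo "r5    offset in word: $(misalign "$r5")"
echo "ether offset in word: $(misalign "$ether")"
echo "r5  - ether: $(delta "$r5" "$ether")"
echo "r7  - r5:    $(delta "$r7" "$r5")"
echo "FAR - r5:    $(delta "$far" "$r5")"
```

Under those assumptions the distances come out as 14, 8 and 24 bytes,
which match the size of an Ethernet header and the offsets of the
source and destination addresses in an IPv6 header. The faulting
address is 2 bytes past a 4-byte boundary, so a word-sized load from it
would be unaligned. That would fit a word-aligned frame followed by a
14-byte Ethernet header, but it is an inference from the register
values only.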