Re: A native armv7 panic during kyua runs: sys/netinet6/exthdr:exthdr -> Fatal kernel mode data abort: 'Alignment Fault' on read
- Reply: Mark Millard : "Re: A native armv7 panic during kyua runs: sys/netinet6/exthdr:exthdr -> Fatal kernel mode data abort: 'Alignment Fault' on read"
- In reply to: Michal Meloun : "Re: A native armv7 panic during kyua runs: sys/netinet6/exthdr:exthdr -> Fatal kernel mode data abort: 'Alignment Fault' on read"
Date: Sat, 05 Aug 2023 21:04:41 UTC
On Aug 5, 2023, at 11:27, Michal Meloun <meloun.michal@gmail.com> wrote:

> Hi Mark,
> can you please try this patch?
> https://github.com/strejda/tegra/commit/bd4390c5f6a8b66b2fc83966d4fadb945a19dc23

I'll take a stab at testing it. But I'll note that the description
of the patch is somewhat odd:

QUOTE
Pack IP structures directly used for access packet data.
All structures used to access data in byte buffers shall be marked as packed.
Otherwise, this is undefined behavior - formally on every platform.
END QUOTE

__packed (and whatever it might be a macro for) is not part
of any vintage of the C standard, not even as explicitly
implementation defined nor as explicitly undefined. (C23's
"attribute specifier sequence" notation would give it an
implementation defined status, as I understand it, but not
via explicit identification of the concept of packed in the
standard.)

As far as the language is concerned, there is no guarantee
that a code generator will break accesses up into aligned
pieces and assemble the overall value when the members are
not aligned in the first place, __packed or not. Nor does
the language guarantee a lack of padding in the layout for
__packed.

Past that, it is toolchain specific whether __packed would
avoid unaligned accesses for simple member access notation
yet also avoid having pad bytes. We will see for this context.

(My history suggests a lack of overall uniformity in the
interpretations given to declaring struct's as packed --or
analogous wording for other languages that are not explicit
about it.)

> I'm sorry, but I don't have the time or energy to fully test it... I only hope the actual patch is much easier than the one listed in PR271759.

> On 05.08.2023 8:11, Mark Millard wrote:
>> On Aug 4, 2023, at 20:58, Warner Losh <imp@bsdimp.com> wrote:
>>> It might make sense to work up a patch that skips this test on armv7 after filing a bug (the usual way)....
>>>
>>> Warner
>>
>> Actually, looking at the backtrace, it seems I've previously
>> listed the same sort of backtrace structure in:
>> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=271759
>> comment 12. Hans Petter Selasky had been working on that
>> bugzilla entry. I'll add a note that this time I got it
>> with the built-in Ethernet instead of the dongle used
>> previously --and that sys/netinet6/exthdr:exthdr is a
>> way of producing the panic. [Done.]
>>
>> In /usr/main-src/tests/sys/netinet6/exthdr.sh , commenting
>> out one line would disable the specific test (leading
>> whitespace might not be preserved below):
>>
>> atf_init_test_cases()
>> {
>> # atf_add_test_case "exthdr"
>> }
>>
>> [FYI: All my kyua activity has been for FreeBSD main,
>> generally targeting contexts with some armv7 code
>> involved. It is associated with my having been a
>> tester of early lib32 drafts.]
>>
>> I already have another commented out line for an armv7
>> panic (leading whitespace might not be preserved):
>>
>> # git -C /usr/main-src/ diff tests/sys/net/
>> diff --git a/tests/sys/net/if_bridge_test.sh b/tests/sys/net/if_bridge_test.sh
>> index eb3a792df449..dcdac75103cd 100755
>> --- a/tests/sys/net/if_bridge_test.sh
>> +++ b/tests/sys/net/if_bridge_test.sh
>> @@ -675,7 +675,7 @@ atf_init_test_cases()
>> atf_add_test_case "delete_with_members"
>> atf_add_test_case "mac_conflict"
>> atf_add_test_case "stp_validation"
>> - atf_add_test_case "gif"
>> +# atf_add_test_case "gif"
>> atf_add_test_case "mtu"
>> atf_add_test_case "vlan"
>> }
>>
>> In the original discovery, having if_bridge.ko already loaded was
>> important to getting the "gif" panic.
>>
>> But I've not yet put effort into isolating a cleaner/simpler test
>> than I got the failure with. Nor have I done a range of comparisons
>> of differing contexts yet.
>>
>> There are other armv7 related issues, one in particular
>> being:
>>
>> A) All the long timeouts [300s+] are for *.py style tests. (Lots of
>> these.)
>> B) All the *.py style tests that do not have a long timeout have one of:
>> -> skipped: comment me to run the test
>> -> skipped: Current architecture 'armv7' not supported
>> __test_cases_list__ -> broken: Test program did not exit cleanly
>> __test_cases_list__ -> broken: Test case list wrote to stderr
>>
>> There are about 10 of the "comment me" ones and 1 each of the other
>> (B) ones, if I remember right.
>>
>> In other words, basically all the *.py based tests are broken or
>> skipped as kyua classifies things.
>>
>> I've no clue yet if (A) is tied to the ports':
>> cryptography/hazmat/bindings/_openssl.abi3.so
>> openssl 3 incompatibility or not. But I've only seen the
>> issue in armv7 contexts so far.
>>
>> I've spent time today on this issue but have made no progress
>> on identifying what leads to the kdump/truss-output being as
>> it is.
>>
>> If the *.py tests were working, I'd not be surprised to then
>> find more armv7 panics than is now possible via the kyua tests.
>>
>>> On Fri, Aug 4, 2023 at 12:59 AM Mark Millard <marklmi@yahoo.com> wrote:
>>> While discovered via an attempted overall kyua run, the following is
>>> sufficient to get the crash in my native armv7 context:
>>>
>>> # /usr/bin/kyua test -k /usr/tests/Kyuafile sys/netinet6/exthdr:exthdr
>>> sys/netinet6/exthdr:exthdr -> Fatal kernel mode data abort: 'Alignment Fault' on read
>>> trapframe: 0xdfb97aa0
>>> FSR=00000001, FAR=db43ab76, spsr=60000013
>>> r0 =dfedd000, r1 =dfb97b34, r2 =00000000, r3 =00000000
>>> r4 =00000000, r5 =00000000, r6 =db43ab76, r7 =db43ab66
>>> r8 =c096383c, r9 =00000000, r10=db132400, r11=dfb97b60
>>> r12=00000000, ssp=dfb97b30, slr=c0b4e2c0, pc =c04e6b70
>>>
>>> panic: Fatal abort
>>> cpuid = 0
>>> time = 1691131498
>>> KDB: stack backtrace:
>>> db_trace_self() at db_trace_self
>>> pc = 0xc065f414 lr = 0xc007db80 (db_trace_self_wrapper+0x30)
>>> sp = 0xdfb97858 fp = 0xdfb97970
>>> db_trace_self_wrapper() at db_trace_self_wrapper+0x30
>>> pc = 0xc007db80 lr = 0xc031a834 (vpanic+0x140)
>>> sp = 0xdfb97978 fp = 0xdfb97998
>>> r4 = 0x00000100 r5 = 0x00000000
>>> r6 = 0xc07c369a r7 = 0xc0b32e58
>>> vpanic() at vpanic+0x140
>>> pc = 0xc031a834 lr = 0xc031a6f4 (vpanic)
>>> sp = 0xdfb979a0 fp = 0xdfb979a4
>>> r4 = 0xdfb97aa0 r5 = 0x00000013
>>> r6 = 0xdb43ab76 r7 = 0x00000001
>>> r8 = 0x00000001 r9 = 0xdfedd000
>>> r10 = 0xdb43ab76
>>> vpanic() at vpanic
>>> pc = 0xc031a6f4 lr = 0xc06849dc (abort_align)
>>> sp = 0xdfb979ac fp = 0xdfb979d8
>>> r4 = 0x00000001 r5 = 0x00000001
>>> r6 = 0xdfedd000 r7 = 0xdb43ab76
>>> r8 = 0xdfb979a4 r9 = 0xc031a6f4
>>> r10 = 0xdfb979ac
>>> abort_align() at abort_align
>>> pc = 0xc06849dc lr = 0xc0684a50 (abort_align+0x74)
>>> sp = 0xdfb979e0 fp = 0xdfb979f8
>>> r4 = 0x00000013 r10 = 0xdb43ab76
>>> abort_align() at abort_align+0x74
>>> pc = 0xc0684a50 lr = 0xc06846a8 (abort_handler+0x45c)
>>> sp = 0xdfb97a00 fp = 0xdfb97a98
>>> r4 = 0x00000000 r10 = 0xdb43ab76
>>> abort_handler() at abort_handler+0x45c
>>> pc = 0xc06846a8 lr = 0xc0661cc8 (exception_exit)
>>> sp = 0xdfb97aa0 fp = 0xdfb97b60
>>> r4 = 0x00000000 r5 = 0x00000000
>>> r6 = 0xdb43ab76 r7 = 0xdb43ab66
>>> r8 = 0xc096383c r9 = 0x00000000
>>> r10 = 0xdb132400
>>> exception_exit() at exception_exit
>>> pc = 0xc0661cc8 lr = 0xc0b4e2c0 (__pcpu)
>>> sp = 0xdfb97b30 fp = 0xdfb97b60
>>> r0 = 0xdfedd000 r1 = 0xdfb97b34
>>> r2 = 0x00000000 r3 = 0x00000000
>>> r4 = 0x00000000 r5 = 0x00000000
>>> r6 = 0xdb43ab76 r7 = 0xdb43ab66
>>> r8 = 0xc096383c r9 = 0x00000000
>>> r10 = 0xdb132400 r12 = 0x00000000
>>> in6ifa_ifwithaddr() at in6ifa_ifwithaddr+0x30
>>> pc = 0xc04e6b70 lr = 0xc04f9030 (ip6_input+0xd38)
>>> sp = 0xdfb97b68 fp = 0xdfb97c28
>>> r4 = 0xdb43ab76 r5 = 0xdb43ab5e
>>> r6 = 0x00000000 r7 = 0xdb43ab66
>>> ip6_input() at ip6_input+0xd38
>>> pc = 0xc04f9030 lr = 0xc046d66c (netisr_dispatch_src+0xf8)
>>> sp = 0xdfb97c30 fp = 0xdfb97c58
>>> r4 = 0xdb43ab00 r5 = 0x00000006
>>> r6 = 0x00000007 r7 = 0xc0b49d50
>>> r8 = 0xdafea0c0 r9 = 0xdb43ab00
>>> r10 = 0x00000086
>>> netisr_dispatch_src() at netisr_dispatch_src+0xf8
>>> pc = 0xc046d66c lr = 0xc04641b0 (ether_demux+0x18c)
>>> sp = 0xdfb97c60 fp = 0xdfb97c78
>>> r4 = 0x00000006 r5 = 0x00001201
>>> r6 = 0xdb132400 r7 = 0x000000ff
>>> r8 = 0xdafea0c0 r9 = 0xdb43ab00
>>> r10 = 0x00000086
>>> ether_demux() at ether_demux+0x18c
>>> pc = 0xc04641b0 lr = 0xc0465880 (ether_nh_input+0x490)
>>> sp = 0xdfb97c80 fp = 0xdfb97ce0
>>> r4 = 0xdb132400 r5 = 0xdb43ab00
>>> r6 = 0xdb43ab50 r10 = 0x00000086
>>> ether_nh_input() at ether_nh_input+0x490
>>> pc = 0xc0465880 lr = 0xc046d66c (netisr_dispatch_src+0xf8)
>>> sp = 0xdfb97ce8 fp = 0xdfb97d10
>>> r4 = 0xdb43ab00 r5 = 0x00000005
>>> r6 = 0x0000000c r7 = 0xc0b49d30
>>> r8 = 0xdafea0c0 r9 = 0xdb43ab00
>>> r10 = 0xc098d18f
>>> netisr_dispatch_src() at netisr_dispatch_src+0xf8
>>> pc = 0xc046d66c lr = 0xc04645c4 (ether_input+0x50)
>>> sp = 0xdfb97d18 fp = 0xdfb97d48
>>> r4 = 0xdb43ab00 r5 = 0x00000000
>>> r6 = 0x00008803 r7 = 0x00000000
>>> r8 = 0xdafea0c0 r9 = 0xdb43ab00
>>> r10 = 0xc098d18f
>>> ether_input() at ether_input+0x50
>>> pc = 0xc04645c4 lr = 0xdffb3f08 ($a.10+0x108)
>>> sp = 0xdfb97d50 fp = 0xdfb97d78
>>> r4 = 0xdb132400 r5 = 0xdaff8b00
>>> r6 = 0xdaff8b10 r7 = 0x00000000
>>> r8 = 0x00000000 r10 = 0xc098d18f
>>> $a.10() at $a.10+0x108
>>> pc = 0xdffb3f08 lr = 0xc038cb2c (taskqueue_run_locked+0x1c4)
>>> sp = 0xdfb97d80 fp = 0xdfb97dd8
>>> r4 = 0xe0145100 r5 = 0xdaff8b2c
>>> r6 = 0xe0145150 r7 = 0x00000001
>>> r8 = 0x00000000 r9 = 0xdfb97d90
>>> r10 = 0x00000001
>>> taskqueue_run_locked() at taskqueue_run_locked+0x1c4
>>> pc = 0xc038cb2c lr = 0xc038e4e4 (taskqueue_thread_loop+0x1b0)
>>> sp = 0xdfb97de0 fp = 0xdfb97e10
>>> r4 = 0xe0145100 r5 = 0xe0145140
>>> r6 = 0xc07af4c4 r7 = 0x00000000
>>> r8 = 0xc098d18f r9 = 0x00000100
>>> r10 = 0xc0b228a0
>>> taskqueue_thread_loop() at taskqueue_thread_loop+0x1b0
>>> pc = 0xc038e4e4 lr = 0xc02cdf0c (fork_exit+0xc0)
>>> sp = 0xdfb97e18 fp = 0xdfb97e38
>>> r4 = 0xdfedd000 r5 = 0xc0b224e0
>>> r6 = 0xc038e334 r7 = 0xdffc4f54
>>> r8 = 0xdfb97e40 r9 = 0xc098d191
>>> fork_exit() at fork_exit+0xc0
>>> pc = 0xc02cdf0c lr = 0xc0661c5c (swi_exit)
>>> sp = 0xdfb97e40 fp = 0x00000000
>>> r4 = 0xc038e334 r5 = 0xdffc4f54
>>> r6 = 0xc0b45d84 r7 = 0xd73bcba0
>>> r8 = 0x00000001 r10 = 0xc0b228a0
>>> swi_exit() at swi_exit
>>> pc = 0xc0661c5c lr = 0xc0661c5c (swi_exit)
>>> sp = 0xdfb97e40 fp = 0x00000000
>>> KDB: enter: panic
>>> [ thread pid 0 tid 100230 ]
>>>
>>> For reference:
>>>
>>> # uname -apKU
>>> FreeBSD OPiP2E-RPi2v1p1 14.0-CURRENT FreeBSD 14.0-CURRENT armv7 1400093 #6 main-n264334-215bab7924f6-dirty: Tue Jul 25 23:11:39 PDT 2023 root@CA72-16Gp-ZFS:/usr/obj/BUILDs/main-CA7-nodbg-clang/usr/main-src/arm.armv7/sys/GENERIC-NODBG-CA7 arm armv7 1400093 1400093
>>>
>>> The OrangePi+ 2Ed was the type of system booted and tested.
>>>

===
Mark Millard
marklmi at yahoo.com
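
An illustrative aside on the alignment point, as a minimal sketch in
standard C. It is not code from the FreeBSD tree and not what the patch
under discussion does (that patch marks structures __packed). The
reported FAR=db43ab76 is 2-byte aligned but not 4-byte aligned. One
language-level way to read a 32-bit value at such an address is to copy
the bytes out with memcpy() rather than dereference a wider pointer,
which leaves it to the compiler to pick loads that are valid for the
target. The helper names load_u32() and is_aligned4() are hypothetical,
made up for this example.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (an assumption for illustration only):
 * fetch a 32-bit value, in host byte order, from an arbitrary byte
 * address. memcpy() makes no alignment assumption about p.
 */
static uint32_t
load_u32(const unsigned char *p)
{
	uint32_t v;

	memcpy(&v, p, sizeof(v));
	return (v);
}

/*
 * Hypothetical helper: nonzero when addr is a multiple of 4,
 * that is, when its low two bits are both zero.
 */
static int
is_aligned4(uintptr_t addr)
{
	return ((addr & (uintptr_t)3) == 0);
}
```

For example, after storing a 32-bit value at offset 2 of a byte buffer
with memcpy(), load_u32(buf + 2) returns that same value, and
is_aligned4((uintptr_t)0xdb43ab76) is 0 because the low two bits of
0x76 are binary 10.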