Re: Official armv7 PkgBase kernel-NODEBUG installation's USB2 boot gets "Fatal kernel mode data abort: 'Alignment Fault' on write" very early, at least on an OrangePi+ 2ed

From: Mark Millard <marklmi_at_yahoo.com>
Date: Thu, 05 Dec 2024 02:35:24 UTC

On Dec 2, 2024, at 23:38, Mark Millard <marklmi@yahoo.com> wrote:

> Top post identifying a new context:
> 
> Now that stable/14 is based on LLVM19, stable/14 is broken
> like main [so: 15] was, at least in part.

While I've not tested stable/13 for the failure, it also got
an update to use LLVM 19. So both of stable/1[34] likely
need updates.

> Some MFC activity looks to be required in order to boot armv7
> via stable/14 now. More may be required.

It looks to me like the following commits (which have a mix
of MFC wait times: 1 month/4 weeks vs. 1 week) span the issues
in main (and somewhat more: some VFP state corruption issues).

(Columns: commit message | author | age | files | lines)
* arm: Fix VFP state corruption during signal delivery | Michal Meloun | 9 days | 1 | -18/+24
* arm: link all .rodata variants into one output section | Michal Meloun | 2024-11-17 | 1 | -1/+1
* arm: align data section to the supersection. | Michal Meloun | 2024-11-17 | 1 | -3/+1
* arm: add read_frequently, read_mostly and exclusive_cache_line sections to li... | Michal Meloun | 2024-11-17 | 1 | -0/+15
* arm: fix symbols around the .ARM.exidx section | Michal Meloun | 2024-11-17 | 1 | -0/+1
* arm: Fix typo in ldscript.arm. | Michal Meloun | 2024-11-17 | 1 | -1/+1
* arm: switch the BUSDMA buffers to normal uncached memory | Michal Meloun | 2024-11-11 | 1 | -1/+1

When I looked, it did not seem that any of the 1-week MFCs
involved had made it into stable/1[34] yet.

> The failure looks like:
> 
> . . .
> mmc1: <MMC/SD bus> on aw_mmc1
> mmc1: No compatible cards found on bus
> aw_mmc1: Spurious interrupt - no active request, rint: 0x00000004
> 
> mmc2: <MMC/SD bus> on aw_mmc0
> mmcsd1: 32GB <SDHC SL32G 8.0 SN 006A919A MFG 02/2015 by 3 SD> at mmc2 50.0MHz/4bit/32768-block
> mmc2: Failed to set VCCQ for card at relative address 43690
> uhub0: 1 port with 1 removable, self powered
> uhub2: 1 port with 1 removable, self powered
> uhub5: 1 port with 1 removable, self powered
> uhub8: 1 port with 1 removable, self powered
> Root mount waiting for: usbus3
> Fatal kernel mode data abort: 'Alignment Fault' on write
> trapframe: 0xc6b7dc10
> FSR=00000801, FAR=db0b901b, spsr=20000013
> r0 =db0b9000, r1 =00000000, r2 =00000006, r3 =00000024
> r4 =db058c80, r5 =00000000, r6 =00000001, r7 =00000006
> r8 =c6b7dd20, r9 =c0b324fc, r10=c08ef8dc, r11=c6b7dcb8
> r12=00000000, ssp=c6b7dca0, slr=c019f774, pc =c019f524
> 
> panic: Fatal abort
> cpuid = 1
> time = 3
> KDB: stack backtrace:
> Fatal kernel mode data abort: 'Translation Fault (L1)' on read
> trapframe: 0xc6b7d970
> FSR=00000005, FAR=a3e89ab0, spsr=200001d3
> r0 =c6b7da24, r1 =00000001, r2 =a3e89aad, r3 =63622c6d
> r4 =c0866e44, r5 =f3bea0b1, r6 =0000a776, r7 =81000000
> r8 =c0813294, r9 =c0b229c4, r10=c6b7db1c, r11=c6b7da18
> r12=c6b7dad8, ssp=c6b7da00, slr=c0665720, pc =c066974c
> panic: Fatal abort
> cpuid = 1
> time = 3
> KDB: stack backtrace:
> Fatal kernel mode data abort: 'Translation Fault (L1)' on read
> trapframe: 0xc6b7d6f0
> FSR=00000005, FAR=a3e89ab0, spsr=200001d3
> r0 =c6b7d7a4, r1 =00000001, r2 =a3e89aad, r3 =63622c6d
> r4 =c0866e44, r5 =f3bea0b1, r6 =0000a776, r7 =81000000
> r8 =c0813294, r9 =c0b229c4, r10=c6b7d89c, r11=c6b7d798
> r12=c6b7d858, ssp=c6b7d780, slr=c0665720, pc =c066974c
> 
> panic: Fatal abort
> cpuid = 1
> time = 3
> KDB: stack backtrace:
> Fatal kernel mode data abort: 'Translation Fault (L1)' on read
> trapframe: 0xc6b7d470
> FSR=00000005, FAR=a3e89ab0, spsr=200001d3
> r0 =c6b7d524, r1 =00000001, r2 =a3e89aad, r3 =63622c6d
> r4 =c0866e44, r5 =f3bea0b1, r6 =0000a776, r7 =81000000
> r8 =c0813294, r9 =c0b229c4, r10=c6b7d61c, r11=c6b7d518
> r12=c6b7d5d8, ssp=c6b7d500, slr=c0665720, pc =c066974c
> 
> 
> After that the L1 translation fault repeats over and over.
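
As a reading aid for the FSR values in the traces above, here is a
minimal sketch (not kernel code) of how they decode. It assumes the
ARMv7-A short-descriptor DFSR layout: FS[3:0] in bits 3:0, FS[4] in
bit 10 and WnR (write-not-read) in bit 11. The helper names are
hypothetical, and only the status codes that show up in this thread
are named.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: ARMv7-A short-descriptor DFSR (hypothetical helpers). */

/* Combine FS[4] (bit 10) with FS[3:0] (bits 3:0) into a 5-bit status. */
static inline uint32_t
fsr_status(uint32_t fsr)
{
	return ((((fsr >> 10) & 0x1u) << 4) | (fsr & 0xfu));
}

/* WnR (bit 11): set when the aborted access was a write. */
static inline bool
fsr_is_write(uint32_t fsr)
{
	return (((fsr >> 11) & 0x1u) != 0);
}

/* Name only the status codes that appear in the traces. */
static inline const char *
fsr_name(uint32_t status)
{
	switch (status) {
	case 0x01:
		return ("Alignment Fault");
	case 0x05:
		return ("Translation Fault (L1)");
	case 0x07:
		return ("Translation Fault (L2)");
	default:
		return ("other (not decoded in this sketch)");
	}
}
```

With these, FSR=00000801 gives status 0x01 with the write bit set
(the 'Alignment Fault' on write), and FSR=00000005 gives status 0x05
with the write bit clear (the 'Translation Fault (L1)' on read).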
> 
> 
> 
> On Nov 8, 2024, at 04:49, Michal Meloun <mmel@freebsd.org> wrote:
> 
>> On 08.11.2024 4:15, Mark Millard wrote:
>>> [I narrowed the artifact kernel range for the change in the type of
>>> failure that happens.]
>>> On Nov 7, 2024, at 17:43, Mark Millard <marklmi@yahoo.com> wrote:
>>>> [The change to LLVM 19 is what leads to the 'Alignment
>>>> Fault' on write failure. Details later below.]
>>>> 
>>>> On Nov 7, 2024, at 01:42, Mark Millard <marklmi@yahoo.com> wrote:
>>>> 
>>>>> Note: Unfortunately, the panics here are too early for a
>>>>> dump device to be available.
>>>>> 
>>>>> Context: I started the PkgBase upgrade from:
>>>>> 
>>>>> # uname -apKU
>>>>> FreeBSD OPiP2E-RPi2v1p1 15.0-CURRENT FreeBSD 15.0-CURRENT main-n272821-37798b1d5dd1 GENERIC-NODEBUG arm armv7 1500025 1500025
>>>>> 
>>>>> Installed packages to be UPGRADED:
>>>>>      FreeBSD-dtb: 15.snap20241009161500 -> 15.snap20241028121139 [base]
>>>>>      FreeBSD-kernel-generic: 15.snap20241011221604 -> 15.snap20241106134422 [base]
>>>>>      FreeBSD-kernel-generic-dbg: 15.snap20241011221604 -> 15.snap20241106134422 [base]
>>>>>      FreeBSD-kernel-generic-mmccam: 15.snap20241011221604 -> 15.snap20241106134422 [base]
>>>>>      FreeBSD-kernel-generic-mmccam-dbg: 15.snap20241011221604 -> 15.snap20241106134422 [base]
>>>>>      FreeBSD-kernel-generic-nodebug: 15.snap20241011221604 -> 15.snap20241106134422 [base]
>>>>>      FreeBSD-kernel-generic-nodebug-dbg: 15.snap20241011221604 -> 15.snap20241106134422 [base]
>>>>>      FreeBSD-src-sys: 15.snap20241011221604 -> 15.snap20241106160110 [base]
>>>>> 
>>>>> (Those were installed but the FreeBSD-dtb had linux 6.4
>>>>> dtb files, not the 6.8 ones. 6.8 ones from a personal build
>>>>> were copied to where they need to be. I've separately
>>>>> reported the 6.4 vs. 6.8 issue.)
>>>>> 
>>>>> # ~/pkgbase-snapshot-list.sh
>>>>> Via pkg-static info -C -x '^FreeBSD-' . . .
>>>>> 1 FreeBSD-*-15.snap20241106160110
>>>>> 6 FreeBSD-*-15.snap20241106134422
>>>>> 1 FreeBSD-*-15.snap20241028121139
>>>>> 3 FreeBSD-*-15.snap20241011221604
>>>>> 2 FreeBSD-*-15.snap20241011210446
>>>>> 38 FreeBSD-*-15.snap20241011182434
>>>>> 4 FreeBSD-*-15.snap20241011073851
>>>>> 5 FreeBSD-*-15.snap20241010141501
>>>>> 1 FreeBSD-*-15.snap20241010120743
>>>>> 296 FreeBSD-*-15.snap20241009161500
>>>>> Instead via /var/cache/pkg/*.snap*.pkg . . .
>>>>> 1 FreeBSD-*-15.snap20241106160110
>>>>> 6 FreeBSD-*-15.snap20241106134422
>>>>> 1 FreeBSD-*-15.snap20241028121139
>>>>> 10 FreeBSD-*-15.snap20241011221604
>>>>> 2 FreeBSD-*-15.snap20241011210446
>>>>> 38 FreeBSD-*-15.snap20241011182434
>>>>> 4 FreeBSD-*-15.snap20241011073851
>>>>> 5 FreeBSD-*-15.snap20241010141501
>>>>> 1 FreeBSD-*-15.snap20241010120743
>>>>> 297 FreeBSD-*-15.snap20241009161500
>>>>> 
>>>>> 
>>>>> The failure (kernel-GENERIC-NODEBUG):
>>>>> 
>>>>> . . .
>>>>> Root mount waiting for: usbus3 CAM
>>>>> Fatal kernel mode data abort: 'Alignment Fault' on write
>>>>> trapframe: 0xc6c9ac10
>>>>> FSR=00000801, FAR=db23209b, spsr=20000013
>>>>> r0 =db232080, r1 =00000000, r2 =00000006, r3 =00000024
>>>>> r4 =db19e280, r5 =00000000, r6 =00000001, r7 =00000006
>>>>> r8 =c6c9ad20, r9 =c0b7973c, r10=c092074c, r11=c6c9acb8
>>>>> r12=00000000, ssp=c6c9aca0, slr=c01b01d8, pc =c01aff88
>>>>> 
>>>>> panic: Fatal abort
>>>>> cpuid = 1
>>>>> time = 3
>>>>> KDB: stack backtrace:
>>>>> db_trace_self() at db_trace_self
>>>>>       pc = 0xc0667004  lr = 0xc0078630 (db_trace_self_wrapper+0x30)
>>>>>       sp = 0xc6c9a9c8  fp = 0xc6c9aae0
>>>>> db_trace_self_wrapper() at db_trace_self_wrapper+0x30
>>>>>       pc = 0xc0078630  lr = 0xc0328db8 (vpanic+0x140)
>>>>>       sp = 0xc6c9aae8  fp = 0xc6c9ab08
>>>>>       r4 = 0x00000100  r5 = 0x00000000
>>>>>       r6 = 0xc084d1f1  r7 = 0xc0b69a94
>>>>> vpanic() at vpanic+0x140
>>>>>       pc = 0xc0328db8  lr = 0xc0328c78 (vpanic)
>>>>>       sp = 0xc6c9ab10  fp = 0xc6c9ab14
>>>>>       r4 = 0xc6c9ac10  r5 = 0x00000013
>>>>>       r6 = 0xdb23209b  r7 = 0x00000001
>>>>>       r8 = 0x00000801  r9 = 0x00000013
>>>>>      r10 = 0xdb23209b
>>>>> vpanic() at vpanic
>>>>>       pc = 0xc0328c78  lr = 0xc068c8e8 (abort_align)
>>>>>       sp = 0xc6c9ab1c  fp = 0xc6c9ab48
>>>>>       r4 = 0x00000001  r5 = 0x00000801
>>>>>       r6 = 0x00000013  r7 = 0xdb23209b
>>>>>       r8 = 0xc6c9ab14  r9 = 0xc0328c78
>>>>>      r10 = 0xc6c9ab1c
>>>>> abort_align() at abort_align
>>>>>       pc = 0xc068c8e8  lr = 0xc068c958 (abort_align+0x70)
>>>>>       sp = 0xc6c9ab50  fp = 0xc6c9ab68
>>>>>       r4 = 0xc6d21c00 r10 = 0xdb23209b
>>>>> abort_align() at abort_align+0x70
>>>>>       pc = 0xc068c958  lr = 0xc068c5e0 (abort_handler+0x430)
>>>>>       sp = 0xc6c9ab70  fp = 0xc6c9ac08
>>>>>       r4 = 0x00000000 r10 = 0xdb23209b
>>>>> abort_handler() at abort_handler+0x430
>>>>>       pc = 0xc068c5e0  lr = 0xc0669868 (exception_exit)
>>>>>       sp = 0xc6c9ac10  fp = 0xc6c9acb8
>>>>>       r4 = 0xdb19e280  r5 = 0x00000000
>>>>>       r6 = 0x00000001  r7 = 0x00000006
>>>>>       r8 = 0xc6c9ad20  r9 = 0xc0b7973c
>>>>>      r10 = 0xc092074c
>>>>> exception_exit() at exception_exit
>>>>>       pc = 0xc0669868  lr = 0xc01b01d8 (usb_msc_auto_quirk+0xfc)
>>>>>       sp = 0xc6c9aca0  fp = 0xc6c9acb8
>>>>>       r0 = 0xdb232080  r1 = 0x00000000
>>>>>       r2 = 0x00000006  r3 = 0x00000024
>>>>>       r4 = 0xdb19e280  r5 = 0x00000000
>>>>>       r6 = 0x00000001  r7 = 0x00000006
>>>>>       r8 = 0xc6c9ad20  r9 = 0xc0b7973c
>>>>>      r10 = 0xc092074c r12 = 0x00000000
>>>>> bbb_command_start() at bbb_command_start+0x4c
>>>>>       pc = 0xc01aff88  lr = 0xc01b01d8 (usb_msc_auto_quirk+0xfc)
>>>>>       sp = 0xc6c9acc0  fp = 0xc6c9acf8
>>>>>       r4 = 0xdb16d800  r5 = 0xdb19e280
>>>>>       r6 = 0x00000001 r10 = 0xc092074c
>>>>> usb_msc_auto_quirk() at usb_msc_auto_quirk+0xfc
>>>>>       pc = 0xc01b01d8  lr = 0xc01a4bd8 (usb_alloc_device+0x9c4)
>>>>>       sp = 0xc6c9ad00  fp = 0xc6c9ad68
>>>>>       r4 = 0x00000000  r5 = 0x00000001
>>>>>       r6 = 0x00000000  r7 = 0x00000002
>>>>>       r8 = 0xdb16d800  r9 = 0xda241c78
>>>>>      r10 = 0x000003ee
>>>>> usb_alloc_device() at usb_alloc_device+0x9c4
>>>>>       pc = 0xc01a4bd8  lr = 0xc01ad16c (uhub_explore+0x494)
>>>>>       sp = 0xc6c9ad70  fp = 0xc6c9adc0
>>>>>       r4 = 0x00000000  r5 = 0x00000000
>>>>>       r6 = 0xdb16e800  r7 = 0x00000000
>>>>>       r8 = 0xdb18c200  r9 = 0x00000001
>>>>>      r10 = 0x00000000
>>>>> uhub_explore() at uhub_explore+0x494
>>>>>       pc = 0xc01ad16c  lr = 0xc0198654 (usb_bus_explore+0x1d4)
>>>>>       sp = 0xc6c9adc8  fp = 0xc6c9add8
>>>>>       r4 = 0xda241c78  r5 = 0xdb16e800
>>>>>       r6 = 0x00000000  r7 = 0xda241d6c
>>>>>       r8 = 0xc09b0b5f  r9 = 0x00000001
>>>>>      r10 = 0xda241d1c
>>>>> usb_bus_explore() at usb_bus_explore+0x1d4
>>>>>       pc = 0xc0198654  lr = 0xc01b22d0 (usb_process+0x124)
>>>>>       sp = 0xc6c9ade0  fp = 0xc6c9ae10
>>>>>       r4 = 0xda241d0c  r5 = 0xda241d14
>>>>> usb_process() at usb_process+0x124
>>>>>       pc = 0xc01b22d0  lr = 0xc02da4f0 (fork_exit+0xb0)
>>>>>       sp = 0xc6c9ae18  fp = 0xc6c9ae38
>>>>>       r4 = 0xc6c9ae40  r5 = 0xc6d21c00
>>>>>       r6 = 0xc6d08740  r7 = 0xda241d0c
>>>>>       r8 = 0xc01b21ac  r9 = 0x00000000
>>>>>      r10 = 0x00000000
>>>>> fork_exit() at fork_exit+0xb0
>>>>>       pc = 0xc02da4f0  lr = 0xc06697fc (swi_exit)
>>>>>       sp = 0xc6c9ae40  fp = 0x00000000
>>>>>       r4 = 0xc01b21ac  r5 = 0xda241d0c
>>>>>       r6 = 0x00000000  r7 = 0x00000000
>>>>>       r8 = 0x00000000 r10 = 0x00000000
>>>>> swi_exit() at swi_exit
>>>>>       pc = 0xc06697fc  lr = 0xc06697fc (swi_exit)
>>>>>       sp = 0xc6c9ae40  fp = 0x00000000
>>>>> KDB: enter: panic
>>>>> [ thread pid 14 tid 100069 ]
>>>>> Stopped at      kdb_enter+0x54: ldrb    r15, [r15, r15, ror r15]!
>>>>> db>
>>>> 
>>>> Using just available official artifact kernels for testing
>>>> I've established that 0953460ce149 (and various from before
>>>> that) does not have the problem:
>>>> 
>>>> Wed, 23 Oct 2024
>>>>   • git: 5c92f84bb607 - main - LinuxKPI: update rcu_dereference_*() and lockdep_is_held() Bjoern A. Zeeb
>>>>   • git: 6fa91acca40d - main - conf/NOTES: Remove trailing whitespace Li-Wen Hsu
>>>>   • git: 91b7b225b2ce - main - LINT: Add mac_do Li-Wen Hsu
>>>>   • git: 419249c1cacc - main - Revert "LINT: Add mac_do" Li-Wen Hsu
>>>>   • Re: git: 419249c1cacc - main - Revert "LINT: Add mac_do" Baptiste Daroussin
>>>>   • Re: git: 13da1af1cd67 - main - libcxxrt: Update to upstream 698997bfde1f John Baldwin
>>>>   • Re: git: 419249c1cacc - main - Revert "LINT: Add mac_do" John Baldwin
>>>>   • git: 0953460ce149 - main - libc: fix access mode tests in fmemopen(3) Ed Maste
>>>> 
>>>> So the above one worked.
>>>> 
>>>> The next available kernel to test was f3dbef108212 (the bump for LLVM19
>>>> at the end of the below):
>>>> 
>>>>   • RE: git: 6a07e67fb7a8 - main - vm_meter: Fix laundry accounting Mark Millard
>>>>   • git: 6b9f7133aba4 - main - libc: Add one more check in new fmemopen test Ed Maste
>>>>   • git: 0fca6ea1d4ee - main - Merge llvm-project main llvmorg-19-init-18630-gf2ccf80136a0 Dimitry Andric
>>>>   • git: 36b606ae6aa4 - main - Merge llvm-project release/19.x llvmorg-19.1.0-rc1-0-ga4902a36d5c2 Dimitry Andric
>>>>   • git: 3f157662c0ef - main - Tentatively apply https://github.com/llvm/llvm-project/pull/101403 Dimitry Andric
>>>>   • git: d575077527d4 - main - bsd.sys.mk: for clang >= 19, similar to gcc >= 8.1, turn off -Werror for -Wcast-function-type-mismatch. Dimitry Andric
>>>>   • git: 36d486cc2ecd - main - Fix enum warning in ath_hal's ar9002 Dimitry Andric
>>>>   • git: 6846ab2fb663 - main - libcxx simd_utils.h: only enable _LIBCPP_HAS_ALGORITHM_VECTOR_UTILS for clang >= 15, since older versions do not support the required builtins. Dimitry Andric
>>>>   • git: 81e300df5e65 - main - libcxx atomic_ref.h: add typename keyword for difference_type declarations, otherwise older clang versions cannot compile this header. Dimitry Andric
>>>>   • git: 6b4981df6008 - main - libcxx cstdlib, cwchar: avoid using long long functions if not supported, even for older compilers that do not support the using_if_exists attribute. Dimitry Andric
>>>>   • git: 2f6d6eaf2d51 - main - libcxx-compat: revert llvmorg-19-init-18063-g561246e90282: Dimitry Andric
>>>>   • git: 04f5b79cfa49 - main - libcxx-compat: revert llvmorg-19-init-18062-g4dfa75c663e5: Dimitry Andric
>>>>   • git: e8054e44f4ca - main - libcxx-compat: revert llvmorg-19-init-17853-g578c6191eff7: Dimitry Andric
>>>>   • git: 0bec0529b1d7 - main - libcxx-compat: revert llvmorg-19-init-17728-g30cc12cd818d: Dimitry Andric
>>>>   • git: e8847079df1b - main - libcxx-compat: revert llvmorg-19-init-17727-g0eebb48fcfbc: Dimitry Andric
>>>>   • git: 2f2ebe758bea - main - libcxx-compat: revert llvmorg-19-init-17473-g69fecaa1a455: Dimitry Andric
>>>>   • git: 1199d38d8ec7 - main - libcxx-compat: revert llvmorg-19-init-8667-g472b612ccbed: Dimitry Andric
>>>>   • git: a7b2d7f261b8 - main - libcxx-compat: revert llvmorg-19-init-5639-ga10aa4485e83: Dimitry Andric
>>>>   • git: f3859a1a13a1 - main - libcxx-compat: revert llvmorg-19-init-4504-g937a5396cf3e: Dimitry Andric
>>>>   • git: 072b5fb698ab - main - libcxx-compat: revert llvmorg-19-init-4003-g55357160d0e1: Dimitry Andric
>>>>   • git: b60301d8b594 - main - libcxx-compat: don't remove headers that were reintroduced by reverts Dimitry Andric
>>>>   • git: 2e861daab905 - main - libcxx-compat: install headers that were reintroduced by reverts Dimitry Andric
>>>>   • git: ff6c8447844b - main - libcxx-compat: update libcxx.imp for headers that were reintroduced by reverts Dimitry Andric
>>>>   • git: 52418fc2be8e - main - Merge llvm-project release/19.x llvmorg-19.1.0-rc2-0-gd033ae172d1c Dimitry Andric
>>>>   • git: 62987288060f - main - Merge llvm-project release/19.x llvmorg-19.1.0-rc3-0-g437434df21d8 Dimitry Andric
>>>>   • git: 6c4b055cfb6b - main - Merge llvm-project release/19.x llvmorg-19.1.0-rc4-0-g0c641568515a Dimitry Andric
>>>>   • git: 835c3a3e69af - main - Merge commit 6dbdb8430b49 from llvm git (by Nikolas Klauser): Dimitry Andric
>>>>   • git: c80e69b00d97 - main - Merge llvm-project release/19.x llvmorg-19.1.0-0-ga4bf6cd7cfb1 Dimitry Andric
>>>>   • git: 6e516c87b6d7 - main - Merge llvm-project release/19.x llvmorg-19.1.1-0-gd401987fe349 Dimitry Andric
>>>>   • git: 5deeebd8c6ca - main - Merge llvm-project release/19.x llvmorg-19.1.2-0-g7ba7d8e2f7b6 Dimitry Andric
>>>>   • git: f3dbef108212 - main - Bump __FreeBSD_version for llvm 19.1.2 merge Dimitry Andric
>>>> 
>>>> f3dbef108212 gets the:
>>>> 
>>>> "Fatal kernel mode data abort: 'Alignment Fault' on write"
>>>> 
>>>> boot failure for artifact kernel. 6b9f7133aba4 does not
>>>> seem a likely source of the problem, basically leaving the
>>>> LLVM changes as what is at issue.
>>>> 
>>>> I'll note that artifact kernels are witness kernels. So
>>>> this exploration adds to the distinctions observed
>>>> compared to the prior notes.
>>>> 
>>>>> Looking at bbb_command_start() 's pc:
>>>>> 
>>>>> # llvm-addr2line -e /boot/kernel.GENERIC-NODEBUG/kernel 0xc01aff88
>>>>> /home/pkgbuild/worktrees/main/sys/dev/usb/usb_msctest.c:554
>>>>> 
>>>>> What leads to that line is:
>>>>> 
>>>>> /*------------------------------------------------------------------------*
>>>>> *      bbb_command_start - execute a SCSI command synchronously
>>>>> *
>>>>> * Return values
>>>>> * 0: Success
>>>>> * Else: Failure
>>>>> *------------------------------------------------------------------------*/
>>>>> static int
>>>>> bbb_command_start(struct bbb_transfer *sc, uint8_t dir, uint8_t lun,
>>>>>  void *data_ptr, size_t data_len, void *cmd_ptr, size_t cmd_len,
>>>>>  usb_timeout_t data_timeout)
>>>>> {
>>>>>      sc->lun = lun;
>>>>>      sc->dir = data_len ? dir : DIR_NONE;
>>>>>      sc->data_ptr = data_ptr;
>>>>>      sc->data_len = data_len;
>>>>>      sc->data_rem = data_len;
>>>>>      sc->data_timeout = (data_timeout + USB_MS_HZ);
>>>>>      sc->actlen = 0;
>>>>>      sc->error = 0;
>>>>>      sc->cmd_len = cmd_len;
>>>>>      memset(&sc->cbw->CBWCDB, 0, sizeof(sc->cbw->CBWCDB));
>>>>> 
>>>>> The memset line is line 554 of sys/dev/usb/usb_msctest.c .
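
A minimal sketch of the address arithmetic behind that memset fault.
It assumes r0 in the trapframe holds sc->cbw and that the CBW follows
the 31-byte Command Block Wrapper layout of the USB Mass Storage
Bulk-Only Transport specification, with the command block at byte
offset 15. struct cbw_sketch and the helpers are hypothetical
stand-ins, not FreeBSD's definitions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical byte-wise mirror of the 31-byte CBW. All members are
 * bytes, so the compiler inserts no padding.
 */
struct cbw_sketch {
	uint8_t signature[4];
	uint8_t tag[4];
	uint8_t data_transfer_length[4];
	uint8_t flags;
	uint8_t lun;
	uint8_t cdb_length;
	uint8_t cdb[16];	/* starts at offset 15: never word aligned */
};

/* Distance of the faulting address from the assumed structure base. */
static inline uint32_t
fault_offset(uint32_t far, uint32_t base)
{
	return (far - base);
}

/* True when the address is suitable for a 4-byte (word) store. */
static inline bool
is_word_aligned(uint32_t addr)
{
	return ((addr & 0x3u) == 0);
}
```

For both reported traces (r0=db232080 with FAR=db23209b, and
r0=db0b9000 with FAR=db0b901b) the offset is 0x1b = 27 = 15 + 12.
Under the assumptions above that is the last 4 bytes of the 16-byte
command block, at an address that is not word aligned, which would be
consistent with a word-sized store from an inlined memset.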
>>>> 
>>>> The below looks to be a separate problem based on
>>>> some later FreeBSD kernel update than the above.
>>>> 
>>>>> I'll note that attempting to use the WITNESS variant of the kernel
>>>>> ( /boot/kernel/ ) gets a different, even earlier failure:
>>>>> 
>>>>> . . .
>>>>> VT: init without driver.
>>>>> panic: acquiring blockable sleep lock with spinlock or critical section held (sleep mutex) pmap @ /home/pkgbuild/worktrees/main/sys/arm/arm/pmap-v6.c:6455
>>>> 
>>>> I do know that d021d3b3c675 at the end of the below
>>>> shows this failure --before the system has a chance
>>>> to get the usb related write alignment failure
>>>> reported above.
>>>> 
>>>> I have not explored where in the below range the
>>>> behavior changes (for what is available as an
>>>> official artifact kernel). It seems unlikely that
>>>> any of the below would actually boot: it is likely
>>>> a question of which of the 2 (or more) failure
>>>> types happen for each instead.
>>> The last before "Thu, 24 Oct 2024" was:
>>>        • git: 8b2e7da70855 - main - llvm19: permit incremental builds from llvm18 Brooks Davis
>>> That is the last available artifact kernel that gets the
>>> original usb related write alignment type of failure.
>>>> Thu, 24 Oct 2024
>>>>   • git: 34951b0b9e78 - main - swap_pager: move scan_all_shadowed, use iterators Doug Moore
>>>>   • git: 2ac21f2c98ed - main - x86 specialreg.h: visually align %cr4 and MSR_EFER bit mask definitions Konstantin Belousov
>>>>   • git: cc11bc1150d5 - main - x86 specialreg.h: add all defined bits for %cr4 Konstantin Belousov
>>>>   • git: cc4b25f10211 - main - x86 specialreg: reorder %cr3 bits masks definitions by value Konstantin Belousov
>>>>   • git: 5999b74e9637 - main - x86 specialreg: add bit masks definitions for LAM in %cr3 Konstantin Belousov
>>>>   • git: 6308db659f2a - main - x86 specialreg: add bit masks definitions for EFER features Konstantin Belousov
>>>>   • git: 9f718b57b846 - main - x86 specialreg: add bit masks definitions for LASS and LAM features Konstantin Belousov
>>>>   • git: 3360a15898ce - main - net: route: convert routing statistics to a sysctl Kyle Evans
>>>>   • Re: git: 3360a15898ce - main - net: route: convert routing statistics to a sysctl Kyle Evans
>>>>   • git: 77b70ad751df - main - e1000: Move I219 LM19/V19 to ADL Kevin Bowling
>>> The last above is the first available artifact kernel that
>>> gets the different error. There are no armv7 artifact
>>> kernels between 8b2e7da70855 and 77b70ad751df .
>>> So something from 34951b0b9e78 .. 77b70ad751df leads to
>>> the change in the type of failure. I've no clue what.
>>> It looked to me like the x86 commits and e1000 commit had
>>> no chance of contributing to the armv7 context. That is
>>> how I chose whom to add to the CC.
>>>>   • git: d64442a89896 - main - arm{,64}: use genassym for INTR_ROOT_* values Kyle Evans
>>>>   • git: 536c8d948e85 - main - intrng: change multi-interrupt root support type to enum Kyle Evans
>>>>   • git: 4f12b529f404 - main - sys/intr.h: formally depend on machine/intr.h Kyle Evans
>>>>   • git: a5b1eecbed07 - main - Apply workaround for building llvm-project with WITHOUT_LLVM_ASSERTIONS Dimitry Andric
>>>>   • git: 1c83996beda7 - main - Adjust LLVM_ENABLE_ABI_BREAKING_CHECKS depending on NDEBUG Dimitry Andric
>>>>   • git: b2dd4970c7b5 - main - dev/gpio: Mask all pl011 interrupts Andrew Turner
>>>>   • git: 3b03e1bb8615 - main - intrng: Store the IPI priority Andrew Turner
>>>>   • git: 6204391e99ca - main - arm64: Check TDP_NOFAULTING in a data abort Andrew Turner
>>>>   • git: a84653c5db25 - main - arm64: Don't enable interrupts when in a spinlock Andrew Turner
>>>>   • git: d7f930b80e89 - main - arm64: Implement efi_rt_arch_call Andrew Turner
>>>>   • git: 8efb1500d4f1 - main - arm64: Enable handling EFI runtime service faults Andrew Turner
>>>>   • git: 9693241188aa - main - sound: Call DSP_REGISTERED before PCM_DETACHING Christos Margiolis
>>>>   • git: bb5e3ac1a7b7 - main - sound: Use DSP_REGISTERED in dsp_clone() Christos Margiolis
>>>>   • git: a4111e9dc722 - main - sound: Change PCMDIR_* numbering Christos Margiolis
>>>>   • git: 802c78f5194e - main - sound: Untangle dsp_cdevs[] and dsp_unit2name() confusion Christos Margiolis
>>>>   • git: b1bb6934bb87 - main - sound: Fix build error in chm_mkname() KASSERT Christos Margiolis
>>>>   • git: ce20b48a60fb - main - sctp: improve debug output Michael Tuexen
>>>>   • git: e4ac0183a1a8 - main - sctp: cleanup Michael Tuexen
>>>>   • git: 8c8ebbb04518 - main - bhyve ahci: Improve robustness of TRIM handling John Baldwin
>>>>   • git: f0bc751d6fb4 - main - csa: Use pci_find_device to simplify clkrun_hack John Baldwin
>>>>   • git: d96ba5a62365 - main - config: Remove a stray semicolon Zhenlei Huang
>>>>   • git: 56b17de1e836 - main - makefs: Remove a stray semicolon Zhenlei Huang
>>>>   • git: 88b71d1fe054 - main - arm64: rockchip: Remove a stray semicolon Zhenlei Huang
>>>>   • git: b4856b8e9d87 - main - LinuxKPI: Remove stray semicolons Zhenlei Huang
>>>>   • git: 75ff90814aec - main - enic: Remove a stray semicolon Zhenlei Huang
>>>>   • git: 6ccf4f4071c5 - main - mana: Remove stray semicolons Zhenlei Huang
>>>>   • git: 86a2c910c05c - main - mpi3mr: Remove a stray semicolon Zhenlei Huang
>>>>   • git: 36756195a342 - main - ocs_fc: Remove a stray semicolon Zhenlei Huang
>>>>   • git: 2f395cfda8b5 - main - tcp cc: Remove a stray semicolon Zhenlei Huang
>>>>   • git: f3a097d0312c - main - netstat: switch to using the sysctl-exported stats for live stats Kyle Evans
>>>>   • git: 656991b0c629 - main - locks: augment lock_class with lc_trylock method Gleb Smirnoff
>>>>   • git: efcb2ec8cb81 - main - callout: provide CALLOUT_TRYLOCK flag Gleb Smirnoff
>>>>   • git: bffebc336f4e - main - tcp: use CALLOUT_TRYLOCK for the TCP callout Gleb Smirnoff
>>>>   • git: d021d3b3c675 - main - tcp: get rid of TDP_INTCPCALLOUT Gleb Smirnoff
>>>>> cpuid = 0
>>>>> time = 1
>>>>> KDB: stack backtrace:
>>>>> Fatal kernel mode data abort: 'Translation Fault (L1)' on read
>>>>> trapframe: 0xc0f14568
>>>>> FSR=00000005, FAR=db7fcfb1, spsr=200001d3
>>>>> r0 =c0f1465c, r1 =00000001, r2 =db7fcfae, r3 =1b000a4e
>>>>> r4 =c07fc55c, r5 =8fce1b89, r6 =00006f3e, r7 =81000000
>>>>> r8 =c07c4b6c, r9 =c094ace8, r10=c09741d8, r11=c0f14618
>>>>> r12=c0f146c4, ssp=c0f145fc, slr=c0601428, pc =c062686c
>>>>> 
>>>>> panic: Fatal abort
>>>>> cpuid = 0
>>>>> time = 1
>>>>> KDB: stack backtrace:
>>>>> Fatal kernel mode data abort: 'Translation Fault (L1)' on read
>>>>> trapframe: 0xc0f141f0
>>>>> FSR=00000005, FAR=db7fcfb1, spsr=200001d3
>>>>> r0 =c0f142e4, r1 =00000001, r2 =db7fcfae, r3 =1b000a4e
>>>>> r4 =c07fc55c, r5 =8fce1b89, r6 =00006f3e, r7 =81000000
>>>>> r8 =c07c4b6c, r9 =c094ace8, r10=c09741d8, r11=c0f142a0
>>>>> r12=c0f1434c, ssp=c0f14284, slr=c0601428, pc =c062686c
>>>>> 
>>>>> panic: Fatal abort
>>>>> cpuid = 0
>>>>> time = 1
>>>>> KDB: stack backtrace:
>>>>> Fatal kernel mode data abort: 'Translation Fault (L1)' on read
>>>>> trapframe: 0xc0f13e78
>>>>> FSR=00000005, FAR=db7fcfb1, spsr=200001d3
>>>>> r0 =c0f13f6c, r1 =00000001, r2 =db7fcfae, r3 =1b000a4e
>>>>> r4 =c07fc55c, r5 =8fce1b89, r6 =00006f3e, r7 =81000000
>>>>> r8 =c07c4b6c, r9 =c094ace8, r10=c09741d8, r11=c0f13f28
>>>>> r12=c0f13fd4, ssp=c0f13f0c, slr=c0601428, pc =c062686c
>>>>> 
>>>>> panic: Fatal abort
>>>>> cpuid = 0
>>>>> time = 1
>>>>> KDB: stack backtrace:
>>>>> Fatal kernel mode data abort: 'Translation Fault (L1)' on read
>>>>> trapframe: 0xc0f13b00
>>>>> FSR=00000005, FAR=db7fcfb1, spsr=200001d3
>>>>> r0 =c0f13bf4, r1 =00000001, r2 =db7fcfae, r3 =1b000a4e
>>>>> r4 =c07fc55c, r5 =8fce1b89, r6 =00006f3e, r7 =81000000
>>>>> r8 =c07c4b6c, r9 =c094ace8, r10=c09741d8, r11=c0f13bb0
>>>>> r12=c0f13c5c, ssp=c0f13b94, slr=c0601428, pc =c062686c
>>>>> 
>>>>> panic: Fatal abort
>>>>> cpuid = 0
>>>>> time = 1
>>>>> KDB: stack backtrace:
>>>>> Fatal kernel mode data abort: 'Translation Fault (L1)' on read
>>>>> trapframe: 0xc0f13788
>>>>> FSR=00000005, FAR=db7fcfb1, spsr=200001d3
>>>>> r0 =c0f1387c, r1 =00000001, r2 =db7fcfae, r3 =1b000a4e
>>>>> r4 =c07fc55c, r5 =8fce1b89, r6 =00006f3e, r7 =81000000
>>>>> r8 =c07c4b6c, r9 =c094ace8, r10=c09741d8, r11=c0f13838
>>>>> r12=c0f138e4, ssp=c0f1381c, slr=c0601428, pc =c062686c
>>>>> 
>>>>> . . .
>>>>> 
>>>>> Looking:
>>>>> 
>>>>> # llvm-addr2line -e /boot/kernel.GENERIC-NODEBUG/kernel 0xc062686c
>>>>> /home/pkgbuild/worktrees/main/sys/vm/uma_core.c:5676
>>>>> 
>>>>> static int
>>>>> sysctl_handle_uma_zone_frees(SYSCTL_HANDLER_ARGS)
>>>>> {
>>>>>      uma_zone_t zone = arg1;
>>>>>      uint64_t cur;
>>>>> 
>>>>>      cur = uma_zone_get_frees(zone);
>>>>>      return (sysctl_handle_64(oidp, &cur, 0, req));
>>>>> }
>>>>> 
>>>>> The "return" line is 5676 of sys/vm/uma_core.c .
>>>>> 
>>>>> 
>>>>> Also, for what leads up to:
>>>>> 
>>>>> /home/pkgbuild/worktrees/main/sys/arm/arm/pmap-v6.c:6455
>>>>> 
>>>>> /*
>>>>> *  The implementation of pmap_fault() uses IN_RANGE2() macro which
>>>>> *  depends on the fact that given range size is a power of 2.
>>>>> */
>>>>> CTASSERT(powerof2(NB_IN_PT1));
>>>>> CTASSERT(powerof2(PT2MAP_SIZE));
>>>>> 
>>>>> #define IN_RANGE2(addr, start, size)    \
>>>>>  ((vm_offset_t)(start) == ((vm_offset_t)(addr) & ~((size) - 1)))
>>>>> 
>>>>> /*
>>>>> *  Handle access and R/W emulation faults.
>>>>> */
>>>>> int
>>>>> pmap_fault(pmap_t pmap, vm_offset_t far, uint32_t fsr, int idx, bool usermode)
>>>>> {
>>>>>      pt1_entry_t *pte1p, pte1;
>>>>>      pt2_entry_t *pte2p, pte2;
>>>>> 
>>>>>      if (pmap == NULL)
>>>>>              pmap = kernel_pmap;
>>>>> 
>>>>>      /*
>>>>>       * In kernel, we should never get abort with FAR which is in range of
>>>>>       * pmap->pm_pt1 or PT2MAP address spaces. If it happens, stop here
>>>>>       * and print out a useful abort message and even get to the debugger
>>>>>       * otherwise it likely ends with never ending loop of aborts.
>>>>>       */
>>>>>      if (__predict_false(IN_RANGE2(far, pmap->pm_pt1, NB_IN_PT1))) {
>>>>>              /*
>>>>>               * All L1 tables should always be mapped and present.
>>>>>               * However, we check only current one herein. For user mode,
>>>>>               * only permission abort from malicious user is not fatal.
>>>>>               * And alignment abort as it may have higher priority.
>>>>>               */
>>>>>              if (!usermode || (idx != FAULT_ALIGN && idx != FAULT_PERM_L2)) {
>>>>>                      CTR4(KTR_PMAP, "%s: pmap %#x pm_pt1 %#x far %#x",
>>>>>                          __func__, pmap, pmap->pm_pt1, far);
>>>>>                      panic("%s: pm_pt1 abort", __func__);
>>>>>              }
>>>>>              return (KERN_INVALID_ADDRESS);
>>>>>      }
>>>>>      if (__predict_false(IN_RANGE2(far, PT2MAP, PT2MAP_SIZE))) {
>>>>>              /*
>>>>>               * PT2MAP should be always mapped and present in current
>>>>>               * L1 table. However, only existing L2 tables are mapped
>>>>>               * in PT2MAP. For user mode, only L2 translation abort and
>>>>>               * permission abort from malicious user is not fatal.
>>>>>               * And alignment abort as it may have higher priority.
>>>>>               */
>>>>>              if (!usermode || (idx != FAULT_ALIGN &&
>>>>>                  idx != FAULT_TRAN_L2 && idx != FAULT_PERM_L2)) {
>>>>>                      CTR4(KTR_PMAP, "%s: pmap %#x PT2MAP %#x far %#x",
>>>>>                          __func__, pmap, PT2MAP, far);
>>>>>                      panic("%s: PT2MAP abort", __func__);
>>>>>              }
>>>>>              return (KERN_INVALID_ADDRESS);
>>>>>      }
>>>>> 
>>>>>      /*
>>>>>       * A pmap lock is used below for handling of access and R/W emulation
>>>>>       * aborts. They were handled by atomic operations before so some
>>>>>       * analysis of new situation is needed to answer the following question:
>>>>>       * Is it safe to use the lock even for these aborts?
>>>>>       *
>>>>>       * There may happen two cases in general:
>>>>>       *
>>>>>       * (1) Aborts while the pmap lock is locked already - this should not
>>>>>       * happen as pmap lock is not recursive. However, under pmap lock only
>>>>>       * internal kernel data should be accessed and such data should be
>>>>>       * mapped with A bit set and NM bit cleared. If double abort happens,
>>>>>       * then a mapping of data which has caused it must be fixed. Further,
>>>>>       * all new mappings are always made with A bit set and the bit can be
>>>>>       * cleared only on managed mappings.
>>>>>       *
>>>>>       * (2) Aborts while another lock(s) is/are locked - this already can
>>>>>       * happen. However, there is no difference here if it's either access or
>>>>>       * R/W emulation abort, or if it's some other abort.
>>>>>       */
>>>>> 
>>>>>      PMAP_LOCK(pmap);
>>>>> 
>>>>> That "PMAP_LOCK(pmap);" line is line 6455 of sys/arm/arm/pmap-v6.c .
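
A minimal sketch of why the IN_RANGE2() macro quoted above depends on
the range size being a power of 2 (the reason for the CTASSERTs).
This is a user-space stand-in: uintptr_t replaces vm_offset_t, and
in_range_naive() is a hypothetical reference check added only for
comparison.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Same shape as the kernel macro, with uintptr_t in place of
 * vm_offset_t. Masking off the low bits of addr recovers the start
 * of its size-aligned block, but only when size is a power of 2 and
 * start is aligned to size.
 */
#define IN_RANGE2(addr, start, size)				\
	((uintptr_t)(start) ==					\
	    ((uintptr_t)(addr) & ~((uintptr_t)(size) - 1)))

/* Hypothetical reference check that works for any size. */
static inline bool
in_range_naive(uintptr_t addr, uintptr_t start, uintptr_t size)
{
	return (addr >= start && addr - start < size);
}
```

With start 0x4000 and size 0x1000, IN_RANGE2(0x4abc, ...) is true and
IN_RANGE2(0x5000, ...) is false, matching the naive check. With a
size of 0x0c00 (not a power of 2), address 0x4400 lies inside the
range but IN_RANGE2() reports false, because the mask no longer
clears every offset bit.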
>>>>> 
>>>>> 
>>>>> FYI: Running the prior kernel.GENERIC-NODEBUG/ ( called
>>>>> kernel.GENERIC-NODEBUG.good/ ) continues to operate
>>>>> normally. I do not have the older PkgBase kernel/ around
>>>>> to try, unfortunately.
>>> As a reminder, this is from using official FreeBSD builds
>>> of the kernel versions that I tested, not from my personal
>>> build context.
>>> ===
>>> Mark Millard
>>> marklmi at yahoo.com
>> Hi Mark,
>> 
>> Please see https://reviews.freebsd.org/D47485
>> 
>> Unfortunately, I see 2 problems with llvm 19.
>> 
>> The first is a regression: the compiler generates an inline memset() that accesses non-aligned data with sub-optimal instructions (with word access). This regression triggers a bug in the kernel (which should be fixed in D47485).
>> 
>> Second, the "panic: acquiring blockable sleep lock" is due to a bug in lld. It mis-links the ".ARM.exidx" section in the output binary, which is used by the stack unwinder in the kernel.
>> I don't have a fix for this for now, so you have to use the linker from llvm18 as a workaround.
>> 
>> I'm not sure if I have enough free cycles to manage both issues on the llvm side...
> 
> 




===
Mark Millard
marklmi at yahoo.com