Re: Armv7 panic on -current, rpi2 buildworld

From: Mark Millard <marklmi_at_yahoo.com>
Date: Thu, 16 Feb 2023 19:35:45 UTC

On Feb 14, 2023, at 23:16, Mark Millard <marklmi@yahoo.com> wrote:

> On Feb 14, 2023, at 20:16, Warner Losh <imp@bsdimp.com> wrote:
> 
>> Sorry to top post... what program was dumping core? Looks like a too strict assert
> 
> Just a possible point, given recent kernel floating
> point work:
> 
> Because of Bob's note, I tried a typical build and
> test of some benchmark programs that I sometimes use.
> Some of the programs involve floating point, and some
> of those also use multithreading. (As FreeBSD and g++
> progress I tend to do this once in a while, not as
> often on armv7 as on aarch64.)
> 
> On armv7, I now get a message about a failure of an
> internal cross-check, which also leads to the program
> being stopped early. The reported failure varies from
> run to run, but the runs should not vary and should
> not fail the cross-checks, and previously did not,
> including when I last tried armv7. (Not recently.)
> 
> For the specific example failure, the initial serial
> (single thread) test with float involved works but the
> following multi-thread test in the same program fails
> and causes the program to stop when it notices there
> is a problem.
> 
> The programs that do not test floating point do not
> fail. These can involve floating point outside the
> benchmarked algorithm, but without multi-threading
> for that floating point use and without floating
> point based cross-checks.
> 
> At this point it is far from obvious to me how I
> would track down the specifics of what leads to the
> failed cross-checks. But the above suggests that
> armv7 has problems saving and restoring floating
> point context for multi-threading. I have no clue
> whether the problems are limited to floating point
> values.
> 
>> Warner
>> 
>> On Tue, Feb 14, 2023, 7:57 PM bob prohaska <fbsd@www.zefox.net> wrote:
>> Building world on an RPi2 armv7, buildworld stopped with
>> bob@www:/usr/src % panic: Called fill_fpregs while the kernel is using the VFP
>> cpuid = 0
>> time = 1676427410
>> KDB: stack backtrace:
>> db_trace_self() at db_trace_self
>>         pc = 0xc05e8160  lr = 0xc007aa04 (db_trace_self_wrapper+0x30)
>>         sp = 0xde2c5790  fp = 0xde2c58a8
>> db_trace_self_wrapper() at db_trace_self_wrapper+0x30
>>         pc = 0xc007aa04  lr = 0xc02e9c54 (vpanic+0x140)
>>         sp = 0xde2c58b0  fp = 0xde2c58d0
>>         r4 = 0x00000100  r5 = 0x00000000
>>         r6 = 0xc07372ef  r7 = 0xc0b13968
>> vpanic() at vpanic+0x140
>>         pc = 0xc02e9c54  lr = 0xc02e9a34 (dump_savectx)
>>         sp = 0xde2c58d8  fp = 0xde2c58dc
>>         r4 = 0xd70c8600  r5 = 0xde2c5e90
>>         r6 = 0xc3398090  r7 = 0xe0cfc440
>>         r8 = 0xc3398080  r9 = 0xd70c8600
>>        r10 = 0xde2c5960
>> dump_savectx() at dump_savectx
>>         pc = 0xc02e9a34  lr = 0xc05f51dc (set_regs)
>>         sp = 0xde2c58e4  fp = 0xde2c58f8
>> set_regs() at set_regs
>>         pc = 0xc05f51dc  lr = 0xc026f8f0 (elf32_get_fpregset+0x2c)
>>         sp = 0xde2c5900  fp = 0xde2c5908
>>         r4 = 0xc3398090  r5 = 0xc026f8c4
>> elf32_get_fpregset() at elf32_get_fpregset+0x2c
>>         pc = 0xc026f8f0  lr = 0xc026d848 (elf32_coredump+0x308)
>>         sp = 0xde2c5910  fp = 0xde2c5988
>>         r4 = 0xc0902a7c r10 = 0xde2c5960
>> elf32_coredump() at elf32_coredump+0x308
>>         pc = 0xc026d848  lr = 0xc02eea74 (sigexit+0xce0)
>>         sp = 0xde2c5990  fp = 0xde2c5cf8
>>         r4 = 0x0000004e  r5 = 0xdf580b60
>>         r6 = 0xdf580a78  r7 = 0xc026d540
>>         r8 = 0xdddcb2bc  r9 = 0xdf580ad4
>>        r10 = 0x00000000
>> sigexit() at sigexit+0xce0
>>         pc = 0xc02eea74  lr = 0xc02ef36c (postsig+0x128)
>>         sp = 0xde2c5d00  fp = 0xde2c5d88
>>         r4 = 0x00000006  r5 = 0xdd43fba0
>>         r6 = 0xde2c5d20  r7 = 0xde2c5d18
>>         r8 = 0xdddcb1f8  r9 = 0xdf3d9ab8
>>        r10 = 0x00000005
>> postsig() at postsig+0x128
>>         pc = 0xc02ef36c  lr = 0xc02f316c (ast_sig+0x11c)
>>         sp = 0xde2c5d90  fp = 0xde2c5e08
>>         r4 = 0xdd43fba0  r5 = 0xdddcb2bc
>>         r6 = 0xc0734d22  r7 = 0x00000000
>>         r8 = 0xdddcb1f8  r9 = 0x00000ab8
>>        r10 = 0x22530384
>> ast_sig() at ast_sig+0x11c
>>         pc = 0xc02f316c  lr = 0xc035444c (ast_handler+0xe0)
>>         sp = 0xde2c5e10  fp = 0xde2c5e28
>>         r4 = 0xde2c5e40  r5 = 0x0000000e
>>         r6 = 0x00004000  r7 = 0xc096b59c
>>         r8 = 0xdd43fba0  r9 = 0x00000001
>> ast_handler() at ast_handler+0xe0
>>         pc = 0xc035444c  lr = 0xc035435c (ast+0x20)
>>         sp = 0xde2c5e30  fp = 0xde2c5e38
>>         r4 = 0xde2c5e40  r5 = 0xdd43fba0
>>         r6 = 0x00000000  r7 = 0x000001b1
>>         r8 = 0x22c4b500  r9 = 0x00000000
>> ast() at ast+0x20
>>         pc = 0xc035435c  lr = 0xc05eaa88 (swi_exit+0x3c)
>>         sp = 0xde2c5e40  fp = 0xbb9fbe38
>>         r4 = 0x60000013  r5 = 0xdd43fba0
>> swi_exit() at swi_exit+0x3c
>>         pc = 0xc05eaa88  lr = 0xc05eaa88 (swi_exit+0x3c)
>>         sp = 0xde2c5e40  fp = 0xbb9fbe38
>> KDB: enter: panic
>> [ thread pid 81621 tid 101111 ]
>> Stopped at      kdb_enter+0x54: ldrb    r15, [r15, r15, ror r15]!
>> db> bt
>> Tracing pid 81621 tid 101111 td 0xdd43fba0
>> db_trace_self() at db_trace_self
>>         pc = 0xc05e8160  lr = 0xc00774a0 (db_stack_trace+0x140)
>>         sp = 0xde2c55d8  fp = 0xde2c55f0
>> db_stack_trace() at db_stack_trace+0x140
>>         pc = 0xc00774a0  lr = 0xc00770f0 (db_command+0x310)
>>         sp = 0xde2c55f8  fp = 0xde2c56a0
>>         r4 = 0xc0745722  r5 = 0x00000062
>>         r6 = 0x00000000 r10 = 0x00000000
>> db_command() at db_command+0x310
>>         pc = 0xc00770f0  lr = 0xc0076db8 (db_command_loop+0x64)
>>         sp = 0xde2c56a8  fp = 0xde2c56b8
>>         r4 = 0xc07ac186  r5 = 0xc07ab7fe
>>         r6 = 0xc0986f5c  r7 = 0xc0b13968
>>         r8 = 0xc0b23738  r9 = 0x00000000
>>        r10 = 0x00000001
>> db_command_loop() at db_command_loop+0x64
>>         pc = 0xc0076db8  lr = 0xc007ab88 (db_trap+0x128)
>>         sp = 0xde2c56c0  fp = 0xde2c57d8
>>         r4 = 0x00000000  r5 = 0xc0986f50
>>         r6 = 0xc0b23758 r10 = 0x00000001
>> db_trap() at db_trap+0x128
>>         pc = 0xc007ab88  lr = 0xc033bb84 (kdb_trap+0x258)
>>         sp = 0xde2c57e0  fp = 0xde2c5808
>>         r4 = 0xc078390c  r5 = 0xc08d5270
>>         r6 = 0xc0b23758  r7 = 0xc0b13968
>> kdb_trap() at kdb_trap+0x258
>>         pc = 0xc033bb84  lr = 0xc05eaab8 (exception_exit)
>>         sp = 0xde2c5810  fp = 0xde2c58a8
>>         r4 = 0x200000d3  r5 = 0x00000000
>>         r6 = 0xc07372ef  r7 = 0xc0b13968
>>         r8 = 0xc093fa0c  r9 = 0xde2c58e4
>>        r10 = 0xc0b13a68
>> exception_exit() at exception_exit
>>         pc = 0xc05eaab8  lr = 0xc033b044 (kdb_enter+0x50)
>>         sp = 0xde2c58a0  fp = 0xde2c58a8
>>         r0 = 0x00000000  r1 = 0x00000001
>>         r2 = 0x00000012  r3 = 0x00000000
>>         r4 = 0xc0b23748  r5 = 0x00000000
>>         r6 = 0xc07372ef  r7 = 0xc0b13968
>>         r8 = 0xc093fa0c  r9 = 0xde2c58e4
>>        r10 = 0xc0b13a68 r12 = 0x00000000
>> kdb_enter() at kdb_enter+0x58
>>         pc = 0xc033b04c  lr = 0xc02e9ca0 (vpanic+0x18c)
>>         sp = 0xde2c58b0  fp = 0xde2c58d0
>>         r4 = 0x00000100 r10 = 0xc0b13a68
>> vpanic() at vpanic+0x18c
>>         pc = 0xc02e9ca0  lr = 0xc02e9a34 (dump_savectx)
>>         sp = 0xde2c58d8  fp = 0xde2c58dc
>>         r4 = 0xd70c8600  r5 = 0xde2c5e90
>>         r6 = 0xc3398090  r7 = 0xe0cfc440
>>         r8 = 0xc3398080  r9 = 0xd70c8600
>>        r10 = 0xde2c5960
>> dump_savectx() at dump_savectx
>>         pc = 0xc02e9a34  lr = 0xc05f51dc (set_regs)
>>         sp = 0xde2c58e4  fp = 0xde2c58f8
>> set_regs() at set_regs
>>         pc = 0xc05f51dc  lr = 0xc026f8f0 (elf32_get_fpregset+0x2c)
>>         sp = 0xde2c5900  fp = 0xde2c5908
>>         r4 = 0xc3398090  r5 = 0xc026f8c4
>> elf32_get_fpregset() at elf32_get_fpregset+0x2c
>>         pc = 0xc026f8f0  lr = 0xc026d848 (elf32_coredump+0x308)
>>         sp = 0xde2c5910  fp = 0xde2c5988
>>         r4 = 0xc0902a7c r10 = 0xde2c5960
>> elf32_coredump() at elf32_coredump+0x308
>>         pc = 0xc026d848  lr = 0xc02eea74 (sigexit+0xce0)
>>         sp = 0xde2c5990  fp = 0xde2c5cf8
>>         r4 = 0x0000004e  r5 = 0xdf580b60
>>         r6 = 0xdf580a78  r7 = 0xc026d540
>>         r8 = 0xdddcb2bc  r9 = 0xdf580ad4
>>        r10 = 0x00000000
>> sigexit() at sigexit+0xce0
>>         pc = 0xc02eea74  lr = 0xc02ef36c (postsig+0x128)
>>         sp = 0xde2c5d00  fp = 0xde2c5d88
>>         r4 = 0x00000006  r5 = 0xdd43fba0
>>         r6 = 0xde2c5d20  r7 = 0xde2c5d18
>>         r8 = 0xdddcb1f8  r9 = 0xdf3d9ab8
>>        r10 = 0x00000005
>> postsig() at postsig+0x128
>>         pc = 0xc02ef36c  lr = 0xc02f316c (ast_sig+0x11c)
>>         sp = 0xde2c5d90  fp = 0xde2c5e08
>>         r4 = 0xdd43fba0  r5 = 0xdddcb2bc
>>         r6 = 0xc0734d22  r7 = 0x00000000
>>         r8 = 0xdddcb1f8  r9 = 0x00000ab8
>>        r10 = 0x22530384
>> ast_sig() at ast_sig+0x11c
>>         pc = 0xc02f316c  lr = 0xc035444c (ast_handler+0xe0)
>>         sp = 0xde2c5e10  fp = 0xde2c5e28
>>         r4 = 0xde2c5e40  r5 = 0x0000000e
>>         r6 = 0x00004000  r7 = 0xc096b59c
>>         r8 = 0xdd43fba0  r9 = 0x00000001
>> ast_handler() at ast_handler+0xe0
>>         pc = 0xc035444c  lr = 0xc035435c (ast+0x20)
>>         sp = 0xde2c5e30  fp = 0xde2c5e38
>>         r4 = 0xde2c5e40  r5 = 0xdd43fba0
>>         r6 = 0x00000000  r7 = 0x000001b1
>>         r8 = 0x22c4b500  r9 = 0x00000000
>> ast() at ast+0x20
>>         pc = 0xc035435c  lr = 0xc05eaa88 (swi_exit+0x3c)
>>         sp = 0xde2c5e40  fp = 0xbb9fbe38
>>         r4 = 0x60000013  r5 = 0xdd43fba0
>> swi_exit() at swi_exit+0x3c
>>         pc = 0xc05eaa88  lr = 0xc05eaa88 (swi_exit+0x3c)
>>         sp = 0xde2c5e40  fp = 0xbb9fbe38
>> db> 
>> 
>> The machine was last updated about a week ago, the
>> sources were updated earlier today. This panic is
>> new to me.
> 

I now have a small C++ program that, when killed on
armv7 by a signal that dumps core (say SIGQUIT via
control-\, or SIGABRT), gets the above type of FreeBSD
crash while trying to produce the *.core file (debug
style armv7 kernel in use).

I've sent the details to the authors of the recent
VFP-use-in-armv7-kernel changes, and also to Warner L.

I previously sent them a small C program that gets a
KASSERT-based panic on a debug armv7 kernel when
run under gdb or lldb with a breakpoint at a
specific routine.

In general, it looks like floating point use is now
problematic on main's [so: 14's] armv7 kernel until
more work is done.
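
For readers who want to try something similar, here is a
minimal sketch of the kind of serial-versus-threaded floating
point cross-check described in the quoted text. It is not one
of the actual programs sent to the authors. The names
partial_sum, chunk_sums and cross_check, and the choice of
summing 1/(i+1), are hypothetical and picked only for
illustration. The idea is that each chunk is computed by the
same code on the same inputs, so the threaded results should
match the serial ones bit for bit unless floating point state
is disturbed along the way.

```cpp
#include <cstddef>
#include <thread>
#include <vector>

// Sum 1/(i+1) for i in [begin, end), in double precision.
double partial_sum(std::size_t begin, std::size_t end) {
    double s = 0.0;
    for (std::size_t i = begin; i < end; ++i)
        s += 1.0 / static_cast<double>(i + 1);
    return s;
}

// Compute one sum per chunk of [0, n). When threaded is true,
// each chunk runs in its own thread; otherwise all run serially.
std::vector<double> chunk_sums(std::size_t n, std::size_t chunks,
                               bool threaded) {
    std::vector<double> out(chunks, 0.0);
    std::vector<std::thread> workers;
    for (std::size_t k = 0; k < chunks; ++k) {
        const std::size_t begin = n * k / chunks;
        const std::size_t end = n * (k + 1) / chunks;
        if (threaded)
            workers.emplace_back([&out, k, begin, end] {
                out[k] = partial_sum(begin, end);  // distinct slot per thread
            });
        else
            out[k] = partial_sum(begin, end);
    }
    for (auto& w : workers)
        w.join();
    return out;
}

// Cross-check: the threaded per-chunk sums must equal the serial ones exactly.
bool cross_check(std::size_t n, std::size_t chunks) {
    return chunk_sums(n, chunks, false) == chunk_sums(n, chunks, true);
}
```

A driver could call cross_check() in a loop from main() and
call std::abort() on the first mismatch, which would also
exercise the core dump path. On FreeBSD the build would
likely need -pthread.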

===
Mark Millard
marklmi at yahoo.com