Re: Armv7 panic on -current, rpi2 buildworld

From: Kornel_Dulęba <kd_at_FreeBSD.org>
Date: Mon, 20 Feb 2023 13:15:41 UTC
> Can you try with 24abb6b82102eec577eff9bd8dd7726e8cab89f4? There were 
> conditional branch instructions that may mean the function to save the 
> VFP state was not being run.

I'm currently debugging this. Applying 
24abb6b82102eec577eff9bd8dd7726e8cab89f4 did not resolve it.
(I've tested it with dbl_and_ull_via_async that Mark shared in another 
thread.)
The root cause is located in vfp_save_state. It's called during the 
dump, right before the assert is triggered:

        /*
         * savectx() will be called on panic with dumppcb as an argument,
         * dumppcb doesn't have pcb_vfpsaved set, so set it to save
         * the VFP registers.
         */
        if (pcb->pcb_vfpsaved == NULL)
                pcb->pcb_vfpsaved = &pcb->pcb_vfpstate;

Here pcb_vfpsaved == NULL, since the VFP has never been used in the kernel.
This triggers the KASSERT in get_vfpcontext, causing the panic.
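
To make the pointer state concrete, here is a minimal user-space sketch, 
not kernel code. The names pcb_model, vfp_state_model, model_save_state 
and model_check_passes are made up for illustration. The condition that 
the KASSERT checks is not quoted in this mail, so the comparison below 
is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the saved VFP register file; the layout is illustrative only. */
struct vfp_state_model {
	unsigned long long reg[32];
};

/* Simplified model of the two pcb fields discussed above. */
struct pcb_model {
	struct vfp_state_model pcb_vfpstate;	/* per-thread save area */
	struct vfp_state_model *pcb_vfpsaved;	/* where the state gets saved */
};

/* Mirrors the quoted fallback: default the save pointer if it is unset. */
void
model_save_state(struct pcb_model *pcb)
{
	assert(pcb != NULL);
	if (pcb->pcb_vfpsaved == NULL)
		pcb->pcb_vfpsaved = &pcb->pcb_vfpstate;
}

/*
 * ASSUMPTION: the real KASSERT requires the save pointer to reference the
 * thread's own save area. Returns true when this modelled check would pass.
 */
bool
model_check_passes(const struct pcb_model *pcb)
{
	assert(pcb != NULL);
	return (pcb->pcb_vfpsaved == &pcb->pcb_vfpstate);
}
```

In this model a zero-initialised pcb_model fails model_check_passes() 
until model_save_state() has run on it. That is the ordering dependency 
that seems to matter here.
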
Note that arm64 has very similar logic, so I wonder if a similar panic 
could be observed there.
Any thoughts?

>
> Andrew
>
>> On 16 Feb 2023, at 19:35, Mark Millard <marklmi@yahoo.com> wrote:
>>
>> On Feb 14, 2023, at 23:16, Mark Millard <marklmi@yahoo.com> wrote:
>>
>>> On Feb 14, 2023, at 20:16, Warner Losh <imp@bsdimp.com> wrote:
>>>
>>>> Sorry to top post... what program was dumping core? Looks like a 
>>>> too strict assert
>>>
>>> Just a possible point, given recent kernel floating
>>> point work:
>>>
>>> Because of Bob's note, I tried to do a typical build
>>> and test of some benchmark programs that I sometimes
>>> use that involve floating point in some of the
>>> programs, some use with multithreading involved. (As
>>> FreeBSD and g++ progress I tend to do this once in
>>> a while, not as often on armv7 as on aarch64.)
>>>
>>> On armv7, I now get a message about a failure of an
>>> internal cross-check, which also leads to the program
>>> being stopped early. The messaging from run to run
>>> varies what the failure is, but the runs should not
>>> vary and should not fail the cross-checks --and
>>> previously did not, including when I last tried armv7.
>>> (Not recently.)
>>>
>>> For the specific example failure, the initial serial
>>> (single thread) test with float involved works but the
>>> following multi-thread test in the same program fails
>>> and causes the program to stop when it notices there
>>> is a problem.
>>>
>>> The programs that do not test floating point do not
>>> fail. These can involve floating point outside the
>>> algorithm benchmarked, but with no multi-threading
>>> involved for such and no floating point based cross-
>>> checks involved.
>>>
>>> At this point it is far from obvious to me how I
>>> would track down the specifics of what leads to the
>>> failed cross-checks. But the above is suggestive of
>>> there being problems for armv7 handling of saving
>>> and restoring floating point context for
>>> multi-threading. I've no clue if such are limited
>>> to the floating point values or not.
>>>
>>>> Warner
>>>>
>>>> On Tue, Feb 14, 2023, 7:57 PM bob prohaska <fbsd@www.zefox.net> wrote:
>>>> Building world on an RPi2 armv7, buildworld stopped with
>>>> bob@www:/usr/src % panic: Called fill_fpregs while the kernel is using the VFP
>>>> cpuid = 0
>>>> time = 1676427410
>>>> KDB: stack backtrace:
>>>> db_trace_self() at db_trace_self
>>>>        pc = 0xc05e8160  lr = 0xc007aa04 (db_trace_self_wrapper+0x30)
>>>>        sp = 0xde2c5790  fp = 0xde2c58a8
>>>> db_trace_self_wrapper() at db_trace_self_wrapper+0x30
>>>>        pc = 0xc007aa04  lr = 0xc02e9c54 (vpanic+0x140)
>>>>        sp = 0xde2c58b0  fp = 0xde2c58d0
>>>>        r4 = 0x00000100  r5 = 0x00000000
>>>>        r6 = 0xc07372ef  r7 = 0xc0b13968
>>>> vpanic() at vpanic+0x140
>>>>        pc = 0xc02e9c54  lr = 0xc02e9a34 (dump_savectx)
>>>>        sp = 0xde2c58d8  fp = 0xde2c58dc
>>>>        r4 = 0xd70c8600  r5 = 0xde2c5e90
>>>>        r6 = 0xc3398090  r7 = 0xe0cfc440
>>>>        r8 = 0xc3398080  r9 = 0xd70c8600
>>>>       r10 = 0xde2c5960
>>>> dump_savectx() at dump_savectx
>>>>        pc = 0xc02e9a34  lr = 0xc05f51dc (set_regs)
>>>>        sp = 0xde2c58e4  fp = 0xde2c58f8
>>>> set_regs() at set_regs
>>>>        pc = 0xc05f51dc  lr = 0xc026f8f0 (elf32_get_fpregset+0x2c)
>>>>        sp = 0xde2c5900  fp = 0xde2c5908
>>>>        r4 = 0xc3398090  r5 = 0xc026f8c4
>>>> elf32_get_fpregset() at elf32_get_fpregset+0x2c
>>>>        pc = 0xc026f8f0  lr = 0xc026d848 (elf32_coredump+0x308)
>>>>        sp = 0xde2c5910  fp = 0xde2c5988
>>>>        r4 = 0xc0902a7c r10 = 0xde2c5960
>>>> elf32_coredump() at elf32_coredump+0x308
>>>>        pc = 0xc026d848  lr = 0xc02eea74 (sigexit+0xce0)
>>>>        sp = 0xde2c5990  fp = 0xde2c5cf8
>>>>        r4 = 0x0000004e  r5 = 0xdf580b60
>>>>        r6 = 0xdf580a78  r7 = 0xc026d540
>>>>        r8 = 0xdddcb2bc  r9 = 0xdf580ad4
>>>>       r10 = 0x00000000
>>>> sigexit() at sigexit+0xce0
>>>>        pc = 0xc02eea74  lr = 0xc02ef36c (postsig+0x128)
>>>>        sp = 0xde2c5d00  fp = 0xde2c5d88
>>>>        r4 = 0x00000006  r5 = 0xdd43fba0
>>>>        r6 = 0xde2c5d20  r7 = 0xde2c5d18
>>>>        r8 = 0xdddcb1f8  r9 = 0xdf3d9ab8
>>>>       r10 = 0x00000005
>>>> postsig() at postsig+0x128
>>>>        pc = 0xc02ef36c  lr = 0xc02f316c (ast_sig+0x11c)
>>>>        sp = 0xde2c5d90  fp = 0xde2c5e08
>>>>        r4 = 0xdd43fba0  r5 = 0xdddcb2bc
>>>>        r6 = 0xc0734d22  r7 = 0x00000000
>>>>        r8 = 0xdddcb1f8  r9 = 0x00000ab8
>>>>       r10 = 0x22530384
>>>> ast_sig() at ast_sig+0x11c
>>>>        pc = 0xc02f316c  lr = 0xc035444c (ast_handler+0xe0)
>>>>        sp = 0xde2c5e10  fp = 0xde2c5e28
>>>>        r4 = 0xde2c5e40  r5 = 0x0000000e
>>>>        r6 = 0x00004000  r7 = 0xc096b59c
>>>>        r8 = 0xdd43fba0  r9 = 0x00000001
>>>> ast_handler() at ast_handler+0xe0
>>>>        pc = 0xc035444c  lr = 0xc035435c (ast+0x20)
>>>>        sp = 0xde2c5e30  fp = 0xde2c5e38
>>>>        r4 = 0xde2c5e40  r5 = 0xdd43fba0
>>>>        r6 = 0x00000000  r7 = 0x000001b1
>>>>        r8 = 0x22c4b500  r9 = 0x00000000
>>>> ast() at ast+0x20
>>>>        pc = 0xc035435c  lr = 0xc05eaa88 (swi_exit+0x3c)
>>>>        sp = 0xde2c5e40  fp = 0xbb9fbe38
>>>>        r4 = 0x60000013  r5 = 0xdd43fba0
>>>> swi_exit() at swi_exit+0x3c
>>>>        pc = 0xc05eaa88  lr = 0xc05eaa88 (swi_exit+0x3c)
>>>>        sp = 0xde2c5e40  fp = 0xbb9fbe38
>>>> KDB: enter: panic
>>>> [ thread pid 81621 tid 101111 ]
>>>> Stopped at      kdb_enter+0x54: ldrb    r15, [r15, r15, ror r15]!
>>>> db> bt
>>>> Tracing pid 81621 tid 101111 td 0xdd43fba0
>>>> db_trace_self() at db_trace_self
>>>>        pc = 0xc05e8160  lr = 0xc00774a0 (db_stack_trace+0x140)
>>>>        sp = 0xde2c55d8  fp = 0xde2c55f0
>>>> db_stack_trace() at db_stack_trace+0x140
>>>>        pc = 0xc00774a0  lr = 0xc00770f0 (db_command+0x310)
>>>>        sp = 0xde2c55f8  fp = 0xde2c56a0
>>>>        r4 = 0xc0745722  r5 = 0x00000062
>>>>        r6 = 0x00000000 r10 = 0x00000000
>>>> db_command() at db_command+0x310
>>>>        pc = 0xc00770f0  lr = 0xc0076db8 (db_command_loop+0x64)
>>>>        sp = 0xde2c56a8  fp = 0xde2c56b8
>>>>        r4 = 0xc07ac186  r5 = 0xc07ab7fe
>>>>        r6 = 0xc0986f5c  r7 = 0xc0b13968
>>>>        r8 = 0xc0b23738  r9 = 0x00000000
>>>>       r10 = 0x00000001
>>>> db_command_loop() at db_command_loop+0x64
>>>>        pc = 0xc0076db8  lr = 0xc007ab88 (db_trap+0x128)
>>>>        sp = 0xde2c56c0  fp = 0xde2c57d8
>>>>        r4 = 0x00000000  r5 = 0xc0986f50
>>>>        r6 = 0xc0b23758 r10 = 0x00000001
>>>> db_trap() at db_trap+0x128
>>>>        pc = 0xc007ab88  lr = 0xc033bb84 (kdb_trap+0x258)
>>>>        sp = 0xde2c57e0  fp = 0xde2c5808
>>>>        r4 = 0xc078390c  r5 = 0xc08d5270
>>>>        r6 = 0xc0b23758  r7 = 0xc0b13968
>>>> kdb_trap() at kdb_trap+0x258
>>>>        pc = 0xc033bb84  lr = 0xc05eaab8 (exception_exit)
>>>>        sp = 0xde2c5810  fp = 0xde2c58a8
>>>>        r4 = 0x200000d3  r5 = 0x00000000
>>>>        r6 = 0xc07372ef  r7 = 0xc0b13968
>>>>        r8 = 0xc093fa0c  r9 = 0xde2c58e4
>>>>       r10 = 0xc0b13a68
>>>> exception_exit() at exception_exit
>>>>        pc = 0xc05eaab8  lr = 0xc033b044 (kdb_enter+0x50)
>>>>        sp = 0xde2c58a0  fp = 0xde2c58a8
>>>>        r0 = 0x00000000  r1 = 0x00000001
>>>>        r2 = 0x00000012  r3 = 0x00000000
>>>>        r4 = 0xc0b23748  r5 = 0x00000000
>>>>        r6 = 0xc07372ef  r7 = 0xc0b13968
>>>>        r8 = 0xc093fa0c  r9 = 0xde2c58e4
>>>>       r10 = 0xc0b13a68 r12 = 0x00000000
>>>> kdb_enter() at kdb_enter+0x58
>>>>        pc = 0xc033b04c  lr = 0xc02e9ca0 (vpanic+0x18c)
>>>>        sp = 0xde2c58b0  fp = 0xde2c58d0
>>>>        r4 = 0x00000100 r10 = 0xc0b13a68
>>>> vpanic() at vpanic+0x18c
>>>>        pc = 0xc02e9ca0  lr = 0xc02e9a34 (dump_savectx)
>>>>        sp = 0xde2c58d8  fp = 0xde2c58dc
>>>>        r4 = 0xd70c8600  r5 = 0xde2c5e90
>>>>        r6 = 0xc3398090  r7 = 0xe0cfc440
>>>>        r8 = 0xc3398080  r9 = 0xd70c8600
>>>>       r10 = 0xde2c5960
>>>> dump_savectx() at dump_savectx
>>>>        pc = 0xc02e9a34  lr = 0xc05f51dc (set_regs)
>>>>        sp = 0xde2c58e4  fp = 0xde2c58f8
>>>> set_regs() at set_regs
>>>>        pc = 0xc05f51dc  lr = 0xc026f8f0 (elf32_get_fpregset+0x2c)
>>>>        sp = 0xde2c5900  fp = 0xde2c5908
>>>>        r4 = 0xc3398090  r5 = 0xc026f8c4
>>>> elf32_get_fpregset() at elf32_get_fpregset+0x2c
>>>>        pc = 0xc026f8f0  lr = 0xc026d848 (elf32_coredump+0x308)
>>>>        sp = 0xde2c5910  fp = 0xde2c5988
>>>>        r4 = 0xc0902a7c r10 = 0xde2c5960
>>>> elf32_coredump() at elf32_coredump+0x308
>>>>        pc = 0xc026d848  lr = 0xc02eea74 (sigexit+0xce0)
>>>>        sp = 0xde2c5990  fp = 0xde2c5cf8
>>>>        r4 = 0x0000004e  r5 = 0xdf580b60
>>>>        r6 = 0xdf580a78  r7 = 0xc026d540
>>>>        r8 = 0xdddcb2bc  r9 = 0xdf580ad4
>>>>       r10 = 0x00000000
>>>> sigexit() at sigexit+0xce0
>>>>        pc = 0xc02eea74  lr = 0xc02ef36c (postsig+0x128)
>>>>        sp = 0xde2c5d00  fp = 0xde2c5d88
>>>>        r4 = 0x00000006  r5 = 0xdd43fba0
>>>>        r6 = 0xde2c5d20  r7 = 0xde2c5d18
>>>>        r8 = 0xdddcb1f8  r9 = 0xdf3d9ab8
>>>>       r10 = 0x00000005
>>>> postsig() at postsig+0x128
>>>>        pc = 0xc02ef36c  lr = 0xc02f316c (ast_sig+0x11c)
>>>>        sp = 0xde2c5d90  fp = 0xde2c5e08
>>>>        r4 = 0xdd43fba0  r5 = 0xdddcb2bc
>>>>        r6 = 0xc0734d22  r7 = 0x00000000
>>>>        r8 = 0xdddcb1f8  r9 = 0x00000ab8
>>>>       r10 = 0x22530384
>>>> ast_sig() at ast_sig+0x11c
>>>>        pc = 0xc02f316c  lr = 0xc035444c (ast_handler+0xe0)
>>>>        sp = 0xde2c5e10  fp = 0xde2c5e28
>>>>        r4 = 0xde2c5e40  r5 = 0x0000000e
>>>>        r6 = 0x00004000  r7 = 0xc096b59c
>>>>        r8 = 0xdd43fba0  r9 = 0x00000001
>>>> ast_handler() at ast_handler+0xe0
>>>>        pc = 0xc035444c  lr = 0xc035435c (ast+0x20)
>>>>        sp = 0xde2c5e30  fp = 0xde2c5e38
>>>>        r4 = 0xde2c5e40  r5 = 0xdd43fba0
>>>>        r6 = 0x00000000  r7 = 0x000001b1
>>>>        r8 = 0x22c4b500  r9 = 0x00000000
>>>> ast() at ast+0x20
>>>>        pc = 0xc035435c  lr = 0xc05eaa88 (swi_exit+0x3c)
>>>>        sp = 0xde2c5e40  fp = 0xbb9fbe38
>>>>        r4 = 0x60000013  r5 = 0xdd43fba0
>>>> swi_exit() at swi_exit+0x3c
>>>>        pc = 0xc05eaa88  lr = 0xc05eaa88 (swi_exit+0x3c)
>>>>        sp = 0xde2c5e40  fp = 0xbb9fbe38
>>>> db>
>>>>
>>>> The machine was last updated about a week ago, the
>>>> sources were updated earlier today. This panic is
>>>> new to me.
>>>
>>
>> I now have a small C++ program that, when aborted
>> by SIGABRT on armv7 (say via control-\), gets the
>> above type of FreeBSD crash while trying to produce
>> the *.core file (debug style armv7 kernel in use).
>>
>> I've sent the details to the authors of the recent
>> VFP-use-in-armv7-kernel changes, and also to
>> Warner L.
>>
>> I previously sent them a small C program that gets a
>> KASSERT based panic for a debug armv7 kernel when
>> run under gdb or lldb with a breakpoint at a
>> specific routine.
>>
>> In general, looks like armv7 floating point use is now
>> problematical on main's [so: 14's] armv7 kernel until
>> more work is done.
>>
>> ===
>> Mark Millard
>> marklmi at yahoo.com
>