Re: The import of openzfs vs. armv7: boot crashs

From: Mark Millard <marklmi_at_yahoo.com>
Date: Tue, 18 Apr 2023 23:04:35 UTC

On Apr 18, 2023, at 15:46, Warner Losh <imp@bsdimp.com> wrote:

> Fun...
> 
> I'm also fighting aarch64 issues...

Of what kind? I've been able to use things as committed
in FreeBSD (block_cloning never having been enabled, and
jumping from before the import to, effectively, after
the FreeBSD adjustments). But I have not tried any of the
changes that were committed differently in openzfs.

(I'm one of those who tested poudriere bulk activity
via separate media from my normal aarch64 context. Those
tests had no problems once the full set of adjustments
was present in my context.)

> Warner
> 
> On Tue, Apr 18, 2023, 4:45 PM Mark Millard <marklmi@yahoo.com> wrote:
> https://github.com/openzfs/zfs/commit/d0cbd9feaf5b82130f2e679256c71e0c7413aae9
> 
> does not seem to cover armv7, just aarch64. (FreeBSD disabled
> floating point for both armv7 and aarch64 but that is a
> different change than above.)

I probably should have explicitly noted that the fpu disabling
was from after the snapshot being tested here.

The point of the snapshot test (the most recent available) was
to find out if armv7 crashed before the fpu-use disabling commit.

> I used:
> 
> FreeBSD-14.0-CURRENT-arm-armv7-GENERICSD-20230406-f21faa67ab6b-262010.img.xz

That is from after the import and after:

    • git: eb1feadc201a - main - zfs: fix null ap->a_fsizetd NULL pointer derefernce Martin Matuska

but with no other zfs changes. It does not contain:

    • git: d6e24901349d - main - zfs: disable kernel fpu usage on arm and aarc64 Mateusz Guzik

(But the openzfs changes are different from the FreeBSD ones.)
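
For anyone wanting to identify what a snapshot image was built
from, here is a minimal sketch that splits the file name quoted
above into its parts. The layout (date, 12-hex-digit git hash,
then what I take to be the commit count) is an assumption
inferred from this one file name, and parse_snapshot_name is a
made-up helper, not part of any FreeBSD tooling.

```python
import re
from datetime import datetime

# Assumed layout, inferred from the single file name in this message:
#   ...-<YYYYMMDD>-<12 hex digit git hash>-<commit count>.img.xz
SNAPSHOT_RE = re.compile(r"-(\d{8})-([0-9a-f]{12})-(\d+)\.img\.xz$")

def parse_snapshot_name(name):
    """Return the date, git hash, and count fields of a snapshot name."""
    m = SNAPSHOT_RE.search(name)
    if m is None:
        raise ValueError("unrecognized snapshot name: " + name)
    date_s, git_hash, count = m.groups()
    return {
        "date": datetime.strptime(date_s, "%Y%m%d").date(),
        "hash": git_hash,
        "count": int(count),
    }

info = parse_snapshot_name(
    "FreeBSD-14.0-CURRENT-arm-armv7-GENERICSD-"
    "20230406-f21faa67ab6b-262010.img.xz"
)
print(info["date"], info["hash"], info["count"])
```

With a clone of the FreeBSD src repository, the extracted hash
could then be handed to something like
"git merge-base --is-ancestor d6e24901349d f21faa67ab6b" to
check whether a given commit is contained in the snapshot.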

> booted an RPi2B v1.1 and tried (note the KSTACK_PAGES notice and the
> "undefined floating point instruction" notice):
> 
> # zpool import
> ZFS NOTICE: KSTACK_PAGES is 2 which could result in stack overflow panic!
> Please consider adding 'options KSTACK_PAGES=4' to your kernel config
> panic: undefined floating point instruction in supervisor mode
> cpuid = 2
> time = 1680784610
> KDB: stack backtrace:
> db_trace_self() at db_trace_self
>          pc = 0xc05eb154  lr = 0xc007a688 (db_trace_self_wrapper+0x30)
>          sp = 0xdd25c480  fp = 0xdd25c598
> db_trace_self_wrapper() at db_trace_self_wrapper+0x30
>          pc = 0xc007a688  lr = 0xc02eb1b4 (vpanic+0x140)
>          sp = 0xdd25c5a0  fp = 0xdd25c5c0
>          r4 = 0x00000100  r5 = 0x00000000
>          r6 = 0xc0736bfc  r7 = 0xc0b1aea8
> vpanic() at vpanic+0x140
>          pc = 0xc02eb1b4  lr = 0xc02eaf94 (doadump)
>          sp = 0xdd25c5c8  fp = 0xdd25c5cc
>          r4 = 0xc0b92210  r5 = 0x00000000
>          r6 = 0xc0610ca0  r7 = 0xf4210a0d
>          r8 = 0xddf32e4c  r9 = 0x00000013
>         r10 = 0xdd25c6c0
> doadump() at doadump
>          pc = 0xc02eaf94  lr = 0xc0610eb0 (vfp_new_thread)
>          sp = 0xdd25c5d4  fp = 0xdd25c638
>          r4 = 0xdd25c6c0  r5 = 0xdd25c5cc
>          r6 = 0xc02eaf94 r10 = 0xdd25c5d4
> vfp_new_thread() at vfp_new_thread
>          pc = 0xc0610eb0  lr = 0xc060ff84 (undefinedinstruction+0x178)
>          sp = 0xdd25c640  fp = 0xdd25c6b8
> undefinedinstruction() at undefinedinstruction+0x178
>          pc = 0xc060ff84  lr = 0xc05edaa8 (exception_exit)
>          sp = 0xdd25c6c0  fp = 0xdd25c750
>          r4 = 0x20000013  r5 = 0xde45e000
>          r6 = 0xdd25c890  r7 = 0xdd25c8b0
>          r8 = 0x00000000  r9 = 0x00000000
>         r10 = 0xdd25c8c0
> exception_exit() at exception_exit
>          pc = 0xc05edaa8  lr = 0xddf31f20 (K256)
>          sp = 0xdd25c750  fp = 0xdd25c750
>          r0 = 0xdd25c890  r1 = 0xde45e000
>          r2 = 0xde45e400  r3 = 0xddf309fc
>          r4 = 0x00000400  r5 = 0xde45e000
>          r6 = 0xdd25c890  r7 = 0xdd25c8b0
>          r8 = 0x00000000  r9 = 0x00000000
>         r10 = 0xdd25c8c0 r12 = 0xdd25c7a0
> zfs_sha256_block_neon() at zfs_sha256_block_neon+0x1c
>          pc = 0xddf32e4c  lr = 0xc0946e8c (pcpup)
>          sp = 0xdd25c758  fp = 0xc0b0aeec
>          r4 = 0xc0919610  r5 = 0xc0919630
>          r6 = 0xc0919618  r7 = 0x642ebce2
>          r8 = 0xc0b1b0ec  r9 = 0xc0915e88
>         r10 = 0xc0b1b0dc
> Fatal kernel mode data abort: 'Translation Fault (L1)' on read
> trapframe: 0xdd25c330
> FSR=00000005, FAR=95e29398, spsr=200000d3
> r0 =dd25c424, r1 =81000000, r2 =95e29395, r3 =55555555
> r4 =c08ae93c, r5 =00004aa0, r6 =00004aa0, r7 =c08d3e3c
> r8 =00000001, r9 =c079567a, r10=0000000b, r11=dd25c3e0
> r12=00000000, ssp=dd25c3c4, slr=00000001, pc =c0610308
> 
> panic: Fatal abort
> . . . (repeats over and over) . . .
> 
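
As an aid to reading the backtrace quoted above, here is a small
sketch that pulls just the frame names out of KDB output lines of
the form "name() at name+0xNN". The regular expression and the
kdb_frames helper are my own assumptions based only on the lines
shown in this message; they are not part of any FreeBSD tool.

```python
import re

# Matches frame lines such as
#   "> vpanic() at vpanic+0x140"   or   "doadump() at doadump"
# The leading "> " (mail quoting) and the "+0x..." offset are optional.
FRAME_RE = re.compile(r"^(?:>\s*)?(\w+)\(\) at \w+(?:\+(0x[0-9a-f]+))?$")

def kdb_frames(text):
    """Return (function, offset-or-None) pairs in the order printed."""
    frames = []
    for line in text.splitlines():
        m = FRAME_RE.match(line.strip())
        if m:
            frames.append((m.group(1), m.group(2)))
    return frames

# A few lines copied from the quoted panic output.
sample = """\
> vpanic() at vpanic+0x140
>          pc = 0xc02eb1b4  lr = 0xc02eaf94 (doadump)
> undefinedinstruction() at undefinedinstruction+0x178
> exception_exit() at exception_exit
> zfs_sha256_block_neon() at zfs_sha256_block_neon+0x1c
"""

for func, offset in kdb_frames(sample):
    print(func, offset or "")
```

Applied to the full quoted log, the last frame printed is
zfs_sha256_block_neon+0x1c, directly after exception_exit, which
is consistent with the "undefined floating point instruction"
panic message.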




===
Mark Millard
marklmi at yahoo.com