Re: https://github.com/pftf/RPi3 UEFI/ACPI based booting: gic_acpi_identify crashes dereferencing a NULL pointer value

From: Mark Millard <marklmi_at_yahoo.com>
Date: Wed, 06 Apr 2022 23:37:45 UTC

On 2022-Apr-6, at 01:01, Mark Millard <marklmi@yahoo.com> wrote:

> https://github.com/pftf/RPi3 UEFI/ACPI use gets a boot crash in
> FreeBSD's gic_acpi_identify:
> 
> . . .
> MAP 1d0000 mode 2 pages 32
> MAP 339d0000 mode 2 pages 80
> MAP 33a20000 mode 2 pages 256
> MAP 37000000 mode 2 pages 400
> MAP 37190000 mode 2 pages 592
> kbd0 at kbdmux0
> acpi0: <BC2836 RPI3>
> acpi0: Power Button (fixed)
> acpi0: Could not update all GPEs: AE_NOT_CONFIGURED
> psci0: <ARM Power State Co-ordination Interface Driver> on acpi0
> Fatal data abort:
>  x0: ffff000086ffe6b4 (crypto_dev + 858f044c)
>  x1: ffff00000103d0d0 (initstack + 30d0)
>  x2: ffff00000080ed2c (madt_handler + 0)
>  x3: ffff00000103d0d0 (initstack + 30d0)
>  x4:         d2d9fffc
>  x5:                0
>  x6: ffffffffffffffff
>  x7:             2001
>  x8:                0
>  x9:              400
> x10:              800
> x11:                0
> x12: ffff00000103d8dc (initstack + 38dc)
> x13:               b6
> x14:              551
> x15:              16c
> x16:                0
> x17:                1
> x18: ffff00000103d0d0 (initstack + 30d0)
> x19: ffff000086ffe598 (crypto_dev + 858f0330)
> x20: ffffa00000dba200
> x21: ffff00000103d0e0 (initstack + 30e0)
> x22: ffffa00000c37a40
> x23: ffff000000ec8000 (devsoftc + 88)
> x24: ffff00000097fe1a (digits + 102f6)
> x25:          3800000
> x26: ffff000000e74000 (gdb_tx_u + a98)
> x27: ffff000000e74000 (gdb_tx_u + a98)
> x28: ffff00004042bd28 (crypto_dev + 3ed1dac0)
> x29: ffff00000103d8e0 (initstack + 38e0)
>  sp: ffff00000103d0d0
>  lr: ffff00000080e908 (gic_acpi_identify + 7c)
> elr: ffff00000080e90c (gic_acpi_identify + 80)
> spsr:         600000c5
> far:               14
> esr:         96000004
> panic: vm_fault failed: ffff00000080e90c error 1
> cpuid = 0
> time = 1
> KDB: stack backtrace:
> db_trace_self() at db_trace_self
> db_trace_self_wrapper() at db_trace_self_wrapper+0x30
> vpanic() at vpanic+0x178
> panic() at panic+0x44
> data_abort() at data_abort+0x2bc
> handle_el1h_sync() at handle_el1h_sync+0x10
> --- exception, esr 0x96000004
> gic_acpi_identify() at gic_acpi_identify+0x80
> bus_generic_new_pass() at bus_generic_new_pass+0x44
> bus_generic_new_pass() at bus_generic_new_pass+0xb0
> bus_generic_new_pass() at bus_generic_new_pass+0xb0
> root_bus_configure() at root_bus_configure+0x40
> mi_startup() at mi_startup+0x224
> virtdone() at virtdone+0x7c
> KDB: enter: panic
> [ thread pid 0 tid 100000 ]
> Stopped at      kdb_enter+0x48: undefined       f901c11f
> 
> This turns out to have gic_acpi_identify+0x80
> with the code shown below:
> 
> ffff00000080e904 <gic_acpi_identify+0x78> bl    ffff00000011d640 <acpi_walk_subtables>
> ffff00000080e908 <gic_acpi_identify+0x7c> ldr   x8, [sp, #8]
> ffff00000080e90c <gic_acpi_identify+0x80> ldrb  w8, [x8, #20]
> 
> and the register dump above shows:
> 
>  x8:                0
> 
> Looking up the source ( sys/arm/arm/gic_acpi.c ) there is
> the likes of:
> 
> struct madt_table_data {
>        device_t parent;
>        ACPI_MADT_GENERIC_DISTRIBUTOR *dist;
>        ACPI_MADT_GENERIC_INTERRUPT *intr[MAXCPU];
> };
> . . .
>        bzero(&madt_data, sizeof(madt_data));
>        madt_data.parent = parent;
>        madt_data.dist = NULL;
> 
>        acpi_walk_subtables(madt + 1, (char *)madt + madt->Header.Length,
>            madt_handler, &madt_data);
> 
>        /* Check the version of the GIC we have */
>        switch (madt_data.dist->Version) {
> 
> So it appears that madt_data.dist held a NULL pointer value
> that was not checked for. (I've no clue if such a NULL is
> supposed to be possible --but I do know it occured.)
> 
> The following lines are:
> 
>        case ACPI_MADT_GIC_VERSION_NONE:
>        case ACPI_MADT_GIC_VERSION_V1:
>        case ACPI_MADT_GIC_VERSION_V2:
>                break;
>        default:
>                goto out;
>        }
> . . .
> out:
>        acpi_unmap_table(madt);
> }
> 
> That might suggest that madt_data.dist==NULL should lead to
> a "goto out".
> 
> 

QUOTE ( https://en.wikipedia.org/wiki/Raspberry_Pi )
The Raspberry Pi 4 uses a Broadcom BCM2711 SoC . . . Unlike previous models, which all used a custom interrupt controller poorly suited for virtualisation, the interrupt controller on this SoC is compatible with the ARM Generic Interrupt Controller (GIC) architecture 2.0, providing hardware support for interrupt distribution when using ARM virtualisation capabilities.
END QUOTE

So it looks like the madt_data.dist==NULL corresponds to
there not being a GIC of any version present to find in
MADT for an RPi3.

I'll note that the value ACPI_MADT_GIC_VERSION_NONE is listed in
ACPI_Spec_6_4_Jan22.pdf as:

QUOTE
0x00: No GIC version is specified, fall back to hardware discovery for GIC version
END QUOTE

I'd take that wording as still presuming the presence of a GIC
to do version discovery with. If true, the madt_data.dist==NULL
for the RPi3B may well be the way to indicate that no GIC is
present to even do hardware version discovery with.


FreeBSD may well not want to support https://github.com/pftf/RPi3
use in ACPI mode. But it still may want to avoid a non-obvious
crash as the means of rejecting the ACPI it is provided.

===
Mark Millard
marklmi at yahoo.com