Re: FYI: RPi* firmware tagged 1.20210805 appears to be the last to be bootable by FreeBSD via fdt use; sequence of 2 failure modes after that
Date: Thu, 28 Apr 2022 06:47:37 UTC
[Just an FYI: I got ahold of the RPi3B and discovered that it
was not bootable via RPi* firmware tagged 1.20210805 . In fact
it barely produced any output on the serial console: very
early failure. Reverting to the prior one, 1.20210727, worked
for the RPi3B and the RPi4B.]

[I've not added to the below and have removed the long text
block of RPi4B boot failure output.]

On 2022-Apr-24, at 05:36, Mark Millard <marklmi@yahoo.com> wrote:

> [I may have also found what leads to the extra messages for
> the 2nd failure mode, an independent issue it turns out.]
>
> On 2022-Apr-24, at 04:37, Mark Millard <marklmi@yahoo.com> wrote:
>
>> [I think I found the reason for the boot crash that is
>> a common failure to both failure modes. The 2nd mode
>> has other issues I've not analyzed.]
>>
>> On 2022-Apr-23, at 23:45, Mark Millard <marklmi@yahoo.com> wrote:
>>
>>> The following is based on a microsd card with 13.1-RC4 on
>>> it where I'd previously substituted my U-Boot 2022.04 build
>>> and tested with the RPi* firmware that is in the 13.1-RC4
>>> image. Here I've tried replacing the RPi* firmware and
>>> holding the rest constant.
>>>
>>> The boot tests are on an 8 GiByte RPi4B Rev 1.14 with the
>>> B0T stepping. I've not been copying over the Linux kernels,
>>> which they also bundle with the firmware.
>>>
>>> [13.1-RC4 is just what I happened to use. I doubt anything
>>> here is special to 13.* or stable/13 or main [so: 14].
>>> (I do not use 12.* or stable/12.)]
>>>
>>> The observed status went like . . .
>>>
>>>
>>> firmware-1.20210805/boot/
>>>
>>> The RPi* release tagged 1.20210805 is the last version that
>>> FreeBSD booted with. (Other than booting, logging in, and
>>> shutting down, I've not been testing other aspects of
>>> operation.)
>>>
>>> From what I've read, firmware-1.20210805/boot/ should be
>>> recent enough to handle the Rev 1.15 related PMIC variation.
>>>
>>> [I'll note that firmware build dates need not be the same day
>>> as the date encoded into the tag --in fact it is usually some
>>> earlier day. On rare occasion it can be a lot earlier, and
>>> there is an example of that below.]
>>>
>>>
>>> After firmware-1.20210805 there are 2 major failure modes.
>>> Both stop at the same sort of point in the messaging --but
>>> there is a huge difference in the count of earlier error
>>> messages. It looks to me like all the issues require
>>> FreeBSD changes if modern RPi* firmware/dtb's are to be
>>> usable via fdt.
>>
>> I've noticed a difference between the working context and
>> the failing ones (both failure modes).
>>
>> Failing:
>>
>> spi0: <BCM2708/2835 SPI controller> mem 0x7e204000-0x7e2041ff irq 18 on simplebus0
>> spibus0: <OFW SPI bus> on spi0
>> spibus0: <unknown card> at cs 0 mode 0
>> spibus0: <unknown card> at cs 1 mode 0
>> NOTE BELOW LINES MISSING HERE.
>> sdhci_bcm0: <Broadcom 2708 SDHCI controller> mem 0x7e300000-0x7e3000ff irq 24 on simplebus0
>>
>> Working:
>>
>> spi0: <BCM2708/2835 SPI controller> mem 0x7e204000-0x7e2041ff irq 18 on simplebus0
>> spibus0: <OFW SPI bus> on spi0
>> spibus0: <unknown card> at cs 0 mode 0
>> spibus0: <unknown card> at cs 1 mode 0
>> START LINES MISSING ABOVE
>> iichb0: <BCM2708/2835 BSC controller> mem 0x7e804000-0x7e804fff irq 26 on simplebus0
>> bcm_dma0: <BCM2835 DMA Controller> mem 0x7e007000-0x7e007aff irq 30,31,32,33,34,35,36,37,38,39,40 on simplebus0
>> bcmwd0: <BCM2708/2835 Watchdog> mem 0x7e100000-0x7e100113,0x7e00a000-0x7e00a023,0x7ec11000-0x7ec1101f on simplebus0
>> bcmrng0: <Broadcom BCM2835/BCM2838 RNG> mem 0x7e104000-0x7e104027 on simplebus0
>> gpioc1: <GPIO controller> on gpio1
>> END LINES MISSING ABOVE
>> sdhci_bcm0: <Broadcom 2708 SDHCI controller> mem 0x7e300000-0x7e3000ff irq 73 on simplebus0
>>
>> In particular:
>>
>> bcm_dma0: <BCM2835 DMA Controller> mem 0x7e007000-0x7e007aff irq 30,31,32,33,34,35,36,37,38,39,40 on simplebus0
>>
>> being missing means no bcm_dma_attach and that in turn means
>> that the static bcm_dma_sc == NULL still.
>>
>> The panic was: panic: vm_fault failed: ffff000000862134
>>
>> where:
>>
>> ffff000000862134 <bcm_dma_allocate+0x88> ldaxr x1, [x9]
>>
>> which is part of:
>>
>> int
>> bcm_dma_allocate(int req_ch)
>> {
>>     struct bcm_dma_softc *sc = bcm_dma_sc;
>>     int ch = BCM_DMA_CH_INVALID;
>>     int i;
>>
>>     if (req_ch >= BCM_DMA_CH_MAX)
>>         return (BCM_DMA_CH_INVALID);
>>
>>     /* Auto(req_ch < 0) or CH specified */
>>     mtx_lock(&sc->sc_mtx);
>> . . .
>>
>> So the likes of &sc->sc_mtx end up being a small offset
>> from address zero:
>>
>> x9: 20
>>
>> Thus the panic.
>>
>> As to how bcm_dma_allocate happened without bcm_dma_attach
>> happening first . . .
>>
>> The working context's dtb has the ordering:
>> (I also show mmcnr@ and the brcm,bcm2711-dma
>> just for reference.)
>>
>> dma@7e007000 {
>>     compatible = "brcm,bcm2835-dma";
>> . . .
>> mmc@7e300000 {
>>     compatible = "brcm,bcm2835-mmc", "brcm,bcm2835-sdhci";
>> . . .
>> mmcnr@7e300000 {
>>     compatible = "brcm,bcm2835-mmc", "brcm,bcm2835-sdhci";
>> . . .
>> dma@7e007b00 {
>>     compatible = "brcm,bcm2711-dma";
>>
>> But the failing context's dtb has the ordering:
>> (I also show mmcnr@ and the brcm,bcm2711-dma
>> just for reference.)
>>
>> mmc@7e300000 {
>>     compatible = "brcm,bcm2835-mmc", "brcm,bcm2835-sdhci";
>> . . .
>> dma@7e007000 {
>>     compatible = "brcm,bcm2835-dma";
>> . . .
>> mmcnr@7e300000 {
>>     compatible = "brcm,bcm2835-mmc", "brcm,bcm2835-sdhci";
>> . . .
>> dma@7e007b00 {
>>     compatible = "brcm,bcm2711-dma";
>>
>> So, for sequential handling in the failing case, the driver for
>> mmc@7e300000 would use bcm_dma_allocate before the
>> bcm_dma_probe/bcm_dma_attach sequence for dma@7e007000 had
>> happened, leading to the crash.
>>
>> Note: I used "fdt print /" from U-Boot to get the dtb and its
>> ordering. This was based on the address that the RPi* firmware
>> reports when debugging output is enabled (0x4000 here).
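
The crash mechanism described in the quoted analysis can be reduced to a
small stand-alone C model. This is only a sketch under stated assumptions:
the struct layout, the constants, and the names model_dma_attach,
model_dma_allocate and model_null_mtx_address are hypothetical stand-ins,
not the FreeBSD driver's code, and locking is left out. It illustrates two
points: the address of a member reached through a NULL softc pointer is just
that member's offset from zero, and a NULL check would let a consumer that
runs before the DMA controller has attached fail cleanly instead of faulting.
Whether a check like this or a change in attach ordering is the right fix
for FreeBSD is not something the model decides.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; the real values live in the FreeBSD driver. */
#define MODEL_DMA_CH_INVALID (-1)
#define MODEL_DMA_CH_MAX     12

struct model_mtx {
    volatile uintptr_t lock_word;
};

/* Hypothetical layout: at least one field sits ahead of the mutex. */
struct model_dma_softc {
    void            *dev;
    struct model_mtx sc_mtx;      /* unused here: locking is omitted */
    uint32_t         ch_in_use;   /* bitmask of allocated channels */
};

/* Stays NULL until the attach step has run, as with the static in the post. */
static struct model_dma_softc *model_dma_sc = NULL;
static struct model_dma_softc model_softc_storage;

/* The address that &sc->sc_mtx would evaluate to when sc == NULL. */
size_t
model_null_mtx_address(void)
{
    return (offsetof(struct model_dma_softc, sc_mtx));
}

/* Models the attach step: only after this is the softc pointer valid. */
void
model_dma_attach(void)
{
    model_softc_storage.ch_in_use = 0;
    model_dma_sc = &model_softc_storage;
}

/* Same shape as the quoted function, plus a guard for a missing softc. */
int
model_dma_allocate(int req_ch)
{
    struct model_dma_softc *sc = model_dma_sc;
    int i;

    if (req_ch >= MODEL_DMA_CH_MAX)
        return (MODEL_DMA_CH_INVALID);
    if (sc == NULL)     /* consumer ran before the DMA controller attached */
        return (MODEL_DMA_CH_INVALID);

    if (req_ch >= 0) {
        /* Specific channel requested. */
        if (sc->ch_in_use & (1u << req_ch))
            return (MODEL_DMA_CH_INVALID);
        sc->ch_in_use |= 1u << req_ch;
        return (req_ch);
    }

    /* Auto (req_ch < 0): take the first free channel. */
    for (i = 0; i < MODEL_DMA_CH_MAX; i++) {
        if ((sc->ch_in_use & (1u << i)) == 0) {
            sc->ch_in_use |= 1u << i;
            return (i);
        }
    }
    return (MODEL_DMA_CH_INVALID);
}
```

Calling model_dma_allocate before model_dma_attach corresponds to the
failing dtb ordering (mmc@7e300000 ahead of dma@7e007000); calling it
afterwards corresponds to the working ordering.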
>>
>>
>>> The 1st mode happens for (I've added the -fails notation):
>>>
>>> firmware-1.20210831-fails/boot/
>>> firmware-1.20210928-fails/boot/
>>> firmware-1.20211007-fails/boot/
>>> firmware-1.20211029-fails/boot/
>>> firmware-1.20211118-fails/boot/
>>> firmware-1.20220308_buster-fails/boot/
>>> (The _buster one has firmware from 2021-Dec-01, which
>>> is before all the tagged releases listed below.
>>> It looks like the switch to the new major kernel
>>> version after buster came with other changes that
>>> FreeBSD has not tracked.)
>>>
>>>
>>> The 2nd mode happens for the following. (Again with extra
>>> notation.) There are a lot more error messages before the
>>> panic happens for these. The firmware builds for these
>>> are more recent than for the above list.
>>>
>>>
>>> firmware-1.20220118-fails/boot/
>>>
>>> firmware-1.20220120-fails/boot/
>>> firmware-1.20220308-fails-non-kernels-same-as-1.20220120/boot/
>>> (I did not repeat the testing of the unchanged firmware.
>>> I just did the "diff -r" to discover the lack of change.)
>>>
>>> firmware-1.20220328-fails/boot/
>>> firmware-1.20220331-fails-non-kernels-same-as-firmware-1.20220328-but-for-bcm2711-dtb-files/boot/
>>> (Since the .dtb for the RPi4B was different, I did test this.)
>
> It looks like the extra messages, blocks of:
>
> clk_fixed4: <Fixed clock> disabled on ofwbus0
> clk_fixed4: Cannot FDT parameters.
> device_attach: clk_fixed4 attach returned 6
>
> are tied to new dtb content in 2022's dtb updates:
>
> cam1_clk {
>     compatible = "fixed-clock";
>     #clock-cells = <0x00000000>;
>     status = "disabled";
>     phandle = <0x000000e2>;
> };
> . . .
> cam0_clk {
>     compatible = "fixed-clock";
>     #clock-cells = <0x00000000>;
>     status = "disabled";
>     phandle = <0x000000e4>;
> };
>
> These 2 did not exist back when the 1st failure mode
> started. They appear to be processed repeatedly because
> they are never really handled --leading to lots of
> messages.
>
> The messages may just be noise for activity that is
> not contributing to boot failures at all. So fixing
> what I called the 1st failure mode might actually fix
> booting for all the firmware versions after the
> version tagged 1.20210805 .
>
>>> The failures look like (each test shown) . . .
>>>
>>>
>>> . . .
>>

===
Mark Millard
marklmi at yahoo.com
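
As an illustration of the clk_fixed noise discussed above: the two quoted
cam*_clk nodes are marked status = "disabled" and, as quoted, carry no
clock-frequency property. The following is a minimal sketch of how a
fixed-clock probe step could classify such nodes, assuming the usual
devicetree convention that a missing status property, "okay", or the legacy
"ok" means the node is enabled. The model_node struct, the result enum, and
both functions are hypothetical names for this sketch, not FreeBSD API and
not the clk_fixed driver's code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified view of one devicetree node. */
struct model_node {
    const char *compatible;      /* first "compatible" string */
    const char *status;          /* NULL when "status" is absent */
    bool        has_clock_freq;  /* is "clock-frequency" present? */
};

enum model_probe_result {
    MODEL_NO_MATCH,       /* not a fixed-clock node */
    MODEL_SKIP_DISABLED,  /* status says the node is not in use */
    MODEL_FAIL_NO_FREQ,   /* enabled, but nothing to build a clock from */
    MODEL_ATTACH          /* enabled and complete */
};

/* A node counts as enabled when status is absent, "okay", or "ok". */
bool
model_status_okay(const struct model_node *n)
{
    if (n->status == NULL)
        return (true);
    return (strcmp(n->status, "okay") == 0 ||
        strcmp(n->status, "ok") == 0);
}

/* Classify a node; disabled nodes are skipped before any parsing. */
enum model_probe_result
model_fixed_clock_probe(const struct model_node *n)
{
    if (n->compatible == NULL ||
        strcmp(n->compatible, "fixed-clock") != 0)
        return (MODEL_NO_MATCH);
    if (!model_status_okay(n))
        return (MODEL_SKIP_DISABLED);
    if (!n->has_clock_freq)
        return (MODEL_FAIL_NO_FREQ);
    return (MODEL_ATTACH);
}
```

With this ordering of checks, a node shaped like the quoted cam1_clk would
be skipped quietly rather than reaching the parameter parsing that produces
an attach error.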