Re: Recent commits reject RPi4B booting: pcib0 vs. pcib1 "rman_manage_region: <pcib1 memory window> request" leads to panic
Date: Wed, 14 Feb 2024 18:45:45 UTC
Hey John, On Wed, Feb 14, 2024 at 10:30 AM John Baldwin <jhb@freebsd.org> wrote: > On 2/14/24 8:42 AM, Warner Losh wrote: > > On Wed, Feb 14, 2024 at 9:08 AM John Baldwin <jhb@freebsd.org> wrote: > > > >> On 2/12/24 5:57 PM, Mark Millard wrote: > >>> On Feb 12, 2024, at 16:36, Mark Millard <marklmi@yahoo.com> wrote: > >>> > >>>> On Feb 12, 2024, at 16:10, Mark Millard <marklmi@yahoo.com> wrote: > >>>> > >>>>> On Feb 12, 2024, at 12:00, Mark Millard <marklmi@yahoo.com> wrote: > >>>>> > >>>>>> [Gack: I was looking at the wrong vintage of source code, predating > >>>>>> your changes: wrong system used.] > >>>>>> > >>>>>> > >>>>>> On Feb 12, 2024, at 10:41, Mark Millard <marklmi@yahoo.com> wrote: > >>>>>> > >>>>>>> On Feb 12, 2024, at 09:32, John Baldwin <jhb@freebsd.org> wrote: > >>>>>>> > >>>>>>>> On 2/9/24 8:13 PM, Mark Millard wrote: > >>>>>>>>> Summary: > >>>>>>>>> pcib0: <BCM2838-compatible PCI-express controller> mem > >> 0x7d500000-0x7d50930f irq 80,81 on simplebus2 > >>>>>>>>> pcib0: parsing FDT for ECAM0: > >>>>>>>>> pcib0: PCI addr: 0xc0000000, CPU addr: 0x600000000, Size: > >> 0x40000000 > >>>>>>>>> . . . > >>>>>>>>> rman_manage_region: <pcib1 memory window> request: start > >> 0x600000000, end 0x6000fffff > >>>>>>>>> panic: Failed to add resource to rman > >>>>>>>> > >>>>>>>> Hmmm, I suspect this is due to the way that bus_translate_resource > >> works which is > >>>>>>>> fundamentally broken. It rewrites the start address of a resource > >> in-situ instead > >>>>>>>> of keeping downstream resources separate from the upstream > >> resources. For example, > >>>>>>>> I don't see how you could ever release a resource in this design > >> without completely > >>>>>>>> screwing up your rman. That is, I expect trying to detach a PCI > >> device behind a > >>>>>>>> translating bridge that uses the current approach should corrupt > >> the allocated > >>>>>>>> resource ranges in an rman long before my changes. > >>>>>>>> > >>>>>>>> That said, that doesn't really explain the panic. Hmm, the panic > >> might be because > >>>>>>>> for PCI bridge windows the driver now passes RF_ACTIVE and the > >> bus_translate_resource > >>>>>>>> hack only kicks in the activate_resource method of > >> pci_host_generic.c. > >>>>>>>> > >>>>>>>>> Detail: > >>>>>>>>> . . . > >>>>>>>>> pcib0: <BCM2838-compatible PCI-express controller> mem > >> 0x7d500000-0x7d50930f irq 80,81 on simplebus2 > >>>>>>>>> pcib0: parsing FDT for ECAM0: > >>>>>>>>> pcib0: PCI addr: 0xc0000000, CPU addr: 0x600000000, Size: > >> 0x40000000 > >>>>>>>> > >>>>>>>> This indicates this is a translating bus. > >>>>>>>> > >>>>>>>>> pcib1: <PCI-PCI bridge> irq 91 at device 0.0 on pci0 > >>>>>>>>> rman_manage_region: <pcib1 bus numbers> request: start 0x1, end > 0x1 > >>>>>>>>> pcib0: rman_reserve_resource: start=0xc0000000, end=0xc00fffff, > >> count=0x100000 > >>>>>>>>> rman_reserve_resource_bound: <PCIe Memory> request: [0xc0000000, > >> 0xc00fffff], length 0x100000, flags 102, device pcib1 > >>>>>>>>> rman_reserve_resource_bound: trying 0xffffffff > <0xc0000000,0xfffff> > >>>>>>>>> considering [0xc0000000, 0xffffffff] > >>>>>>>>> truncated region: [0xc0000000, 0xc00fffff]; size 0x100000 > >> (requested 0x100000) > >>>>>>>>> candidate region: [0xc0000000, 0xc00fffff], size 0x100000 > >>>>>>>>> allocating from the beginning > >>>>>>>>> rman_manage_region: <pcib1 memory window> request: start > >> 0x600000000, end 0x6000fffff > >>>>> > >>>>> What you later typed does not match: > >>>>> > >>>>> 0x600000000 > >>>>> 0x6000fffff > >>>>> > >>>>> You later typed: > >>>>> > >>>>> 0x60000000 > >>>>> 0x600fffffff > >>>>> > >>>>> This seems to have lead to some confusion from using the > >>>>> wrong figure(s). > >>>>> > >>>>>>>> The fact that we are trying to reserve the CPU addresses in the > >> rman is because > >>>>>>>> bus_translate_resource rewrote the start address in the resource > >> after it was allocated. > >>>>>>>> > >>>>>>>> That said, I can't see why rman_manage_region would actually fail. > >> At this point the > >>>>>>>> rman is empty (this is the first call to rman_manage_region for > >> "pcib1 memory window"), > >>>>>>>> so only the check that should be failing are the checks against > >> rm_start and > >>>>>>>> rm_end. For the memory window, rm_start is always 0, and rm_end > is > >> always > >>>>>>>> 0xffffffff, so both the old (0xc00000000 - 0xc00fffff) and new > >> (0x60000000 - 0x600fffffff) > >>>>>>>> ranges are within those bounds. > >>>>> > >>>>> No: > >>>>> > >>>>> 0xffffffff > >>>>> > >>>>> .vs (actual): > >>>>> > >>>>> 0x600000000 > >>>>> 0x6000fffff > >> > >> Ok, then this explains the failure if the "raw" addresses are above > 4G. I > >> have > >> access to an emag I'm currently using to test fixes to > pci_host_generic.c > >> to > >> avoid corrupting struct resource objects. I'll post the diff once I've > got > >> something verified to work. > >> > >>> It looks to me like in sys/dev/pci/pci_pci.c the: > >>> > >>> static void > >>> pcib_probe_windows(struct pcib_softc *sc) > >>> { > >>> . . . > >>> pcib_alloc_window(sc, &sc->mem, SYS_RES_MEMORY, 0, > 0xffffffff); > >>> . . . > >>> > >>> is just inappropriately restrictive about where in the system > >>> address space a PCIe can validly be mapped to on the high end. > >>> That, in turn, leads to the rejection on the RPi4B now that > >>> the range use is checked. > >> > >> No, the physical register in PCI-PCI bridges is only 32-bits. Only the > >> prefetchable BAR supports 64-bit addresses. This is why the host bridge > >> is doing a translation from the CPU side (0x600000000) to the PCI BAR > >> addresses (0xc0000000) so that the BAR addresses are down in the 32-bit > >> address range. It's also true that many PCI devices only support 32-bit > >> addresses in memory BARs. 64-bit BARs are an optional extension not > >> universally supported. > >> > >> The translation here is somewhat akin to a type of MMU where the CPU > >> addresses are mapped to PCI addresses. The problem here is that the > >> PCI BAR resources need to "stay" as PCI addresses since we depend on > >> being able to use rman_get_start/end to get the PCI addresses of > >> allocated resources, but pci_host_generic.c currently rewrites the > >> addresses. > >> > >> Probably I should remove rman_set_start/end entirely (Warner added them > >> back in 2004) as the methods don't do anything to deal with the fallout > >> that the rman.rm_list linked-list is no longer sorted by address once > >> some addresses get rewritten, etc. > >> > > > > At the time, they made sense. Removing it, though may take some doing > > since we use it in about 284 places in sys/dev today... Somewhat more > > pervasive than I'd have thought they'd be... > > Eh, I only find the one that I'm now removing: > > % git grep -E 'rman_set_(start|end)' sys/ > sys/dev/pci/pci_host_generic.c: rman_set_start(r, start); > sys/dev/pci/pci_host_generic.c: rman_set_end(r, end); > sys/kern/subr_rman.c:rman_set_start(struct resource *r, rman_res_t start) > sys/kern/subr_rman.c:rman_set_end(struct resource *r, rman_res_t end) > sys/sys/rman.h:void rman_set_end(struct resource *_r, rman_res_t _end); > sys/sys/rman.h:void rman_set_start(struct resource *_r, rman_res_t > _start); > > Also, I managed to boot the emag I have access to this morning. I had to > fix a few other bugs in acpi(4) for my changes in pci_host_generic to work > but will post reviews later today. > That's what happens when you grep for 'get' instead of 'set' before the morning caffeine kicks in .. Warner