Re: Pinebook pro IOMMU enabled crashes
Date: Thu, 30 Sep 2021 16:39:43 UTC
On 30.09.2021 18.28, Jesper Schmitz Mouridsen wrote:
> On 23.09.2021 19.58, Jesper Schmitz Mouridsen wrote:
>> Hi
>>
>> I just rebuild a generic arm64 with only this change:
>>
>> diff --git a/sys/arm64/conf/GENERIC b/sys/arm64/conf/GENERIC
>> index c716183aae61..7a609db412ca 100644
>> --- a/sys/arm64/conf/GENERIC
>> +++ b/sys/arm64/conf/GENERIC
>> @@ -19,7 +19,7 @@
>>
>> cpu ARM64
>> ident GENERIC
>> -
>> +options IOMMU
>> include "std.arm64"
>> include "std.dev"
>>
>> FreeBSD 14.0-CURRENT #6 main-n249584-fd69939e79a6-dirty
>>
>> It does not happen without the nvme attached.
>>
>> pcib0: <Rockchip PCIe controller> mem
>> 0xf8000000-0xf9ffffff,0xfd000000-0xfdffffff irq 6,7,8 on ofwbus0
>> pci0: <OFW PCI bus> on pcib0
>> pcib1: <PCI-PCI bridge> at device 0.0 on pci0
>> pcib0: failed to reserve resource for pcib1
>> pcib1: failed to allocate initial memory window: 0-0xfffff
>> pci1: <PCI bus> on pcib1
>> nvme0: <Generic NVMe Device> at device 0.0 on pci1
>> Fatal data abort:
>> x0: 0
>> x1: 1000
>> x2: 10040
>> x3: 2000
>> x4: 1
>> x5: ffff00009a7e0168
>> x6: 1400000000000000
>> x7: 10000000000000
>> x8: 1168
>> x9: 1
>> x10: 0
>> x11: ffff000000e8c8c0
>> x12: ffff000000e8c840
>> x13: 1
>> x14: 10000
>> x15: 1
>> x16: 10000
>> x17: ffff000000e8c85c
>> x18: ffff000001064180
>> x19: ffff000001064248
>> x20: 0
>> x21: ffff00009a7df000
>> x22: ffffa0000102ea00
>> x23: ffffa00000bb6b80
>> x24: ffffa00001086200
>> x25: ffff000000aa8478
>> x26: ffffa00001086300
>> x27: ffff000000dda000
>> x28: 7
>> x29: ffff000001064190
>> sp: ffff000001064180
>> lr: ffff00000075f20c
>> elr: ffff00000078a654
>> spsr: 200000c5
>> far: 0
>> esr: 96000004
>> panic: vm_fault failed: ffff00000078a654 error 1
>> cpuid = 0
>> time = 1
>> KDB: stack backtrace:
>> db_trace_self() at db_trace_self
>> db_trace_self_wrapper() at db_trace_self_wrapper+0x30
>> vpanic() at vpanic+0x184
>> panic() at panic+0x44
>> data_abort() at data_abort+0x23c
>> handle_el1h_sync() at handle_el1h_sync+0x78
>> --- exception, esr 0x96000004
>> iommu_map_msi() at iommu_map_msi+0x20
>> gicv3_iommu_init() at gicv3_iommu_init+0x4c
>> intr_alloc_msix() at intr_alloc_msix+0x13c
>> rk_pcie_alloc_msix() at rk_pcie_alloc_msix+0xfc
>> pci_alloc_msix_method() at pci_alloc_msix_method+0x1a8
>> nvme_pci_attach() at nvme_pci_attach+0x378
>> device_attach() at device_attach+0x400
>> device_probe_and_attach() at device_probe_and_attach+0x7c
>> bus_generic_attach() at bus_generic_attach+0x18
>> pci_attach() at pci_attach+0xe8
>> device_attach() at device_attach+0x400
>> device_probe_and_attach() at device_probe_and_attach+0x7c
>> bus_generic_attach() at bus_generic_attach+0x18
>> device_attach() at device_attach+0x400
>> device_probe_and_attach() at device_probe_and_attach+0x7c
>> bus_generic_attach() at bus_generic_attach+0x18
>> pci_attach() at pci_attach+0xe8
>> device_attach() at device_attach+0x400
>> device_probe_and_attach() at device_probe_and_attach+0x7c
>> bus_generic_attach() at bus_generic_attach+0x18
>> rk_pcie_attach() at rk_pcie_attach+0x14cc
>> device_attach() at device_attach+0x400
>> device_probe_and_attach() at device_probe_and_attach+0x7c
>> bus_generic_new_pass() at bus_generic_new_pass+0xf8
>> bus_generic_new_pass() at bus_generic_new_pass+0xa8
>> bus_generic_new_pass() at bus_generic_new_pass+0xa8
>> bus_set_pass() at bus_set_pass+0x4c
>> mi_startup() at mi_startup+0x12c
>> virtdone() at virtdone+0x6c
>>
>> /jsm
>>
>>
>> On 23.09.2021 09.19, Emmanuel Vadot wrote:
>>> On Sat, 18 Sep 2021 13:15:45 +0200
>>> Jesper Schmitz Mouridsen <jsm@FreeBSD.org> wrote:
>>>
>>>> Hi
>>>>
>>>> Perhaps this one
>>>> https://www.mail-archive.com/svn-src-head@freebsd.org/msg126068.html is
>>>> giving troubles?
>>>>
>>>> main-n249225-f673cc5edac3-dirty
>>>> nvme0: <Generic NVMe Device> at device 0.0 on pci1
>>>> Fatal data abort:
>>>> x0: 0
>>>> x1: 1000
>>>> x2: 10040
>>>> x3: 2000
>>>> x4: 1
>>>> x5: ffff00009a7a0168
>>>> x6: 1d00000000000000
>>>> x7: 10000000000000
>>>> x8: 1168
>>>> x9: 1
>>>> x10: 0
>>>> x11: ffff000000f35140
>>>> x12: ffff000000f350c0
>>>> x13: 1
>>>> x14: 10000
>>>> x15: 1
>>>> x16: 10000
>>>> x17: ffff000000f350dc
>>>> x18: ffff00000110d180
>>>> x19: ffff00000110d248
>>>> x20: 0
>>>> x21: ffff00009a79f000
>>>> x22: ffffa000010b0a00
>>>> x23: ffffa000010a2880
>>>> x24: ffffa0000116da00
>>>> x25: ffff000000b4fd78
>>>> x26: ffffa0000116db00
>>>> x27: ffff000000e83000
>>>> x28: 7
>>>> x29: ffff00000110d190
>>>> sp: ffff00000110d180
>>>> lr: ffff00000077520c
>>>> elr: ffff0000007a03ac
>>>> spsr: 200000c5
>>>> far: 0
>>>> esr: 96000004
>>>> panic: vm_fault failed: ffff0000007a03ac error 1
>>>> cpuid = 0
>>>> time = 1
>>>> KDB: stack backtrace:
>>>> db_trace_self() at db_trace_self
>>>> db_trace_self_wrapper() at db_trace_self_wrapper+0x30
>>>> vpanic() at vpanic+0x184
>>>> panic() at panic+0x44
>>>> data_abort() at data_abort+0x23c
>>>> handle_el1h_sync() at handle_el1h_sync+0x78
>>>> --- exception, esr 0x96000004
>>>> iommu_map_msi() at iommu_map_msi+0x20
>>>> gicv3_iommu_init() at gicv3_iommu_init+0x4c
>>>> intr_alloc_msix() at intr_alloc_msix+0x13c
>>>> rk_pcie_alloc_msix() at rk_pcie_alloc_msix+0xfc
>>>> pci_alloc_msix_method() at pci_alloc_msix_method+0x1a8
>>>> nvme_pci_attach() at nvme_pci_attach+0x378
>>>> device_attach() at device_attach+0x400
>>>> device_probe_and_attach() at device_probe_and_attach+0x7c
>>>> bus_generic_attach() at bus_generic_attach+0x18
>>>> pci_attach() at pci_attach+0xe8
>>>> device_attach() at device_attach+0x400
>>>> device_probe_and_attach() at device_probe_and_attach+0x7c
>>>> bus_generic_attach() at bus_generic_attach+0x18
>>>> device_attach() at device_attach+0x400
>>>> device_probe_and_attach() at device_probe_and_attach+0x7c
>>>> bus_generic_attach() at bus_generic_attach+0x18
>>>> pci_attach() at pci_attach+0xe8
>>>> device_attach() at device_attach+0x400
>>>> device_probe_and_attach() at device_probe_and_attach+0x7c
>>>> bus_generic_attach() at bus_generic_attach+0x18
>>>> rk_pcie_attach() at rk_pcie_attach+0x14cc
>>>> device_attach() at device_attach+0x400
>>>> device_probe_and_attach() at device_probe_and_attach+0x7c
>>>> bus_generic_new_pass() at bus_generic_new_pass+0xf8
>>>> bus_generic_new_pass() at bus_generic_new_pass+0xa8
>>>> bus_generic_new_pass() at bus_generic_new_pass+0xa8
>>>> bus_set_pass() at bus_set_pass+0x4c
>>>> mi_startup() at mi_startup+0x12c
>>>> virtdone() at virtdone+0x6c
>>>>
>>> That's an old commit. Did you see this panic only recently or ?
>>>
>>
>
>
> Even on stable/13-n247374-9faebc1e664d-dirty
>
> I get the same backtrace when IOMMU is enabled and the nvme is attached.
>
> pcib1: <PCI-PCI bridge> at device 0.0 on pci0
> pcib0: failed to reserve resource for pcib1
> pcib1: failed to allocate initial memory window: 0-0xfffff
> pci1: <PCI bus> on pcib1
> nvme0: <Generic NVMe Device> at device 0.0 on pci1
> Fatal data abort:
> x0: 0
> x1: 1000
> x2: 10040
> x3: 2000
> x4: 1
> x5: ffff00009a99e160
> x6: 1400000000000000
> x7: 10000000000000
> x8: 1160
> x9: ffff000000cd7cc0
> x10: 0
> x11: ffff000000d89540
> x12: ffff000000d894c0
> x13: 1
> x14: 10000
> x15: 1
> x16: 10000
> x17: 0
> x18: ffff000000f5c250
> x19: ffff000000f5c318
> x20: 0
> x21: ffff00009a99d000
> x22: ffffa00000f06200
> x23: ffffa00000f49700
> x24: ffffa00000f8f500
> x25: ffff0000009b85f8
> x26: ffffa00000f8f600
> x27: ffff000000cd7000
> x28: 7
> x29: ffff000000f5c260
> sp: ffff000000f5c250
> lr: ffff0000006bf3dc
> elr: ffff0000006e15d0
> spsr: 600001c5
> far: 0
> esr: 96000004
> panic: vm_fault failed: ffff0000006e15d0
> cpuid = 0
> time = 1
> KDB: stack backtrace:
> #0 0xffff00000047c304 at kdb_backtrace+0x60
> #1 0xffff000000437fd8 at vpanic+0x184
> #2 0xffff000000437e50 at panic+0x44
> #3 0xffff0000006d692c at data_abort+0x204
> #4 0xffff0000006bb874 at handle_el1h_sync+0x74
> #5 0xffff0000006bf3d8 at gicv3_iommu_init+0x4c
> #6 0xffff0000006bf3d8 at gicv3_iommu_init+0x4c
> #7 0xffff0000006b1940 at intr_alloc_msix+0x110
> #8 0xffff0000007860c0 at rk_pcie_alloc_msix+0xfc
> #9 0xffff000000219bbc at pci_alloc_msix_method+0x1a8
> #10 0xffff00000020ba64 at nvme_pci_attach+0x378
> #11 0xffff00000046bd80 at device_attach+0x400
> #12 0xffff00000046d14c at bus_generic_attach+0x4c
> #13 0xffff000000221f30 at pci_attach+0xe0
> #14 0xffff00000046bd80 at device_attach+0x400
> #15 0xffff00000046d14c at bus_generic_attach+0x4c
> #16 0xffff00000046bd80 at device_attach+0x400
> #17 0xffff00000046d14c at bus_generic_attach+0x4c
> Uptime: 1s
>

After

git checkout 50cedfede3d21824ec6023324b3ad41a435e1815 sys/arm64/arm64/gicv3_its.c

the problem goes away. That commit is the one before "Add IOMMU support to GICv3 Interrupt Translation Service (ITS) driver." (ba196aec7dad1b73a9a3b86a06259d5e81f16fad)
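
For anyone reading the dumps: all three show "esr: 96000004" together with "far: 0". Below is a rough sketch that splits that ESR value into its fields, assuming the standard AArch64 ESR_EL1 layout (EC in bits 31:26, IL in bit 25, ISS in bits 24:0, and for data aborts WnR in ISS bit 6 and DFSC in ISS bits 5:0). The function name decode_esr and the small lookup tables are only illustrative and cover just the codes relevant here; they are not taken from the FreeBSD source.

```python
# Sketch: decode an AArch64 ESR_EL1 value as printed in the panic output.
# Assumes the architectural field layout; tables are deliberately partial.

EC_NAMES = {
    0x24: "Data Abort from a lower Exception level",
    0x25: "Data Abort taken without a change in Exception level",
}

DFSC_NAMES = {
    0x04: "Translation fault, level 0",
    0x05: "Translation fault, level 1",
    0x06: "Translation fault, level 2",
    0x07: "Translation fault, level 3",
}


def decode_esr(esr: int) -> dict:
    """Split an ESR value into EC / IL / ISS and, for data aborts, WnR / DFSC."""
    ec = (esr >> 26) & 0x3F      # exception class, bits [31:26]
    il = (esr >> 25) & 0x1       # instruction length bit, bit 25
    iss = esr & 0x1FFFFFF        # instruction specific syndrome, bits [24:0]
    result = {"ec": ec, "ec_name": EC_NAMES.get(ec, "other"), "il": il, "iss": iss}
    if ec in (0x24, 0x25):
        dfsc = iss & 0x3F        # data fault status code, ISS bits [5:0]
        result["write"] = bool((iss >> 6) & 1)   # WnR: True = write, False = read
        result["dfsc"] = dfsc
        result["dfsc_name"] = DFSC_NAMES.get(dfsc, "other")
    return result


info = decode_esr(0x96000004)
for key, value in info.items():
    print(key, value)
```

Under those assumptions the value decodes to a data abort taken at the current exception level, a read access, with a level 0 translation fault. Combined with far: 0, that looks like a read through a NULL pointer, which in the CURRENT traces lands at iommu_map_msi+0x20.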