Re: Recent commits reject RPi4B booting: pcib0 vs. pcib1 "rman_manage_region: <pcib1 memory window> request" leads to panic
Date: Mon, 12 Feb 2024 17:45:27 UTC
On 2/12/24 12:36, John Baldwin wrote:
> On 2/10/24 2:09 PM, Michael Butler wrote:
>> I have stability problems with anything at or after this commit
>> (b377ff8) on an amd64 laptop. While I see the following panic logged, no
>> crash dump is preserved :-( It happens after ~5-6 minutes running in KDE
>> (X).
>>
>> Reverting to 36efc64 seems to work reliably (after ACPI changes but
>> before the problematic PCI one)
>>
>> kernel: Fatal trap 12: page fault while in kernel mode
>> kernel: cpuid = 2; apic id = 02
>> kernel: fault virtual address = 0x48
>> kernel: fault code = supervisor read data, page not present
>> kernel: instruction pointer = 0x20:0xffffffff80acb962
>> kernel: stack pointer = 0x28:0xfffffe00c4318d80
>> kernel: frame pointer = 0x28:0xfffffe00c4318d80
>> kernel: code segment = base 0x0, limit 0xfffff, type 0x1b
>> kernel: = DPL 0, pres 1, long 1, def32 0, gran 1
>> kernel: processor eflags = interrupt enabled, resume, IOPL = 0
>> kernel: current process = 2 (clock (0))
>> kernel: rdi: fffff802e460c000 rsi: 0000000000000000 rdx: 0000000000000002
>> kernel: rcx: 0000000000000000 r8: 000000000000001e r9: fffffe00c4319000
>> kernel: rax: 0000000000000002 rbx: fffff802e460c000 rbp: fffffe00c4318d80
>> kernel: r10: 0000000000001388 r11: 000000007ffc765d r12: 000f000000000000
>> kernel: r13: 0002000000000000 r14: fffff8000193e740 r15: 0000000000000000
>> kernel: trap number = 12
>> kernel: panic: page fault
>> kernel: cpuid = 2
>> kernel: time = 1707573802
>> kernel: Uptime: 6m19s
>> kernel: Dumping 942 out of 16242 MB:..2%..11%..21%..31%..41%..51%..62%..72%..82%..92%
>> kernel: Dump complete
>> kernel: Automatic reboot in 15 seconds - press a key on the console to abort
>
> Without a stack trace it is pretty much impossible to debug a panic like
> this. Do you have KDB_TRACE enabled in your kernel config? I'm also not
> sure how the PCI changes can result in a panic post-boot. If you were
> going to have problems they would be during device attach, not after you
> are booted and running X.
>
> Short of a stack trace, you can at least use lldb or gdb to look up the
> source line associated with the faulting instruction pointer (as long as
> it isn't in a kernel module), e.g. for gdb you would use
> 'gdb /boot/kernel/kernel' and then 'l *<instruction pointer address>',
> e.g. from above: 'l *0xffffffff80acb962'

I suspect the absence of a core dump was due to my use of tmpfs for /tmp
and /var/tmp while still having clear_tmp enabled in rc.conf (that may
touch swap on restart).

Since then, I've removed tmpfs and everything under /usr/obj, and am
rebuilding from scratch. I'll update when it finally finishes (i5-3340s
are quick :-()

Michael
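
P.S. For anyone wanting to script the gdb lookup suggested above, here is a minimal sketch. It pulls the faulting address out of an "instruction pointer" line in the panic log and builds the matching gdb invocation. The helper names (faulting_address, gdb_list_command) are made up for illustration, and it assumes the kernel at /boot/kernel/kernel has debug symbols available and that the fault is not inside a kernel module. It only prints the command; it does not run gdb.

```python
import re
import shlex

# Sample line copied from the panic report quoted in this thread.
SAMPLE_LOG = "kernel: instruction pointer = 0x20:0xffffffff80acb962"

# Matches "instruction pointer = <segment>:<address>" and captures the address.
IP_RE = re.compile(r"instruction pointer\s*=\s*0x[0-9a-fA-F]+:(0x[0-9a-fA-F]+)")


def faulting_address(log_text):
    """Return the faulting instruction address as a hex string, or None."""
    match = IP_RE.search(log_text)
    return match.group(1) if match else None


def gdb_list_command(address, kernel="/boot/kernel/kernel"):
    """Build (but do not run) a gdb command that lists source at `address`.

    Hypothetical helper; the kernel path is the one mentioned in the thread.
    """
    return ["gdb", "-batch", "-ex", f"list *{address}", kernel]


if __name__ == "__main__":
    addr = faulting_address(SAMPLE_LOG)
    if addr is None:
        raise SystemExit("no instruction pointer found in log")
    # Print a shell-quoted command line that can be pasted into a terminal.
    print(shlex.join(gdb_list_command(addr)))
```

The printed command is the non-interactive equivalent of starting 'gdb /boot/kernel/kernel' and typing 'l *<address>' at the prompt.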