Re: ZFS + FreeBSD XEN dom0 panic
Date: Sat, 26 Mar 2022 22:38:00 UTC
On 2022.03.26. 16:38, Roger Pau Monné wrote:
> ..
> It's weird, because here you get a page fault, but there are also
> traces with:
> ..
> general protection fault while in kernel mode
> ..
> That show a general protection fault instead of a page fault.

Yes indeed, I had not noticed this. Grepping across 34 stored panic log files, I see that 28 are page faults, 4 are general protection faults, and 2 are something else. I thought maybe RAM size influences this, but the page faults occur with 2G, 4G, 6G and 8G Dom0, and the general protection faults with 2G, 4G and 8G. I have no idea what triggers which, since the stress tests and command line args are more or less the same. The builds differ in patches, some debug info, etc.

Almost all panic traces have "rman_is_region_manager" in the middle; looking at all of them together turned out to be interesting. I'll attach the unique panic traces, since some also include snprintf and kvprintf, which may be helpful. Unfortunately I do not know which version was used and which patches were applied.

> I've also noticed it seems to always be 'devmatch' the process that
> triggers the panic.

Yes, that seems to be the case most of the time. There are 3 cases where the process is "xbbd* taskq": 2 cases with 2G RAM, 1 with 6G.

> I've been able to get a better trace with gdb and your debug symbols,
> and this is:
>
> (gdb) info line *0xffffffff80c6a2b2
> Line 1386 of "/usr/src/sys/kern/subr_bus.c" starts at address 0xffffffff80c6a2b2 <device_get_name+18>
>    and ends at 0xffffffff80c6a2b6 <device_get_name+22>.
> (gdb) info line *0xffffffff80c86ed1
> Line 1052 of "/usr/src/sys/kern/subr_rman.c" starts at address 0xffffffff80c86ecc <sysctl_rman+540>
>    and ends at 0xffffffff80c86ed5 <sysctl_rman+549>.

This is a nice find!

> I'm trying to figure out how the device could be removed or
> disconnected from the rman. I will try to create a patch to catch the
> device that leaves rman regions when destroyed/removed.

Okay, I'll apply it when possible.
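A tally like the one above (fault type and current process per panic log) can be scripted instead of grepped by hand. The following is only a minimal sketch: the function names `classify` and `tally` are made up for illustration, and it assumes each log contains the usual "Fatal trap N: ... while in kernel mode" and "current process = PID (name)" lines.

```python
import re
from collections import Counter

# Matches e.g. "Fatal trap 12: page fault while in kernel mode".
TRAP_RE = re.compile(r"Fatal trap (\d+): (.+?) while in (?:kernel|user) mode")
# Matches e.g. "current process = 16 (xenwatch)"; whitespace may be tabs.
PROC_RE = re.compile(r"current process\s*=\s*\d+ \((.+?)\)")


def classify(log_text):
    """Return (fault type, process name) for one panic log (hypothetical helper)."""
    trap = TRAP_RE.search(log_text)
    proc = PROC_RE.search(log_text)
    return (
        trap.group(2) if trap else "other",
        proc.group(1) if proc else "unknown",
    )


def tally(logs):
    """Count (fault type, process name) pairs over an iterable of log texts."""
    return Counter(classify(text) for text in logs)
```

Feeding it the contents of the stored panic log files gives a count per (fault type, process) pair, which makes it easier to see whether the "xbbd* taskq" cases line up with one fault type.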
I ran xen-debug on a system with the blkback.patch applied, the one you sent in your next message after this one. The system panicked with a new trace:

Fatal trap 12: page fault while in kernel mode
cpuid = 2; apic id = 04
fault virtual address   = 0xa4
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff80c90ed0
stack pointer           = 0x28:0xfffffe0051927ab0
frame pointer           = 0x28:0xfffffe0051927ad0
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 16 (xenwatch)
trap number             = 12
panic: page fault
cpuid = 1
time = 1648331592
KDB: stack backtrace:
#0 0xffffffff80c7c275 at kdb_backtrace+0x65
#1 0xffffffff80c2e2d1 at vpanic+0x181
#2 0xffffffff80c2e143 at panic+0x43
#3 0xffffffff810c8b97 at trap+0xba7
#4 0xffffffff810c8bef at trap+0xbff
#5 0xffffffff810c8243 at trap+0x253
#6 0xffffffff810a0838 at calltrap+0x8
#7 0xffffffff80a98515 at xbd_instance_create+0x7895
#8 0xffffffff80a98462 at xbd_instance_create+0x77e2
#9 0xffffffff80a9619b at xbd_instance_create+0x551b
#10 0xffffffff80f95c54 at xenbusb_localend_changed+0x7c4
#11 0xffffffff80ab0ef4 at xs_unlock+0x704
#12 0xffffffff80beaede at fork_exit+0x7e
#13 0xffffffff810a18ae at fork_trampoline+0xe

cat /tmp/panic.log | sed -Ee 's/^#[0-9]* //' -e 's/ .*//' | xargs addr2line -e /usr/lib/debug/boot/kernel/kernel.debug
/usr/src/sys/kern/subr_kdb.c:443
/usr/src/sys/kern/kern_shutdown.c:0
/usr/src/sys/kern/kern_shutdown.c:844
/usr/src/sys/amd64/amd64/trap.c:944
/usr/src/sys/amd64/amd64/trap.c:0
/usr/src/sys/amd64/amd64/trap.c:0
/usr/src/sys/amd64/amd64/exception.S:292
/usr/src/sys/dev/xen/blkback/blkback.c:2789
/usr/src/sys/dev/xen/blkback/blkback.c:3431
/usr/src/sys/dev/xen/blkback/blkback.c:3912
/usr/src/sys/xen/xenbus/xenbusb_back.c:238
/usr/src/sys/dev/xen/xenstore/xenstore.c:1007
/usr/src/sys/kern/kern_fork.c:1099
/usr/src/sys/amd64/amd64/exception.S:1091

Full serial log in attachment.

Thanks.
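For logs where the backtrace is mixed with other console output, the address extraction done by the sed pipeline above can also be sketched in Python. This is an illustrative sketch only: `extract_frames` and `addr2line_argv` are hypothetical names, and the regular expression assumes the "#N 0xADDR at symbol+0xOFF" frame format shown in the trace.

```python
import re

# One KDB backtrace frame: "#7 0xffffffff80a98515 at xbd_instance_create+0x7895".
FRAME_RE = re.compile(
    r"#(\d+) (0x[0-9a-fA-F]+) at ([^\s+]+)\+(0x[0-9a-fA-F]+)"
)


def extract_frames(log_text):
    """Return (frame number, address, symbol, offset) for every frame found."""
    return [
        (int(num), addr, sym, int(off, 16))
        for num, addr, sym, off in FRAME_RE.findall(log_text)
    ]


def addr2line_argv(frames, debug_kernel):
    """Build an addr2line command line for the frame addresses (not executed here)."""
    return ["addr2line", "-e", debug_kernel] + [addr for _, addr, _, _ in frames]
```

The resulting argument list can be handed to `subprocess.run` with the path to kernel.debug. Because only lines matching the frame pattern are picked up, the rest of the serial log does not need to be stripped first.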