Re: ZFS + FreeBSD XEN dom0 panic
Date: Mon, 11 Apr 2022 15:37:27 UTC
On Mon, Apr 11, 2022 at 11:47:50AM +0300, Ze Dupsys wrote:
> On 2022.04.08. 18:02, Roger Pau Monné wrote:
> > On Fri, Apr 08, 2022 at 10:45:12AM +0300, Ze Dupsys wrote:
> > > On 2022.04.05. 18:22, Roger Pau Monné wrote:
> > > > .. Thanks, sorry for the late reply, somehow the message slipped.
> > > >
> > > > I've been able to get the file:line for those, and the trace is
> > > > kind of weird; I'm not sure I know what's going on, TBH. It seems
> > > > to me the backend instance got freed while still being in the
> > > > process of connecting.
> > > >
> > > > I've made some changes that might mitigate this, but not having a
> > > > clear understanding of what's going on makes this harder.
> > > >
> > > > I've pushed the changes to:
> > > >
> > > > http://xenbits.xen.org/gitweb/?p=people/royger/freebsd.git;a=shortlog;h=refs/heads/for-leak
> > > >
> > > > (This is on top of the main branch.)
> > > >
> > > > I'm also attaching the two patches to this email.
> > > >
> > > > Let me know if those make a difference in stabilizing the system.
> > >
> > > Hi,
> > >
> > > Yes, it stabilizes the system, but I think there is still a memory
> > > leak somewhere.
> > >
> > > The system could run the tests for approximately 41 hours; it did
> > > not panic, but then started to OOM-kill everything.
> > >
> > > I did not know how to git clone the given commit, so I just applied
> > > the patches to the 13.0-RELEASE sources.
> > >
> > > The serial logs show nothing unusual, just that at some point the
> > > OOM kills start.
> >
> > Well, I think that's good^W better than before. Thanks again for all
> > the testing.
> >
> > It might be helpful now to start dumping `vmstat -m` periodically
> > while running the stress tests. As there are (hopefully) no more
> > panics now, vmstat might report which subsystem is hogging the
> > memory. It's possible it's blkback (again).
> >
> > Thanks, Roger.
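A trivial loop is enough for the periodic dump; a minimal sketch (the
interval and log path here are arbitrary):

    # append a timestamped `vmstat -m` snapshot every 60 seconds
    while true; do
        date >> /var/log/vmstat-m.log
        vmstat -m >> /var/log/vmstat-m.log
        sleep 60
    done

Diffing the first and last snapshots after the OOM kills start should
show which malloc type keeps growing.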
> Yes, it certainly is better. I applied the patch on my pre-production
> server and have not had any panic since then; still testing, though.
>
> On my stressed lab server it's a bit of a different story. On occasion
> I see a panic with this trace on serial (I can not reliably repeat it;
> sometimes it happens upon starting dom ids 1 and 2, sometimes
> mid-stress-test with dom id > 95).
>
> panic: pmap_growkernel: no memory to grow kernel
> cpuid = 2
> time = 1649485133
> KDB: stack backtrace:
> #0 0xffffffff80c57385 at kdb_backtrace+0x65
> #1 0xffffffff80c09d61 at vpanic+0x181
> #2 0xffffffff80c09bd3 at panic+0x43
> #3 0xffffffff81073eed at pmap_growkernel+0x27d
> #4 0xffffffff80f2d918 at vm_map_insert+0x248
> #5 0xffffffff80f30079 at vm_map_find+0x549
> #6 0xffffffff80f2bda6 at kmem_init+0x226
> #7 0xffffffff80c731a1 at vmem_xalloc+0xcb1
> #8 0xffffffff80c72a9b at vmem_xalloc+0x5ab
> #9 0xffffffff80c724a6 at vmem_alloc+0x46
> #10 0xffffffff80f2ac6b at kva_alloc+0x2b
> #11 0xffffffff8107f0eb at pmap_mapdev_attr+0x27b
> #12 0xffffffff810588ca at nexus_add_irq+0x65a
> #13 0xffffffff81058710 at nexus_add_irq+0x4a0
> #14 0xffffffff810585b9 at nexus_add_irq+0x349
> #15 0xffffffff80c495c1 at bus_alloc_resource+0xa1
> #16 0xffffffff8105e940 at xenmem_free+0x1a0
> #17 0xffffffff80a7e0dd at xbd_instance_create+0x943d
>
> | sed -Ee 's/^#[0-9]* //' -e 's/ .*//' | xargs addr2line -e /usr/lib/debug/boot/kernel/kernel.debug
>
> /usr/src/sys/kern/subr_kdb.c:443
> /usr/src/sys/kern/kern_shutdown.c:0
> /usr/src/sys/kern/kern_shutdown.c:843
> /usr/src/sys/amd64/amd64/pmap.c:0
> /usr/src/sys/vm/vm_map.c:0
> /usr/src/sys/vm/vm_map.c:0
> /usr/src/sys/vm/vm_kern.c:712
> /usr/src/sys/kern/subr_vmem.c:928
> /usr/src/sys/kern/subr_vmem.c:0
> /usr/src/sys/kern/subr_vmem.c:1350
> /usr/src/sys/vm/vm_kern.c:150
> /usr/src/sys/amd64/amd64/pmap.c:0
> /usr/src/sys/x86/x86/nexus.c:0
> /usr/src/sys/x86/x86/nexus.c:449
> /usr/src/sys/x86/x86/nexus.c:412
> /usr/src/sys/kern/subr_bus.c:4620
> /usr/src/sys/x86/xen/xenpv.c:123
> /usr/src/sys/dev/xen/blkback/blkback.c:3010
>
> With a gdb backtrace I think I can get a better trace, though:
>
> #0 __curthread at /usr/src/sys/amd64/include/pcpu_aux.h:55
> #1 doadump at /usr/src/sys/kern/kern_shutdown.c:399
> #2 kern_reboot at /usr/src/sys/kern/kern_shutdown.c:486
> #3 vpanic at /usr/src/sys/kern/kern_shutdown.c:919
> #4 panic at /usr/src/sys/kern/kern_shutdown.c:843
> #5 pmap_growkernel at /usr/src/sys/amd64/amd64/pmap.c:208
> #6 vm_map_insert at /usr/src/sys/vm/vm_map.c:1752
> #7 vm_map_find at /usr/src/sys/vm/vm_map.c:2259
> #8 kva_import at /usr/src/sys/vm/vm_kern.c:712
> #9 vmem_import at /usr/src/sys/kern/subr_vmem.c:928
> #10 vmem_try_fetch at /usr/src/sys/kern/subr_vmem.c:1049
> #11 vmem_xalloc at /usr/src/sys/kern/subr_vmem.c:1449
> #12 vmem_alloc at /usr/src/sys/kern/subr_vmem.c:1350
> #13 kva_alloc at /usr/src/sys/vm/vm_kern.c:150
> #14 pmap_mapdev_internal at /usr/src/sys/amd64/amd64/pmap.c:8974
> #15 pmap_mapdev_attr at /usr/src/sys/amd64/amd64/pmap.c:8990
> #16 nexus_map_resource at /usr/src/sys/x86/x86/nexus.c:523
> #17 nexus_activate_resource at /usr/src/sys/x86/x86/nexus.c:448
> #18 nexus_alloc_resource at /usr/src/sys/x86/x86/nexus.c:412
> #19 BUS_ALLOC_RESOURCE at ./bus_if.h:321
> #20 bus_alloc_resource at /usr/src/sys/kern/subr_bus.c:4617
> #21 xenpv_alloc_physmem at /usr/src/sys/x86/xen/xenpv.c:121
> #22 xbb_alloc_communication_mem at /usr/src/sys/dev/xen/blkback/blkback.c:3010
> #23 xbb_connect at /usr/src/sys/dev/xen/blkback/blkback.c:3336
> #24 xenbusb_back_otherend_changed at /usr/src/sys/xen/xenbus/xenbusb_back.c:228
> #25 xenwatch_thread at /usr/src/sys/dev/xen/xenstore/xenstore.c:1003
> #26 fork_exit at /usr/src/sys/kern/kern_fork.c:1069
> #27 <signal handler called>
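(For reference, the sed fragment quoted above is only the tail of the
pipeline; a complete invocation, assuming the serial backtrace was
saved to a file named backtrace.txt, would look like:

    # resolve each frame's return address to file:line
    grep '^#' backtrace.txt \
        | sed -Ee 's/^#[0-9]* //' -e 's/ .*//' \
        | xargs addr2line -e /usr/lib/debug/boot/kernel/kernel.debug

addr2line needs the kernel.debug that matches the running kernel,
otherwise the resolved lines will be off.)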
> There is some sort of mismatch in the info, because the panic message
> printed "panic: pmap_growkernel: no memory to grow kernel", but the
> gdb backtrace in #5, 0xffffffff81073eed in pmap_growkernel at
> /usr/src/sys/amd64/amd64/pmap.c:208, leads to these lines:
>
>   switch (pmap->pm_type) {
>   ..
>   panic("pmap_valid_bit: invalid pm_type %d", pmap->pm_type)
>
> So either the trace is off the mark, or the message in the serial
> logs is. If this was only memleak related, then it should not happen
> when dom id 1 is started, I suppose.

That's weird. I would rather trust the printed panic message than the
symbol resolution.

It seems to be some kind of memory exhaustion, as the kernel is
failing to allocate a page for use in the kernel page table. I will
try to see what can be done here.

Thanks, Roger.
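PS: since pmap_growkernel fires while growing the kernel page tables,
it might also help to sample the kernel VA sysctls next to the
periodic vmstat dumps; a sketch, assuming vm.kvm_size and vm.kvm_free
are exposed by the amd64 kernel:

    # snapshot kernel virtual address space usage
    sysctl vm.kvm_size vm.kvm_free

If vm.kvm_free keeps shrinking across the stress run, that would fit
the exhaustion theory.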