Re: ZFS + FreeBSD XEN dom0 panic
- Reply: Ze Dupsys : "Re: ZFS + FreeBSD XEN dom0 panic"
- In reply to: Roger Pau Monné : "Re: ZFS + FreeBSD XEN dom0 panic"
Date: Tue, 12 Apr 2022 15:37:48 UTC
On Mon, Apr 11, 2022 at 05:37:27PM +0200, Roger Pau Monné wrote:
> On Mon, Apr 11, 2022 at 11:47:50AM +0300, Ze Dupsys wrote:
> > On 2022.04.08. 18:02, Roger Pau Monné wrote:
> > > On Fri, Apr 08, 2022 at 10:45:12AM +0300, Ze Dupsys wrote:
> > > > On 2022.04.05. 18:22, Roger Pau Monné wrote:
> > > > > ..
> > > > > Thanks, sorry for the late reply, somehow the message slip.
> > > > >
> > > > > I've been able to get the file:line for those, and the trace is kind
> > > > > of weird, I'm not sure I know what's going on TBH. It seems to me the
> > > > > backend instance got freed while being in the process of connecting.
> > > > >
> > > > > I've made some changes, that might mitigate this, but having not a
> > > > > clear understanding of what's going on makes this harder.
> > > > >
> > > > > I've pushed the changes to:
> > > > >
> > > > > http://xenbits.xen.org/gitweb/?p=people/royger/freebsd.git;a=shortlog;h=refs/heads/for-leak
> > > > >
> > > > > (This is on top of main branch).
> > > > >
> > > > > I'm also attaching the two patches on this email.
> > > > >
> > > > > Let me know if those make a difference to stabilize the system.
> > > >
> > > > Hi,
> > > >
> > > > Yes, it stabilizes the system, but there is still a memleak somewhere, i
> > > > think.
> > > >
> > > > System could run tests for approximately 41 hour, did not panic, but started
> > > > to OOM kill everything.
> > > >
> > > > I did not know how to git clone given commit, thus i just applied patches to
> > > > 13.0-RELEASE sources.
> > > >
> > > > Serial logs have nothing unusual, just that at some point OOM kill starts.
> > >
> > > Well, I think that's good^W better than before. Thanks again for all
> > > the testing.
> > >
> > > It might be helpful now to start dumping `vmstat -m` periodically
> > > while running the stress tests. As there are (hopefully) no more
> > > panics now vmstat might report us what subsystem is hogging the
> > > memory. It's possible it's blkback (again).
> > >
> > > Thanks, Roger.
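[The `vmstat -m` suggestion above can be scripted. A minimal sketch follows; the log path and the MemUse parsing are assumptions, based on FreeBSD's vmstat -m printing the MemUse column in kilobytes with a trailing `K`. Run it from cron or a loop, e.g. `while :; do sh vmstat-snap.sh; sleep 300; done`, during the stress test.]

```shell
#!/bin/sh
# Snapshot `vmstat -m` with a timestamp so the growing malloc type can be
# identified later. LOG is an arbitrary choice, pick anything persistent.
LOG=/tmp/vmstat-m.log

date -u '+=== %Y-%m-%dT%H:%M:%SZ ===' >> "$LOG"
# Skip the header row, strip the trailing K from the MemUse column, and
# list the biggest malloc types first so growth is easy to spot.
vmstat -m 2>/dev/null \
    | awk 'NR > 1 { gsub(/K$/, "", $3); print $3, $1 }' \
    | sort -rn | head -20 >> "$LOG"
```

[Diffing successive timestamped sections should then show which type keeps growing; in this thread blkback was the prime suspect.]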
> > Yes, it certainly is better. Applied patch on my pre-production server, have
> > not had any panic since then, still testing though.
> >
> > On my stressed lab server, it's a bit different story. On occasion i see a
> > panic with this trace on serial (can not reliably repeat, but sometimes upon
> > starting dom id 1 and 2, sometimes mid-stress-test, dom id > 95).
> >
> > panic: pmap_growkernel: no memory to grow kernel
> > cpuid = 2
> > time = 1649485133
> > KDB: stack backtrace:
> > #0 0xffffffff80c57385 at kdb_backtrace+0x65
> > #1 0xffffffff80c09d61 at vpanic+0x181
> > #2 0xffffffff80c09bd3 at panic+0x43
> > #3 0xffffffff81073eed at pmap_growkernel+0x27d
> > #4 0xffffffff80f2d918 at vm_map_insert+0x248
> > #5 0xffffffff80f30079 at vm_map_find+0x549
> > #6 0xffffffff80f2bda6 at kmem_init+0x226
> > #7 0xffffffff80c731a1 at vmem_xalloc+0xcb1
> > #8 0xffffffff80c72a9b at vmem_xalloc+0x5ab
> > #9 0xffffffff80c724a6 at vmem_alloc+0x46
> > #10 0xffffffff80f2ac6b at kva_alloc+0x2b
> > #11 0xffffffff8107f0eb at pmap_mapdev_attr+0x27b
> > #12 0xffffffff810588ca at nexus_add_irq+0x65a
> > #13 0xffffffff81058710 at nexus_add_irq+0x4a0
> > #14 0xffffffff810585b9 at nexus_add_irq+0x349
> > #15 0xffffffff80c495c1 at bus_alloc_resource+0xa1
> > #16 0xffffffff8105e940 at xenmem_free+0x1a0
> > #17 0xffffffff80a7e0dd at xbd_instance_create+0x943d
> >
> > | sed -Ee 's/^#[0-9]* //' -e 's/ .*//' | xargs addr2line -e
> > /usr/lib/debug/boot/kernel/kernel.debug
> > /usr/src/sys/kern/subr_kdb.c:443
> > /usr/src/sys/kern/kern_shutdown.c:0
> > /usr/src/sys/kern/kern_shutdown.c:843
> > /usr/src/sys/amd64/amd64/pmap.c:0
> > /usr/src/sys/vm/vm_map.c:0
> > /usr/src/sys/vm/vm_map.c:0
> > /usr/src/sys/vm/vm_kern.c:712
> > /usr/src/sys/kern/subr_vmem.c:928
> > /usr/src/sys/kern/subr_vmem.c:0
> > /usr/src/sys/kern/subr_vmem.c:1350
> > /usr/src/sys/vm/vm_kern.c:150
> > /usr/src/sys/amd64/amd64/pmap.c:0
> > /usr/src/sys/x86/x86/nexus.c:0
> > /usr/src/sys/x86/x86/nexus.c:449
> > /usr/src/sys/x86/x86/nexus.c:412
> > /usr/src/sys/kern/subr_bus.c:4620
> > /usr/src/sys/x86/xen/xenpv.c:123
> > /usr/src/sys/dev/xen/blkback/blkback.c:3010
> >
> > With gdb backtrace i think i can get a better trace though:
> > #0 __curthread at /usr/src/sys/amd64/include/pcpu_aux.h:55
> > #1 doadump at /usr/src/sys/kern/kern_shutdown.c:399
> > #2 kern_reboot at /usr/src/sys/kern/kern_shutdown.c:486
> > #3 vpanic at /usr/src/sys/kern/kern_shutdown.c:919
> > #4 panic at /usr/src/sys/kern/kern_shutdown.c:843
> > #5 pmap_growkernel at /usr/src/sys/amd64/amd64/pmap.c:208
> > #6 vm_map_insert at /usr/src/sys/vm/vm_map.c:1752
> > #7 vm_map_find at /usr/src/sys/vm/vm_map.c:2259
> > #8 kva_import at /usr/src/sys/vm/vm_kern.c:712
> > #9 vmem_import at /usr/src/sys/kern/subr_vmem.c:928
> > #10 vmem_try_fetch at /usr/src/sys/kern/subr_vmem.c:1049
> > #11 vmem_xalloc at /usr/src/sys/kern/subr_vmem.c:1449
> > #12 vmem_alloc at /usr/src/sys/kern/subr_vmem.c:1350
> > #13 kva_alloc at /usr/src/sys/vm/vm_kern.c:150
> > #14 pmap_mapdev_internal at /usr/src/sys/amd64/amd64/pmap.c:8974
> > #15 pmap_mapdev_attr at /usr/src/sys/amd64/amd64/pmap.c:8990
> > #16 nexus_map_resource at /usr/src/sys/x86/x86/nexus.c:523
> > #17 nexus_activate_resource at /usr/src/sys/x86/x86/nexus.c:448
> > #18 nexus_alloc_resource at /usr/src/sys/x86/x86/nexus.c:412
> > #19 BUS_ALLOC_RESOURCE at ./bus_if.h:321
> > #20 bus_alloc_resource at /usr/src/sys/kern/subr_bus.c:4617
> > #21 xenpv_alloc_physmem at /usr/src/sys/x86/xen/xenpv.c:121
> > #22 xbb_alloc_communication_mem at /usr/src/sys/dev/xen/blkback/blkback.c:3010
> > #23 xbb_connect at /usr/src/sys/dev/xen/blkback/blkback.c:3336
> > #24 xenbusb_back_otherend_changed at /usr/src/sys/xen/xenbus/xenbusb_back.c:228
> > #25 xenwatch_thread at /usr/src/sys/dev/xen/xenstore/xenstore.c:1003
> > #26 in fork_exit at /usr/src/sys/kern/kern_fork.c:1069
> > #27 <signal handler called>
> >
> > There is some sort of mismatch in info, because panic message printed
> > "panic: pmap_growkernel: no memory to grow kernel", but gdb backtrace in
> > #5 0xffffffff81073eed in pmap_growkernel at /usr/src/sys/amd64/amd64/pmap.c:208
> > leads to lines:
> > switch (pmap->pm_type) {
> > ..
> > panic("pmap_valid_bit: invalid pm_type %d", pmap->pm_type)
> >
> > So either trace is off the mark or message in serial logs. If this was only
> > memleak related, then it should not happen when dom id 1 is started, i
> > suppose.
>
> That's weird, I would rather trust the printed panic message rather
> than the symbol resolution. Seems to be a kind of memory exhaustion,
> as the kernel is failing to allocate a page for use in the kernel page
> table.
>
> I will try to see what can be done here.

I have a patch to disable the bounce buffering done in blkback
(attached). While I think it's not directly related to the panic you
are hitting, it's long time since we should have disabled that.

It should reduce the memory consumption by blkback greatly, so might
have the side effect of helping with your issue related to
pmap_growkernel. On my test box a single instance of blkback reduced
memory usage from ~100M to ~300K.

It should be applied on top of the other two patches.

Regards, Roger.
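[A note on the file:line mismatch discussed above: pmap.c:208 is inside pmap_valid_bit(), a static inline that pmap_growkernel() calls, and addr2line/gdb report the innermost inlined line by default. Asking addr2line for function names and the full inline chain usually resolves the apparent contradiction. A sketch, where backtrace.txt holds the raw serial-log frames and the debug-kernel path matches the one used earlier in the thread:]

```shell
#!/bin/sh
# Re-resolve a panic backtrace with function names (-f) and all inlined
# frames (-i), so a line belonging to an inlined helper (pmap_valid_bit)
# is shown nested under its caller (pmap_growkernel) instead of looking
# like the wrong function. The two sample frames are from the thread.
cat > backtrace.txt <<'EOF'
#0 0xffffffff80c57385 at kdb_backtrace+0x65
#3 0xffffffff81073eed at pmap_growkernel+0x27d
EOF

# Strip the frame number and everything after the address.
sed -Ee 's/^#[0-9]+ //' -e 's/ .*//' backtrace.txt > addrs.txt

# The debug kernel only exists on the machine that produced the panic,
# so resolve only when it is actually present.
KDBG=/usr/lib/debug/boot/kernel/kernel.debug
if [ -r "$KDBG" ]; then
    xargs addr2line -f -i -e "$KDBG" < addrs.txt
fi
```

[With -f -i, each address expands to one line per inline frame, innermost first, which is why a single return address can legitimately name both pmap_valid_bit and pmap_growkernel.]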