Re: ZFS + FreeBSD XEN dom0 panic

From: Ze Dupsys <zedupsys_at_gmail.com>
Date: Thu, 14 Apr 2022 07:20:25 UTC
On 2022.04.05. 18:22, Roger Pau Monné wrote:
> I've pushed the changes to:
> 
> http://xenbits.xen.org/gitweb/?p=people/royger/freebsd.git;a=shortlog;h=refs/heads/for-leak
> 
> (This is on top of main branch).
> 
> I'm also attaching the two patches on this email.
> 
> Let me know if those make a difference to stabilize the system.
> 

I do not know whether I should start a new thread, but I have captured
another panic with a new trace. This is on a different machine with a
similar setup: 13.0-RELEASE plus the two patches mentioned above.

I do not know how to reproduce it reliably, nor what causes it. But I
suspect it happens after some of the following steps: create a new ZVOL,
turn one VM off, add the new HDD/ZVOL path to the VM in its cfg file,
start the VM back up, and put some load on the newly added HDD inside
that VM (install stuff, extract data, etc.), plus one of: shut all VMs
down one by one and then do init 0 or 6, or create another new VM. I
can't experiment much on this machine, and no serial output is available
either.


Fatal trap 12: page fault while in kernel mode
cpuid = 3; apic id = 06
fault virtual address   = 0x68
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff821dc99d
stack pointer           = 0x28:0xfffffe00c6b497d0
frame pointer           = 0x28:0xfffffe00c6b49870
code segment            = base 0x0, limit 0xfffff, type 0x1b
                         = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 0 (xbbd26 taskq)
trap number             = 12
panic: page fault
cpuid = 3
time = 1649915274
KDB: stack backtrace:
#0 0xffffffff80c57385 at kdb_backtrace+0x65
#1 0xffffffff80c09d61 at vpanic+0x181
#2 0xffffffff80c09bd3 at panic+0x43
#3 0xffffffff8108b187 at trap+0xbc7
#4 0xffffffff8108b1df at trap+0xc1f
#5 0xffffffff8108a83d at trap+0x27d
#6 0xffffffff81061818 at calltrap+0x8
#7 0xffffffff821c035a at dmu_read+0x2a
#8 0xffffffff8218da3a at zvol_geom_bio_strategy+0x2aa
#9 0xffffffff80a7f074 at xbd_instance_create+0xa3d4
#10 0xffffffff80a7b00a at xbd_instance_create+0x636a
#11 0xffffffff80c6b021 at taskqueue_run+0x2a1
#12 0xffffffff80c6c33c at taskqueue_thread_loop+0xac
#13 0xffffffff80bc7c9e at fork_exit+0x7e
#14 0xffffffff8106289e at fork_trampoline+0xe
Uptime: 24m0s
(ada0:ahcich0:0:0:0): spin-down
(ada1:ahcich1:0:0:0): spin-down
(ada2:ahcich2:0:0:0): spin-down
Dumping 2922 out of 6104



cat panic.log | sed -Ee 's/^#[0-9]* //' -e 's/ .*//' | \
    xargs addr2line -e /usr/lib/debug/boot/kernel/kernel.debug
/usr/src/sys/kern/subr_bus.c:2410
/usr/src/sys/kern/kern_racct.c:632
/usr/src/sys/kern/kern_racct.c:617
/usr/src/sys/dev/isci/isci_sysctl.c:92
/usr/src/sys/dev/isci/isci_sysctl.c:0
/usr/src/sys/dev/isci/isci_oem_parameters.c:130
/usr/src/sys/dev/hyperv/input/hv_kbd.c:540
??:0
??:0
/usr/src/sys/dev/xen/blkback/blkback.c:3083
/usr/src/sys/xen/xenbus/xenbusvar.h:96
/usr/src/sys/kern/subr_kobj.c:145
/usr/src/sys/kern/subr_module.c:255
/usr/src/sys/kern/kern_event.c:0
/usr/src/sys/dev/hyperv/pcib/vmbus_pcib.c:1158
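
The sed expression above would also pass any non-frame lines through to
addr2line if panic.log contained them (the trap header, "Uptime:", and so
on). A minimal sketch of a stricter filter, assuming the frames keep the
"#N 0xADDR at symbol+offset" shape of the KDB backtrace; the function name
frame_addrs is made up for illustration:

```sh
#!/bin/sh
# Print only the return addresses of KDB backtrace frames.
# A frame line is assumed to look like: "#7 0xffffffff821c035a at dmu_read+0x2a".
# Every other line (trap header, "Uptime:", spin-down messages) is dropped.
frame_addrs() {
    sed -nE 's/^#[0-9]+ (0x[0-9a-f]+) at .*/\1/p'
}

# Possible usage, with the same file names as in the command above:
# frame_addrs < panic.log | xargs addr2line -e /usr/lib/debug/boot/kernel/kernel.debug
```

This only narrows what is handed to addr2line; it does not change how the
addresses themselves are resolved.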


Full output of (kgdb) backtrace
#0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
#1  doadump (textdump=<optimized out>) at /usr/src/sys/kern/kern_shutdown.c:399
#2  0xffffffff80c09956 in kern_reboot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:486
#3  0xffffffff80c09dd0 in vpanic (fmt=<optimized out>, ap=<optimized out>) at /usr/src/sys/kern/kern_shutdown.c:919
#4  0xffffffff80c09bd3 in panic (fmt=<unavailable>) at /usr/src/sys/kern/kern_shutdown.c:843
#5  0xffffffff8108b187 in trap_fatal (frame=0xfffffe00c6b49710, eva=104) at /usr/src/sys/amd64/amd64/trap.c:915
#6  0xffffffff8108b1df in trap_pfault (frame=frame@entry=0xfffffe00c6b49710, usermode=false, signo=<optimized out>, signo@entry=0x0, ucode=<optimized out>, ucode@entry=0x0) at /usr/src/sys/amd64/amd64/trap.c:732
#7  0xffffffff8108a83d in trap (frame=0xfffffe00c6b49710) at /usr/src/sys/amd64/amd64/trap.c:398
#8  <signal handler called>
#9  0xffffffff821dc99d in dbuf_write_children_ready (zio=<optimized out>, buf=<optimized out>, vdb=0x0) at /usr/src/sys/contrib/openzfs/module/zfs/dbuf.c:4642
#10 0xffffffff821c035a in arc_evict_impl (state=<optimized out>, spa=<optimized out>, bytes=<optimized out>, type=<optimized out>) at /usr/src/sys/contrib/openzfs/module/zfs/arc.c:4377
#11 arc_evict_meta_balanced (meta_used=<optimized out>) at /usr/src/sys/contrib/openzfs/module/zfs/arc.c:4443
#12 arc_evict_meta (meta_used=<optimized out>) at /usr/src/sys/contrib/openzfs/module/zfs/arc.c:4533
#13 arc_evict () at /usr/src/sys/contrib/openzfs/module/zfs/arc.c:4627
#14 arc_evict_cb (arg=<optimized out>, zthr=<optimized out>) at /usr/src/sys/contrib/openzfs/module/zfs/arc.c:4938
#15 0xffffffff8218da3a in zfs_deleteextattr (ap=0x1430f6000) at /usr/src/sys/contrib/openzfs/module/os/freebsd/zfs/zfs_vnops_os.c:5592
#16 0xffffffff80a7f074 in xbb_dispatch_dev (xbb=0xfffff8011a6ff800, reqlist=<optimized out>, operation=<optimized out>, bio_flags=0) at /usr/src/sys/dev/xen/blkback/blkback.c:2207
#17 0xffffffff80a7b00a in xbb_dispatch_io (xbb=0xfffff8011a6ff800, reqlist=<optimized out>) at /usr/src/sys/dev/xen/blkback/blkback.c:1767
#18 xbb_run_queue (context=0xfffff8011a6ff800, pending=<optimized out>) at /usr/src/sys/dev/xen/blkback/blkback.c:1987
#19 0xffffffff80c6b021 in taskqueue_run_locked (queue=queue@entry=0xfffff8011a9f1e00) at /usr/src/sys/kern/subr_taskqueue.c:476
#20 0xffffffff80c6c33c in taskqueue_thread_loop (arg=<optimized out>, arg@entry=0xfffff8011a6ff800) at /usr/src/sys/kern/subr_taskqueue.c:793
#21 0xffffffff80bc7c9e in fork_exit (callout=0xffffffff80c6c290 <taskqueue_thread_loop>, arg=0xfffff8011a6ff800, frame=0xfffffe00c6b49c00) at /usr/src/sys/kern/kern_fork.c:1069
#22 <signal handler called>

If it would be better to strip the function arguments from the kgdb
backtrace next time, let me know. The blkback lines don't match again,
but the kgdb output seems more meaningful.

Thanks.