[zfs][panic] zio_free_issue at zfs mount of child dataset (ddt_phys_decref?)
Rob VanHooren
rob.vanhooren at gmail.com
Sun Nov 20 16:20:31 UTC 2011
Hello.
Looking for some input on solving a repeating panic when mounting one of a raidz3 pool's child datasets (zfs v28).
This is amd64 8.2-STABLE running a freshly csup'd kernel build.
System details (obfuscated *.confs can be shared if required):
Xeon E5620, 24GB ECC.
2x HPT RR2720 SAS2 controllers
2x LSI 9211-8i SAS2 controllers (using LSI's driver, not mps)
1x HPT RR640 SATA3 controller
HDDs are HDS 5K3000 (raidz3)
SSDs are OCZ Velocity3 (mirrored ZIL, L2ARC)
This pool is built on top of individual GELI providers (so any *Solaris pool hackery won't be of use); rough layout below.
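For context, the layering underneath is roughly as follows, with key paths and device names genericized (the obfuscated confs have the real details):
bsd# geli init -s 4096 -K /path/to/keys/da0.key /dev/da0
bsd# geli attach -k /path/to/keys/da0.key /dev/da0
(... repeated for each spindle ...)
bsd# zpool create thePOOL raidz3 da0.eli da1.eli ... daN.eli
bsd# zpool add thePOOL log mirror <ssd-log-a> <ssd-log-b> cache <ssd-cache>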
It appears to import fine, zpool status -v is clean, and zdb (and zdb -cv) don't spout any complaints ...
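For reference, the checks amounted to roughly this (zdb -cv takes its time on a pool this size):
bsd# zpool import thePOOL
bsd# zpool status -v thePOOL
bsd# zdb thePOOL
bsd# zdb -cv thePOOL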
Mounting the parent works OK:
bsd# zfs mount thePOOL
Mounting some children works OK:
bsd# zfs mount thePOOL/dataset-foo
bsd# zfs mount thePOOL/dataset-bar
bsd# zfs mount thePOOL/dataset-baz
However while attempting to mount one in particular ... kaboom!
bsd# zfs mount thePOOL/dataset-xyzzy
(order doesn't seem to matter, xyzzy is always the problem child...)
System immediately (and repeatably) panics.
Transcribed from camera shot:
Fatal trap 12: page fault while in kernel mode
cpuid = 2; apic id = 02
fault virtual address = 0x30
fault code = supervisor write data, page not present
instruction pointer = 0x20:0xffffffff80dc48b1
stack pointer = 0x28:0xffffff89190d8b00
frame pointer = 0x28:0xffffff89190d8b30
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 0 (zio_free_issue_1)
Stopped at ddt_phys_decref+0x1: subq $0x1,0x30(%rdi)
db> where
Tracing pid 0 tid 100356 td 0xffffff001b8a3000
ddt_phys_decref() at zio_execute+0xc3
taskqueue_run_locked() at taskqueue_run_locked+0x93
taskqueue_thread_loop() at taskqueue_thread_loop+0x3f
fork_exit() at fork_exit+0x135
fork_trampoline() at fork_trampoline+0xc
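Since ddt_phys_decref() points straight at the dedup code, I can post the dedup-related bits too if that helps, e.g.:
bsd# zfs get -r dedup,compression thePOOL
bsd# zpool get dedupratio thePOOL
bsd# zdb -DD thePOOL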
An alltrace snippet from ddb:
Tracing command zfs pid 117 tid 100296 td 0xffffff001b28a460
sched_switch() at ... {truncated}
mi_switch() at ...
sleepq_switch() at ...
sleepq_wait() at ...
_cv_wait() ...
zio_wait() ...
arc_read_nolock() ...
zil_read_log_data() ...
zil_replay_log_record() ...
zil_parse() ...
zil_replay() ...
zfsvfs_setup() ...
zfs_mount() ...
vfs_domount() ...
nmount() ...
amd64_syscall() ...
Xfast_syscall() ...
--- syscall (378, FreeBSD ELF64, nmount) ...
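All of the above is hand-transcribed from the console; next time it panics I'll try to capture a proper crash dump instead. Rough plan, going by the handbook recipe as I remember it (the swap device name is a placeholder, and this assumes swap isn't itself sitting on GELI):
bsd# dumpon /dev/gpt/swap0
(then, at the ddb prompt after the next panic:)
db> call doadump
db> reset
(and after reboot:)
bsd# savecore /var/crash /dev/gpt/swap0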
Haven't found any logged PRs that might match, and some hours of google-fu haven't led to enlightenment :-(
Would appreciate your thoughts on this
a) prior to opening a new PR, and
b) how to approach recovery (~4TB of data on this dataset, about 10% of the pool); the one idea I have so far is sketched below.
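The one idea I have so far for (b): both traces suggest the panic fires while the intent log for xyzzy is being dealt with during the read-write mount, so a pool-level read-only import might let me mount the dataset and copy the ~4TB off before trying anything destructive (assuming the v28 bits on this box support readonly=on, and assuming replay really is skipped when the pool isn't writable). Untested on my side, so please shoot holes in it:
bsd# zpool export thePOOL
(re-attach the GELIs if they detached on last close)
bsd# zpool import -o readonly=on thePOOL
bsd# zfs mount thePOOL/dataset-xyzzy
(then copy the data elsewhere with rsync / zfs send)
I also seem to remember a vfs.zfs.zil_replay_disable sysctl, but I have no idea whether discarding that log is sane here (or whether it even avoids this code path), so I'd rather hear opinions first.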
TIA for your effort,
R.