[Bug 227784] zfs: Fatal trap 9: general protection fault while in kernel mode on shutdown
bugzilla-noreply at freebsd.org
Fri Oct 19 22:22:36 UTC 2018
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=227784
Mark Johnston <markj at FreeBSD.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |allanjude at FreeBSD.org,
                   |                            |mav at FreeBSD.org
--- Comment #15 from Mark Johnston <markj at FreeBSD.org> ---
I took a look at a vmcore provided by wulf at . At the time of the panic, the
kernel was waiting for MOS dnode dbuf evictions to finish:
(kgdb) bt
#0 sched_switch (td=0xfffff800035d3000, newtd=0xfffff800035d2580,
flags=<optimized out>) at /usr/src/sys/kern/sched_ule.c:2112
#1 0xffffffff806a759f in mi_switch (flags=260, newtd=0x0) at
/usr/src/sys/kern/kern_synch.c:439
#2 0xffffffff806f0d8d in sleepq_switch (wchan=0xfffffe008dffe390, pri=0) at
/usr/src/sys/kern/subr_sleepqueue.c:613
#3 0xffffffff806f0c33 in sleepq_wait (wchan=0xfffffe008dffe390, pri=0) at
/usr/src/sys/kern/subr_sleepqueue.c:692
#4 0xffffffff806381f3 in _cv_wait (cvp=0xfffffe008dffe390, lock=<optimized
out>) at /usr/src/sys/kern/kern_condvar.c:146
#5 0xffffffff8039d5db in spa_evicting_os_wait (spa=<optimized out>)
at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c:1959
#6 0xffffffff8038ad9b in spa_deactivate (spa=0xfffffe008dffe000) at
/usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c:1272
#7 0xffffffff80393b88 in spa_evict_all () at
/usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c:8350
#8 0xffffffff8039dade in spa_fini () at
/usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c:2141
#9 0xffffffff803e6bdc in zfs__fini () at
/usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c:7109
#10 0xffffffff8069bf86 in kern_reboot (howto=16392) at
/usr/src/sys/kern/kern_shutdown.c:443
#11 0xffffffff8069bb4a in sys_reboot (td=<optimized out>,
uap=0xfffff800035d33c0) at /usr/src/sys/kern/kern_shutdown.c:280
At this point, the spa_unload() call preceding the spa_deactivate() call had
already freed the pool. However, dsl_pool_close() calls
dmu_buf_user_evict_wait() after kicking off evictions of top-level directories:
    /*
     * Drop our references from dsl_pool_open().
     *
     * Since we held the origin_snap from "syncing" context (which
     * includes pool-opening context), it actually only got a "ref"
     * and not a hold, so just drop that here.
     */
    if (dp->dp_origin_snap != NULL)
        dsl_dataset_rele(dp->dp_origin_snap, dp);
    if (dp->dp_mos_dir != NULL)
        dsl_dir_rele(dp->dp_mos_dir, dp);
    if (dp->dp_free_dir != NULL)
        dsl_dir_rele(dp->dp_free_dir, dp);
    if (dp->dp_leak_dir != NULL)
        dsl_dir_rele(dp->dp_leak_dir, dp);
    if (dp->dp_root_dir != NULL)
        dsl_dir_rele(dp->dp_root_dir, dp);
    ...
    dmu_buf_user_evict_wait();
Looking a bit at the dbuf:
(kgdb) frame 12
#12 0xffffffff8036221c in dsl_dir_evict_async (dbu=0xfffff800053da400)
at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dir.c:158
158 spa_async_close(dd->dd_pool->dp_spa, dd);
(kgdb) p dd->dd_myname
$42 = "$ORIGIN", '\000' <repeats 248 times>
(kgdb) p dd->dd_parent->dd_myname
$43 = "u01", '\000' <repeats 252 times>
I'm not sure what $ORIGIN is; I guess it's some ZFS metadata.
I looked at taskq_wait() in FreeBSD vs. illumos. On FreeBSD it will only
wait for currently queued tasks to finish; anything enqueued after the drain
starts may not be finished by the time we return. On illumos it looks like
taskq_wait() will wait until the queue is completely empty. So, if the async
evictions queue some additional evictions, on FreeBSD we won't recursively
wait, and the taskq_wait() will return early. I can't tell whether ZFS relies
on the illumos behaviour, though.
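
To make the difference between the two drain behaviours concrete, here is a
toy, single-threaded model in C. It is only a sketch: it is not the FreeBSD or
illumos taskq code, and every name in it (toyq, toyq_wait_snapshot,
toyq_wait_empty, toy_evict) is hypothetical. The "snapshot" wait runs only the
tasks that were queued when the wait began; the "empty" wait keeps going until
nothing is pending, including tasks queued by other tasks.

```c
#include <assert.h>
#include <stddef.h>

#define TOYQ_CAP 16

struct toyq;
typedef void (*toy_fn)(struct toyq *, int);

struct toy_task {
    toy_fn fn;
    int arg;
};

/* Hypothetical toy queue: pending tasks occupy tasks[head .. tail). */
struct toyq {
    struct toy_task tasks[TOYQ_CAP];
    size_t head;
    size_t tail;
    int completed;      /* number of tasks that have run */
};

static void
toyq_enqueue(struct toyq *q, toy_fn fn, int arg)
{
    assert(q->tail < TOYQ_CAP);
    q->tasks[q->tail].fn = fn;
    q->tasks[q->tail].arg = arg;
    q->tail++;
}

static void
toyq_run_one(struct toyq *q)
{
    struct toy_task t = q->tasks[q->head++];

    t.fn(q, t.arg);     /* the task may enqueue more work */
    q->completed++;
}

/*
 * Wait only for tasks that were queued when the wait started.
 * Returns the number of tasks still pending afterwards.
 */
static size_t
toyq_wait_snapshot(struct toyq *q)
{
    size_t stop = q->tail;

    while (q->head < stop)
        toyq_run_one(q);
    return (q->tail - q->head);
}

/*
 * Wait until the queue is completely empty, including any tasks
 * enqueued while draining. Returns the number still pending (0).
 */
static size_t
toyq_wait_empty(struct toyq *q)
{
    while (q->head < q->tail)
        toyq_run_one(q);
    return (q->tail - q->head);
}

/* Stand-in for an async eviction that queues a child eviction. */
static void
toy_evict(struct toyq *q, int depth)
{
    if (depth > 0)
        toyq_enqueue(q, toy_evict, depth - 1);
}
```

If a single toy_evict task with depth 2 is queued, toyq_wait_snapshot() returns
with one child eviction still pending, while toyq_wait_empty() returns only
after all three evictions have run. Code that goes on to free shared state after
the first kind of wait would then race with the leftover task.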