ZFS hang

Matt Burke mattblists at icritical.com
Fri Dec 7 11:00:20 UTC 2012


Hi.

I've just experienced a filesystem hang on 9.1-PRE from Sept 17th
while hitting the disks hard (io-bound database workload).

The disks are:
  16 Spinning SAS disks (ZFS vdevs), hung off a pair of Areca 1320 HBAs
  4 SATA SSDs (L2ARC), hung off onboard ICH (SATA3) and onboard C600
  2 SATA SSDs (gmirror swap, zfs root fs (mirrored), main array ZIL)
    connected the same as the other SSDs

Symptoms are any filesystem activity (login, starting gstat, tab
completion) is hanging, with processes either in state 'zfs' or
'zio->io_cv)'. CPUs are idle.

'zpool status -v' doesn't hang. All disks online, no errors.

In the debugger, I got the following:

  118     0     0     0  DL      ipmireq  0xfffffe0038bfcbb8 [ipmi0: kcs]
  108     0     0     0  DL      crypto_r 0xffffffff81a31900 [crypto returns]
  107     0     0     0  DL      crypto_w 0xffffffff81a318c0 [crypto]
   19     0     0     0  DL      m:w1     0xfffffe0038a79400 [g_mirror swap]
   18     0     0     0  DL      sdflush  0xffffffff81337c58 [softdepflush]
   17     0     0     0  DL      zio->io_ 0xfffffe07d9633d70 [syncer]
   16     0     0     0  DL      vlruwt   0xfffffe00389d1000 [vnlru]
    9     0     0     0  DL      psleep   0xffffffff8132bf48 [bufdaemon]
    8     0     0     0  DL      pgzero   0xffffffff81341c9c [pagezero]
    7     0     0     0  DL      psleep   0xffffffff81340e68 [vmdaemon]
    6     0     0     0  DL      psleep   0xffffffff81340e2c [pagedaemon]
    5     0     0     0  DL      ccb_scan 0xffffffff812988e0 [xpt_thrd]
    4     0     0     0  DL      waiting_ 0xffffffff81331f60 [sctp_iterator]
    3     0     0     0  DL      (threaded)                  [zfskern]
100740                   D       zio->io_ 0xfffffe1f3e786a00 [txg_thread_enter]
100739                   D       tx->tx_q 0xfffffe0038cd0230 [txg_thread_enter]
100409                   D       zio->io_ 0xfffffe1f40f02a00 [txg_thread_enter]
100408                   D       tx->tx_q 0xfffffe0038d43e30 [txg_thread_enter]
100097                   D       zio->io_ 0xfffffe1e6412da00 [l2arc_feed_thread]
100096                   D       arc_recl 0xffffffff817d4ca0 [arc_reclaim_thread]
   15     0     0     0  DL      (threaded)                  [usb]
100083                   D       -        0xffffff8001f5de18 [usbus1]
100082                   D       -        0xffffff8001f5ddc0 [usbus1]
100081                   D       -        0xffffff8001f5dd68 [usbus1]
100080                   D       -        0xffffff8001f5dd10 [usbus1]
100078                   D       -        0xffffff8001f54e18 [usbus0]
100077                   D       -        0xffffff8001f54dc0 [usbus0]
100076                   D       -        0xffffff8001f54d68 [usbus0]
100075                   D       -        0xffffff8001f54d10 [usbus0]
    2     0     0     0  DL      ctl_work 0xffffff8000a6a000 [ctl_thrd]
   14     0     0     0  DL      -        0xffffffff812dd7c4 [yarrow]
   13     0     0     0  DL      (threaded)                  [geom]
100023                   D       -        0xffffffff812d6b90 [g_down]
100022                   D       -        0xffffffff812d6b88 [g_up]
100021                   D       -        0xffffffff812d6b78 [g_event]
   12     0     0     0  WL      (threaded)                  [intr]

Everything else looked 'normal' (piles of stuck cron/sendmail processes,
etc)


66608 66605 66608     0  DVs     zfs      0xfffffe005922c638 cron
db> bt 66608
Tracing pid 66608 tid 108987 td 0xfffffe164ac67000
sched_switch() at sched_switch+0x13d
mi_switch() at mi_switch+0x196
sleepq_wait() at sleepq_wait+0x42
__lockmgr_args() at __lockmgr_args+0x610
vop_stdlock() at vop_stdlock+0x39
VOP_LOCK1_APV() at VOP_LOCK1_APV+0x52
_vn_lock() at _vn_lock+0x47
vget() at vget+0x70
cache_lookup_times() at cache_lookup_times+0x5f5
vfs_cache_lookup() at vfs_cache_lookup+0xc8
VOP_LOOKUP_APV() at VOP_LOOKUP_APV+0x4c
lookup() at lookup+0x424   
namei() at namei+0x535
vn_open_cred() at vn_open_cred+0x3cb
kern_openat() at kern_openat+0x1f9
amd64_syscall() at amd64_syscall+0x5ea
Xfast_syscall() at Xfast_syscall+0xf7
--- syscall (5, FreeBSD ELF64, sys_open), rip = 0x800d3b0ac, rsp =
--- 0x7fffffffc878, rbp = 0x7fffffffc9c0 ---

(gstat, attempted to run after hang)
66578 11191 66578     0  D+      zio->io_ 0xfffffe1f3a16ad70 gstat
Tracing pid 66578 tid 109053 td 0xfffffe07e012e000
sched_switch() at sched_switch+0x13d 
mi_switch() at mi_switch+0x196
sleepq_wait() at sleepq_wait+0x42
_cv_wait() at _cv_wait+0x121
zio_wait() at zio_wait+0x61
dbuf_read() at dbuf_read+0x5e5
dbuf_findbp() at dbuf_findbp+0x107
dbuf_hold_impl() at dbuf_hold_impl+0xc2
dbuf_hold() at dbuf_hold+0x1b
dmu_buf_hold_array_by_dnode() at dmu_buf_hold_array_by_dnode+0x147
dmu_buf_hold_array() at dmu_buf_hold_array+0x67
dmu_read_uio() at dmu_read_uio+0x3f
zfs_freebsd_read() at zfs_freebsd_read+0x3e3
VOP_READ_APV() at VOP_READ_APV+0x44
vn_read() at vn_read+0x316
dofileread() at dofileread+0xa1
kern_readv() at kern_readv+0x6c
sys_read() at sys_read+0x64
amd64_syscall() at amd64_syscall+0x5ea
Xfast_syscall() at Xfast_syscall+0xf7
--- syscall (3, FreeBSD ELF64, sys_read), rip = 0x8017d1b4c, rsp =
--- 0x7fffffffc488, rbp = 0x7fffffffc5d0 ---

   17     0     0     0  DL      zio->io_ 0xfffffe07d9633d70 [syncer]
Tracing pid 17 tid 100105 td 0xfffffe00389ba000
sched_switch() at sched_switch+0x13d
mi_switch() at mi_switch+0x196 
sleepq_wait() at sleepq_wait+0x42
_cv_wait() at _cv_wait+0x121
zio_wait() at zio_wait+0x61
dmu_buf_hold_array_by_dnode() at dmu_buf_hold_array_by_dnode+0x22b
dmu_read() at dmu_read+0x89
space_map_load() at space_map_load+0x108
metaslab_activate() at metaslab_activate+0xdc
metaslab_alloc() at metaslab_alloc+0x7b2
zio_alloc_zil() at zio_alloc_zil+0x61
zil_lwb_write_start() at zil_lwb_write_start+0x1ad
zil_commit() at zil_commit+0x24a
zfs_sync() at zfs_sync+0xb3
sync_fsync() at sync_fsync+0x19d
VOP_FSYNC_APV() at VOP_FSYNC_APV+0x4a
sync_vnode() at sync_vnode+0x15e
sched_sync() at sched_sync+0x1c5
fork_exit() at fork_exit+0x11f
fork_trampoline() at fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xffffffa3008acbb0, rbp = 0 ---

    3     0     0     0  DL      (threaded)                  [zfskern]
Tracing pid 3 tid 100740 td 0xfffffe0038c858e0
sched_switch() at sched_switch+0x13d
mi_switch() at mi_switch+0x196
sleepq_wait() at sleepq_wait+0x42
_cv_wait() at _cv_wait+0x121
zio_wait() at zio_wait+0x61
dsl_pool_sync() at dsl_pool_sync+0xe0
spa_sync() at spa_sync+0x336
txg_sync_thread() at txg_sync_thread+0x139
fork_exit() at fork_exit+0x11f
fork_trampoline() at fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xffffffa303242bb0, rbp = 0 ---

100740                   D       zio->io_ 0xfffffe1f3e786a00 [txg_thread_enter]
Tracing pid 3 tid 100740 td 0xfffffe0038c858e0
sched_switch() at sched_switch+0x13d
mi_switch() at mi_switch+0x196
sleepq_wait() at sleepq_wait+0x42
_cv_wait() at _cv_wait+0x121
zio_wait() at zio_wait+0x61
dsl_pool_sync() at dsl_pool_sync+0xe0
spa_sync() at spa_sync+0x336
txg_sync_thread() at txg_sync_thread+0x139
fork_exit() at fork_exit+0x11f
fork_trampoline() at fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xffffffa303242bb0, rbp = 0 ---

100739                   D       tx->tx_q 0xfffffe0038cd0230 [txg_thread_enter]
Tracing pid 3 tid 100739 td 0xfffffe0038c86000
sched_switch() at sched_switch+0x13d
mi_switch() at mi_switch+0x196
sleepq_wait() at sleepq_wait+0x42
_cv_wait() at _cv_wait+0x121
txg_thread_wait() at txg_thread_wait+0x79
txg_quiesce_thread() at txg_quiesce_thread+0xb5
fork_exit() at fork_exit+0x11f
fork_trampoline() at fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xffffffa303239bb0, rbp = 0 ---

100409                   D       zio->io_ 0xfffffe1f40f02a00 [txg_thread_enter]
db> bt 100409
Tracing pid 3 tid 100409 td 0xfffffe00389b9000
sched_switch() at sched_switch+0x13d
mi_switch() at mi_switch+0x196
sleepq_wait() at sleepq_wait+0x42
_cv_wait() at _cv_wait+0x121
zio_wait() at zio_wait+0x61
dsl_pool_sync() at dsl_pool_sync+0xe0
spa_sync() at spa_sync+0x336
txg_sync_thread() at txg_sync_thread+0x139
fork_exit() at fork_exit+0x11f  
fork_trampoline() at fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xffffffa302eb5bb0, rbp = 0 ---

100408                   D       tx->tx_q 0xfffffe0038d43e30 [txg_thread_enter]
Tracing pid 3 tid 100408 td 0xfffffe0038c778e0
sched_switch() at sched_switch+0x13d
mi_switch() at mi_switch+0x196
sleepq_wait() at sleepq_wait+0x42
_cv_wait() at _cv_wait+0x121
txg_thread_wait() at txg_thread_wait+0x79
txg_quiesce_thread() at txg_quiesce_thread+0xb5
fork_exit() at fork_exit+0x11f
fork_trampoline() at fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xffffffa302eacbb0, rbp = 0 ---

100097                   D       zio->io_ 0xfffffe1e6412da00 [l2arc_feed_thread]
Tracing pid 3 tid 100097 td 0xfffffe0036c4d000
sched_switch() at sched_switch+0x13d
mi_switch() at mi_switch+0x196
sleepq_wait() at sleepq_wait+0x42
_cv_wait() at _cv_wait+0x121
zio_wait() at zio_wait+0x61
l2arc_feed_thread() at l2arc_feed_thread+0x981
fork_exit() at fork_exit+0x11f
fork_trampoline() at fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xffffffa30075fbb0, rbp = 0 ---

100096                   D       arc_recl 0xffffffff817d4ca0 [arc_reclaim_thread]
Tracing pid 3 tid 100096 td 0xfffffe0036c4d470 
sched_switch() at sched_switch+0x13d
mi_switch() at mi_switch+0x196
sleepq_timedwait() at sleepq_timedwait+0x42
_cv_timedwait() at _cv_timedwait+0x13c
arc_reclaim_thread() at arc_reclaim_thread+0x29d
fork_exit() at fork_exit+0x11f
fork_trampoline() at fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xffffffa30075abb0, rbp = 0 ---


Does anyone either recognise what's going on (hopefully be able to
say this was fixed in r....), or could someone give me an indication
of how better to diagnose this if it happens again?

I can set a dumpdev, but this machine's got 128G RAM and precious 
little free memory (DB server), so I doubt anyone would want to
touch it!

Thanks

-- 
Sorry for the below...
The information contained in this message is confidential and intended for the addressee only. If you have received this message in error, or there are any problems with its content, please contact the sender. 

iCritical is a trading name of Critical Software Ltd. Registered in England: 04909220.
Registered Office: IC2, Keele Science Park, Keele, Staffordshire, ST5 5NH.

This message has been scanned for security threats by iCritical. www.icritical.com



More information about the freebsd-fs mailing list