ZFS hangs very often
Gary Jennejohn
gary.jennejohn at freenet.de
Mon Mar 31 06:22:56 PDT 2008
On Mon, 31 Mar 2008 13:51:07 +0200
Attila Nagy <bra at fsn.hu> wrote:
> Hello,
>
> On my desktop machine I use a ZFS pool for everything but the swap and
> the root fs (so /usr, /tmp, and everything else is on ZFS; swap and / are
> each on a gmirror of two partitions).
>
> The first setup was a FreeBSD/i386 7-STABLE, and the pool consisted of
> two partitions from two SATA disks, which were encrypted with GELI
> individually.
>
> After using the system for some weeks (without any particularly high I/O
> activity, just everyday work on the machine), the first hang came: I
> couldn't move the mouse under X, but remote sessions were alive and the
> clock app still counted the time fine. I couldn't log into the machine
> via ssh: the port was open, but I never got the banner.
> I was running a portupgrade at the time.
>
> After this (and several other) hangs, I decided to remove GELI from the
> equation, without success. Then I tried one partition (one disk) instead
> of two. Now I am running amd64 instead of i386, and the problem still
> persists.
>
> I've attached my notebook to the machine, and here is what I see during
> the hang (right now I am upgrading some ports; a configure script is
> trying to run, but the machine has stopped):
> KDB: enter: manual escape to debugger
> [thread pid 23 tid 100022 ]
> Stopped at kdb_enter+0x31: leave
> db> bt
> Tracing pid 23 tid 100022 td 0xffffff000127c350
> kdb_enter() at kdb_enter+0x31
> scgetc() at scgetc+0x461
> sckbdevent() at sckbdevent+0xa4
> kbdmux_intr() at kbdmux_intr+0x43
> kbdmux_kbd_intr() at kbdmux_kbd_intr+0x20
> taskqueue_run() at taskqueue_run+0x9f
> ithread_loop() at ithread_loop+0x180
> fork_exit() at fork_exit+0x11f
> fork_trampoline() at fork_trampoline+0xe
> --- trap 0, rip = 0, rsp = 0xffffffffb4f04d30, rbp = 0 ---
> db> ps
> pid ppid pgrp uid state wmesg wchan cmd
> 77873 77871 76757 0 S+ piperd 0xffffff0010036ba0 as
> 77872 77871 76757 0 S+ zfs:(&zi 0xffffff0020ba0298 cc1plus
> 77871 77870 76757 0 S+ wait 0xffffff0001446000 c++
> 77870 77321 76757 0 S+ wait 0xffffff000eba7468 sh
> 77321 76882 76757 0 S+ wait 0xffffff00396448d0 sh
> 76882 76881 76757 0 S+ wait 0xffffff000eba5468 sh
> 76881 76757 76757 0 S+ wait 0xffffff00014b28d0 sh
> 76757 76755 76757 0 Ss+ wait 0xffffff001b369000 make
> 76755 62725 62725 0 S+ select 0xffffffff80a89d50 script
> 62725 817 62725 0 S+ wait 0xffffff0001f43000 initial thread
> 86757 86674 86757 1001 SL+ pfault 0xffffffff80a9a79c ssh
> 86674 86672 86674 1001 Ss+ wait 0xffffff0001f428d0 bash
> 86672 86670 86670 1001 S select 0xffffffff80a89d50 sshd
> 86670 721 86670 0 Ss sbwait 0xffffff0001d2015c sshd
> 62788 802 62788 0 S+ ttyin 0xffffff00013bac10 csh
> 46310 801 46310 0 ?+ csh
> 817 800 817 0 S+ pause 0xffffff0001f420c0 csh
> 807 1 807 0 Ss+ ttyin 0xffffff000139e810 getty
> 806 1 806 0 Ss+ ttyin 0xffffff00013bb410 getty
> 805 1 805 0 Ss+ ttyin 0xffffff00013ba810 getty
> 804 1 804 0 Ss+ ttyin 0xffffff00013ba010 getty
> 803 1 803 0 Ss+ ttyin 0xffffff00013b9410 getty
> 802 1 802 0 Ss+ wait 0xffffff00014b08d0 login
> 801 1 801 0 Ss+ wait 0xffffff0001567468 login
> 800 1 800 0 Ss+ wait 0xffffff00014b0468 login
> 737 1 737 0 ?s cron
> 731 1 731 25 Ss pause 0xffffff00015650c0 sendmail
> 727 1 727 0 ?s sendmail
> 721 1 721 0 Ss select 0xffffffff80a89d50 sshd
> 688 687 687 123 ? ntpd
> 687 1 687 0 Ss select 0xffffffff80a89d50 ntpd
> 668 1 668 0 Ss select 0xffffffff80a89d50 powerd
> 559 1 559 0 ?s syslogd
> 488 1 488 0 Ss select 0xffffffff80a89d50 devd
> 440 1 440 0 Ss select 0xffffffff80a89d50 moused
> 248 1 248 0 Ss pause 0xffffff00015640c0 adjkerntz
> 175 0 0 0 SL zfs:(&tq 0xffffff0001583080 [zil_clean]
> 174 0 0 0 SL zfs:(&tq 0xffffff00015831c0 [zil_clean]
> 173 0 0 0 SL zfs:(&tq 0xffffff0001583300 [zil_clean]
> 172 0 0 0 SL zfs:(&tq 0xffffff0001583440 [zil_clean]
> 171 0 0 0 SL zfs:(&tq 0xffffff0001583580 [zil_clean]
> 170 0 0 0 SL zfs:(&tq 0xffffff00015836c0 [zil_clean]
> 168 0 0 0 SL zfs:(&tx 0xffffff000146c590 [txg_thread_enter]
> 167 0 0 0 SL zfs:(&zi 0xffffff000c717d58 [txg_thread_enter]
> 166 0 0 0 SL zfs:(&tx 0xffffff00014e0a40 [txg_thread_enter]
> 165 0 0 0 SL vgeom:io 0xffffff000145c410 [vdev:worker ad0s1d]
> 164 0 0 0 SL zfs:(&tq 0xffffff000158e300 [spa_zio_intr_5]
> 163 0 0 0 SL zfs:(&tq 0xffffff000158e300 [spa_zio_intr_5]
> 162 0 0 0 SL zfs:(&tq 0xffffff000158e1c0 [spa_zio_issue_5]
> 161 0 0 0 SL zfs:(&tq 0xffffff000158e1c0 [spa_zio_issue_5]
> 160 0 0 0 SL zfs:(&tq 0xffffff0001227d00 [spa_zio_intr_4]
> 159 0 0 0 SL zfs:(&tq 0xffffff0001227d00 [spa_zio_intr_4]
> 158 0 0 0 SL zfs:(&tq 0xffffff0001227bc0 [spa_zio_issue_4]
> 157 0 0 0 SL zfs:(&tq 0xffffff0001227bc0 [spa_zio_issue_4]
> 156 0 0 0 SL zfs:(&tq 0xffffff0001227a80 [spa_zio_intr_3]
> 155 0 0 0 SL zfs:(&tq 0xffffff0001227a80 [spa_zio_intr_3]
> 154 0 0 0 SL zfs:(&tq 0xffffff0001227940 [spa_zio_issue_3]
> 153 0 0 0 SL zfs:(&tq 0xffffff0001227940 [spa_zio_issue_3]
> 152 0 0 0 SL zfs:&vq- 0xffffff00015d0c88 [spa_zio_intr_2]
> 151 0 0 0 SL vmwait 0xffffffff80a9a79c [spa_zio_intr_2]
> 150 0 0 0 SL vmwait 0xffffffff80a9a79c [spa_zio_issue_2]
> 149 0 0 0 SL vmwait 0xffffffff80a9a79c [spa_zio_issue_2]
> 148 0 0 0 SL zfs:(&tq 0xffffff0001227580 [spa_zio_intr_1]
> 147 0 0 0 SL zfs:(&tq 0xffffff0001227580 [spa_zio_intr_1]
> 146 0 0 0 SL zfs:(&tq 0xffffff0001227440 [spa_zio_issue_1]
> 145 0 0 0 SL zfs:(&tq 0xffffff0001227440 [spa_zio_issue_1]
> 144 0 0 0 SL zfs:(&tq 0xffffff00012271c0 [spa_zio_intr_0]
> 143 0 0 0 SL zfs:(&tq 0xffffff00012271c0 [spa_zio_intr_0]
> 142 0 0 0 SL zfs:(&tq 0xffffff0001227300 [spa_zio_issue_0]
> 141 0 0 0 SL zfs:(&tq 0xffffff0001227300 [spa_zio_issue_0]
> 87 0 0 0 SL vmwait 0xffffffff80a9a79c [g_eli[1] mirror/swa]
> 86 0 0 0 SL vmwait 0xffffffff80a9a79c [g_eli[0] mirror/swa]
> 53 0 0 0 SL sdflush 0xffffffff80a99d88 [softdepflush]
> 52 0 0 0 SL vlruwt 0xffffff0001448000 [vnlru]
> 51 0 0 0 SL zfs:&vq- 0xffffff00015d0c88 [syncer]
> 50 0 0 0 SL psleep 0xffffffff80a8a55c [bufdaemon]
> 49 0 0 0 SL pgzero 0xffffffff80a9b804 [pagezero]
> 48 0 0 0 SL psleep 0xffffffff80a9ab48 [vmdaemon]
> 47 0 0 0 SL wswbuf0 0xffffffff80a9a004 [pagedaemon]
> 46 0 0 0 SL m:w1 0xffffff0001401200 [g_mirror swap]
> 45 0 0 0 SL m:w1 0xffffff00013c3800 [g_mirror root]
> 44 0 0 0 SL zfs:(&ar 0xffffffff80c746b0 [arc_reclaim_thread]
> 43 0 0 0 SL waiting_ 0xffffffff80a8dc88 [sctp_iterator]
> 42 0 0 0 WL [swi0: sio]
> 41 0 0 0 WL [irq1: atkbd0]
> 40 0 0 0 WL [irq15: ata1]
> 39 0 0 0 WL [irq14: ata0]
> 38 0 0 0 SL usbevt 0xffffff000133a420 [usb5]
> 37 0 0 0 SL usbevt 0xffffffff81065420 [usb4]
> 36 0 0 0 SL usbevt 0xffffffff81063420 [usb3]
> 35 0 0 0 SL usbevt 0xffffff000130c420 [usb2]
> 34 0 0 0 WL [irq22: ehci0]
> 33 0 0 0 SL usbevt 0xffffffff81061420 [usb1]
> 32 0 0 0 WL [irq21: pcm0 uhci1+]
> 31 0 0 0 SL usbtsk 0xffffffff80a71028 [usbtask-dr]
> 30 0 0 0 SL usbtsk 0xffffffff80a71000 [usbtask-hc]
> 29 0 0 0 SL usbevt 0xffffffff8105f420 [usb0]
> 28 0 0 0 WL [irq20: uhci0 uhci+]
> 27 0 0 0 SL - 0xffffff00012ef880 [em0 taskq]
> 26 0 0 0 WL [irq9: acpi0]
> 25 0 0 0 SL - 0xffffff0001294580 [kqueue taskq]
> 24 0 0 0 WL [swi6: task queue]
> 23 0 0 0 RL CPU 1 [swi6: Giant taskq]
> 22 0 0 0 SL - 0xffffff000122c500 [thread taskq]
> 21 0 0 0 WL [swi5: +]
> 20 0 0 0 SL - 0xffffff000122ca80 [acpi_task_2]
> 19 0 0 0 SL - 0xffffff000122ca80 [acpi_task_1]
> 18 0 0 0 SL - 0xffffff000122ca80 [acpi_task_0]
> 17 0 0 0 WL [swi2: cambio]
> 9 0 0 0 SL ccb_scan 0xffffffff80a3fda0 [xpt_thrd]
> 16 0 0 0 SL - 0xffffffff80a74ea8 [yarrow]
> 8 0 0 0 SL crypto_r 0xffffffff80d293b0 [crypto returns]
> 7 0 0 0 SL crypto_w 0xffffffff80d29350 [crypto]
> 6 0 0 0 SL zfs:(&tq 0xffffff0001227080 [system_taskq]
> 5 0 0 0 SL zfs:(&tq 0xffffff0001227080 [system_taskq]
> 4 0 0 0 SL - 0xffffffff80a71838 [g_down]
> 3 0 0 0 SL - 0xffffffff80a71830 [g_up]
> 2 0 0 0 SL - 0xffffffff80a71820 [g_event]
> 15 0 0 0 WL [swi1: net]
> 14 0 0 0 WL [swi3: vm]
> 13 0 0 0 LL *Giant 0xffffff00015a2be0 [swi4: clock sio]
> 12 0 0 0 RL CPU 0 [idle: cpu0]
> 11 0 0 0 RL [idle: cpu1]
> 1 0 1 0 SLs wait 0xffffff000112f8d0 [init]
> 10 0 0 0 SL audit_wo 0xffffffff80a99260 [audit]
> 0 0 0 0 SLs vmwait 0xffffffff80a9a79c [swapper]
> db> trace 77872
> Tracing pid 77872 tid 100157 td 0xffffff000eb9c350
> sched_switch() at sched_switch+0x1fe
> mi_switch() at mi_switch+0x189
> sleepq_wait() at sleepq_wait+0x3b
> _cv_wait() at _cv_wait+0xfe
> zio_wait() at zio_wait+0x5f
> dmu_buf_hold_array_by_dnode() at dmu_buf_hold_array_by_dnode+0x1f6
> dmu_buf_hold_array() at dmu_buf_hold_array+0x62
> dmu_read_uio() at dmu_read_uio+0x3f
> zfs_freebsd_read() at zfs_freebsd_read+0x535
> vn_read() at vn_read+0x1ca
> dofileread() at dofileread+0xa1
> kern_readv() at kern_readv+0x4c
> read() at read+0x54
> syscall() at syscall+0x254
> Xfast_syscall() at Xfast_syscall+0xab
> --- syscall (3, FreeBSD ELF64, read), rip = 0x8bd74c, rsp = 0x7fffffffe2c8, rbp = 0 ---
> db> trace 51
> Tracing pid 51 tid 100050 td 0xffffff00014246a0
> sched_switch() at sched_switch+0x1fe
> mi_switch() at mi_switch+0x189
> sleepq_wait() at sleepq_wait+0x3b
> _sx_xlock_hard() at _sx_xlock_hard+0x1ee
> _sx_xlock() at _sx_xlock+0x4e
> vdev_queue_io() at vdev_queue_io+0x74
> vdev_geom_io_start() at vdev_geom_io_start+0x4a
> vdev_mirror_io_start() at vdev_mirror_io_start+0x1b0
> zil_lwb_write_start() at zil_lwb_write_start+0x2f1
> zil_commit_writer() at zil_commit_writer+0x1c4
> zil_commit() at zil_commit+0xb8
> zfs_sync() at zfs_sync+0x9a
> sync_fsync() at sync_fsync+0x1ac
> sched_sync() at sched_sync+0x63f
> fork_exit() at fork_exit+0x11f
> fork_trampoline() at fork_trampoline+0xe
> --- trap 0, rip = 0, rsp = 0xffffffffb502fd30, rbp = 0 ---
>
> Any ideas about this?
>
I'm quoting the entire email to preserve context, although that seems rather
excessive.
I can only say that I've observed hangs like this at the same location
in the kernel as in the first process trace above, pid 77872
(_cv_wait -> sleepq_wait).
Since I don't have important file systems like /usr, /var, etc. under ZFS,
I've always been able to recover by
a) raising the priority of the blocked process with renice, and
b) then killing the process.
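For what it's worth, those two steps look roughly like this in the shell (just
a sketch; in the real case the PID would come from the ps listing, e.g. 77872
above, and a negative nice value would need root):

```shell
# Recovery sketch: a background sleep stands in for the blocked
# process so the commands can actually be run as-is.
sleep 30 &
pid=$!
renice 5 -p "$pid"   # raise priority; a negative value (e.g. -20) needs root
kill "$pid"          # then terminate it; kill -9 if SIGTERM is ignored
```

Whether SIGKILL actually gets delivered depends on where the process is
sleeping, of course; an uninterruptible sleep won't notice it.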
Strangely enough, I've always been able to access the file system on which
the process was blocked (e.g. with ls) from a different terminal. So the hang
seems to be limited to that one process rather than being a symptom of ZFS
itself wedging. Or maybe my ls was just accessing parts of the filesystem not
covered by the CV? FIIK.
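That kind of probe can be made a bit safer by not letting the command block
the second shell either (a sketch; the path and the 5-second budget are
arbitrary choices):

```shell
# Probe the suspect filesystem without risking another stuck shell:
# run ls in the background and check whether it ever finishes.
ls /usr >/dev/null 2>&1 &
probe=$!
sleep 5
if kill -0 "$probe" 2>/dev/null; then
    # still running after 5s: the probe is blocked too;
    # its wait channel can be inspected with ps -l -p $probe
    echo "probe blocked"
else
    echo "probe completed"
fi
```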
Otherwise I have no idea what's going on.
I mentioned this some time ago (months?) on -current but never got any
response. I didn't have any nice trace, though.
---
Gary Jennejohn