Re: Possible bug in zfs send or pipe implementation?

From: Rick Macklem <rick.macklem_at_gmail.com>
Date: Sun, 14 Jul 2024 02:42:32 UTC
On Sat, Jul 13, 2024 at 7:02 PM Garrett Wollman <wollman@bimajority.org> wrote:
>
> I'm migrating an old file server to new hardware using syncoid.  Every
> so often, the `zfs send` process gets stuck with the following
> kstacks:
>
>  7960 108449 zfs                 -                   mi_switch sleepq_catch_signals sleepq_wait_sig _sleep pipe_write zfs_file_write_impl zfs_file_write dump_record dmu_dump_write do_dump dmu_send_impl dmu_send_obj zfs_ioc_send zfsdev_ioctl_common zfsdev_ioctl devfs_ioctl vn_ioctl devfs_ioctl_f
>  7960 126072 zfs                 send_traverse_threa mi_switch sleepq_catch_signals sleepq_wait_sig _cv_wait_sig bqueue_enqueue_impl send_cb traverse_visitbp traverse_visitbp traverse_visitbp traverse_dnode traverse_visitbp traverse_visitbp traverse_visitbp traverse_visitbp traverse_visitbp traverse_visitbp traverse_dnode traverse_visitbp
>  7960 126074 zfs                 send_merge_thread   mi_switch sleepq_catch_signals sleepq_wait_sig _cv_wait_sig bqueue_enqueue_impl send_merge_thread fork_exit fork_trampoline
>  7960 126075 zfs                 send_reader_thread  mi_switch sleepq_catch_signals sleepq_wait_sig _cv_wait_sig bqueue_enqueue_impl send_reader_thread fork_exit fork_trampoline
>
> Near as I can tell, the first thread is trying to write
> serialized data to the output pipe and is blocked.  The other
> threads are stuck because the write process isn't making progress.
# ps axHl
should show you which wchans the threads are waiting on, and that might
give you a clue w.r.t. what is happening.

If it is easy to build a kernel from sources and boot it, you could try defining
PIPE_NODIRECT in sys/kern/sys_pipe.c and see if that avoids the hangs.

rick

>
> The process reading from the pipe (which is just a progress meter) is
> sitting in select() waiting for the pipe to become ready, so either
> zfs_file_write() is doing something wrong, or the pipe implementation
> has lost a selwakeup() somewhere.  (Or, possibly but unlikely, the
> progress meter has lost the read end of the pipe from its read
> fd_set.)  Unfortunately, neither fstat nor procstat print any useful
> information about the state of the pipe, so I can only try to deduce
> what's going on from the observable behavior.
>
> -GAWollman
>
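
The select()/wakeup behaviour described in the quoted report can be checked
from userland. Below is a minimal Python sketch of that check, not a
reproduction of the hang. It assumes nothing about ZFS or syncoid, the
function name pipe_wakeup_demo is made up for illustration, and the writer
is non-blocking so the test itself cannot get stuck. It fills a pipe,
confirms that select() reports the read end ready, drains it, and confirms
that the write end becomes writable again.

```python
import os
import select


def pipe_wakeup_demo():
    """Fill a pipe, then check the readiness transitions on both ends.

    Returns (bytes_written, reader_ready, writer_blocked, writer_woken).
    """
    r, w = os.pipe()
    os.set_blocking(w, False)  # non-blocking so the demo cannot hang

    chunk = b"x" * 4096
    written = 0
    try:
        while True:
            written += os.write(w, chunk)
    except BlockingIOError:
        # Pipe buffer is full. A blocking writer would go to sleep here,
        # which is where the first kstack in the report is parked.
        pass

    # Reader side: data is buffered, so select() should report readable.
    readable, _, _ = select.select([r], [], [], 1.0)
    reader_ready = r in readable

    # Writer side: with the buffer full, the write end should not be ready.
    _, writable, _ = select.select([], [w], [], 0)
    writer_blocked = w not in writable

    # Drain exactly what was written; this never blocks because the
    # data is already in the pipe.
    drained = 0
    while drained < written:
        drained += len(os.read(r, 65536))

    # After the drain, the write end should be reported writable again.
    _, writable, _ = select.select([], [w], [], 1.0)
    writer_woken = w in writable

    os.close(r)
    os.close(w)
    return written, reader_ready, writer_blocked, writer_woken


if __name__ == "__main__":
    print(pipe_wakeup_demo())
```

A pass here only shows that the ordinary buffered path behaves as expected.
It may not exercise the same code path as a large blocking write issued
from inside the kernel, so it does not by itself rule out the pipe code.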