[Bug 282169] zfs rename deadlock with mountd, df & fstat (and possibly others)

From: <bugzilla-noreply_at_freebsd.org>
Date: Fri, 18 Oct 2024 06:25:09 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=282169

            Bug ID: 282169
           Summary: zfs rename deadlock with mountd, df & fstat (and
                    possibly others)
           Product: Base System
           Version: 13.4-RELEASE
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Some People
          Priority: ---
         Component: kern
          Assignee: bugs@FreeBSD.org
          Reporter: pen@lysator.liu.se

Created attachment 254323
  --> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=254323&action=edit
Output from "procstat -kk -a" at the time of the deadlock

I ran into a deadlock while running "zfs rename" on a large number of
filesystems on one of our production servers a couple of days ago.

I was renaming them from DATA/{staff,students}/<user> to
DATA/archive/{staff,students}/<user>.
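
For illustration, the name mapping can be sketched as a dry run in sh. This is
not the actual script: the helper name archive_target and the example user
names are made up, and the loop only prints the "zfs rename" commands instead
of running them.

```shell
#!/bin/sh
# Map DATA/<group>/<user> to DATA/archive/<group>/<user>.
# archive_target is a hypothetical helper name, not taken from the real script.
archive_target() {
    # Strip the leading "DATA/" and re-prefix with "DATA/archive/".
    echo "DATA/archive/${1#DATA/}"
}

# Dry run over two made-up datasets: print the commands, do not execute them.
for fs in DATA/staff/alice DATA/students/bob; do
    echo zfs rename "$fs" "$(archive_target "$fs")"
done
```

Dropping the "echo" in the loop would perform the renames for real, one
filesystem at a time.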

The rename script handled the first 167 filesystems (out of about 20000)
without problems and then deadlocked: mountd stopped servicing new requests,
and both a "df -it zfs" and an "fuser" command never finished.

(I run another script from cron every minute that logs the state of the system
by saving the output from "df -it zfs", "fuser", "procstat -kk -a" and a bunch
of other commands. That script also stopped working at the time of the
deadlock.)


Looking at the output from "procstat -kk -a" (attached), it seems the hanging
processes were blocked on ZFS locks.

I found an old bug report from around 2016 (209158) where something similar is
discussed, but that was with the old FreeBSD-ZFS code.

I eventually had to hard-reset the server since it never recovered (at least
not in the 5 hours I waited).
