Hour-long sleeps in the ZFS write throttle: fix for 13.1?
Date: Tue, 05 Apr 2022 21:05:25 UTC
All year long I've occasionally seen my ZFS processes get blocked in dmu_tx_wait. They stay blocked for more than an hour but eventually recover. I finally found the cause: an integer overflow bug in ustosbt.

The fix is simple enough, but my question is: should we try to commit this in time for 13.1-RELEASE? It's a very disruptive bug, but also very hard to trigger. It takes a pretty highly congested ZFS system to trigger it. In theory the bug could affect other subsystems, too.

https://github.com/openzfs/zfs/issues/13289
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=263073
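
For anyone unfamiliar with this class of bug, here is a minimal user-space sketch of how a microseconds-to-fixed-point conversion can overflow. It is not the kernel source and not the actual patch; see the links above for those. It assumes only that the result is a signed 64-bit value in 32.32 fixed-point format, as sbintime_t is. The names sbt_t, SBT_ONE_SEC, US_SCALE, us_to_sbt_naive and us_to_sbt_split are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for sbintime_t: signed 64-bit, 32.32 fixed point. */
typedef int64_t sbt_t;
#define SBT_ONE_SEC ((sbt_t)1 << 32)

/* Approximately 2^64 / 10^6: sbt units per microsecond, scaled up by 2^32. */
#define US_SCALE ((((uint64_t)1) << 63) / 500000)

/*
 * Naive conversion (illustrative only): one 64-bit multiply, then a shift.
 * The product needs more than 64 bits once us reaches about one second,
 * so the high bits are silently lost and the result wraps.
 */
static sbt_t
us_to_sbt_naive(int64_t us)
{
	return ((sbt_t)(((uint64_t)us * US_SCALE) >> 32));
}

/*
 * One possible overflow-safe variant (an assumption, not the committed fix):
 * convert whole seconds exactly, then scale only the sub-second remainder,
 * which is always below 10^6 and therefore cannot overflow the multiply.
 */
static sbt_t
us_to_sbt_split(int64_t us)
{
	assert(us >= 0);	/* the sketch only handles non-negative durations */

	sbt_t whole = (us / 1000000) * SBT_ONE_SEC;
	uint64_t frac = (uint64_t)(us % 1000000);

	return (whole + (sbt_t)((frac * US_SCALE) >> 32));
}
```

Under these assumptions, us_to_sbt_naive(2000000) comes out just below SBT_ONE_SEC rather than at two seconds, while us_to_sbt_split(2000000) returns exactly 2 * SBT_ONE_SEC. For inputs under one second the two functions agree.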