Re: Hour-long sleeps in the ZFS write throttle: fix for 13.1 ?
- In reply to: Alan Somers : "Hour-long sleeps in the ZFS write throttle: fix for 13.1 ?"
Date: Wed, 06 Apr 2022 00:24:40 UTC
On Tue, Apr 5, 2022 at 3:06 PM Alan Somers <asomers@freebsd.org> wrote:
> All year long I've occasionally seen my ZFS processes get blocked in
> dmu_tx_wait. They stay blocked for more than an hour but eventually
> recover. I finally found the cause: an integer overflow bug in
> ustosbt. The fix is simple enough, but my question is: should we try
> to commit this in time for 13.1-RELEASE? It's a very disruptive bug,
> but also very hard to trigger. It takes a pretty highly congested ZFS
> system to trigger it. In theory the bug could affect other
> subsystems, too.
>
> https://github.com/openzfs/zfs/issues/13289
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=263073

These routines were originally not meant for large times (> 1s).
However, that was poorly documented and so I fixed it. But did so
incorrectly. If you look at the bug, I've posted what I think is the
fix (it also matches Alan's description).

Warner
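
For readers who want a feel for how a microsecond-to-sbintime conversion can overflow on large inputs, here is a rough user-space sketch. It is not the code in the tree and not the patch attached to the bug. The names sbt_t, SBT_ONE_SEC, US_SCALE, us_to_sbt_naive and us_to_sbt_split are made up for illustration, and the 32.32 fixed-point layout and scale factor are assumptions chosen to resemble the general technique.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 32.32 fixed-point time value (seconds). */
typedef int64_t sbt_t;

#define SBT_ONE_SEC ((sbt_t)1 << 32)
#define US_PER_SEC  1000000

/* ceil(2^63 / 1000000); multiplying by this and shifting right by 31
 * approximates us * 2^32 / 1000000. */
#define US_SCALE    UINT64_C(9223372036855)

/*
 * Naive conversion: scale the whole input in one multiplication.
 * The 64-bit product wraps once us reaches about 2,000,000 (two
 * seconds), so larger inputs silently yield a far-too-small result.
 */
static sbt_t
us_to_sbt_naive(int64_t us)
{
	return (sbt_t)((((uint64_t)us * US_SCALE) + 0x7fffffff) >> 31);
}

/*
 * Split conversion: peel off whole seconds first, so the scaled
 * multiplication only ever sees a value below 1,000,000.
 * Note the comparison is against microseconds per second, not
 * against the fixed-point constant for one second.
 */
static sbt_t
us_to_sbt_split(int64_t us)
{
	sbt_t sb = 0;

	if (us >= US_PER_SEC) {
		sb = (us / US_PER_SEC) * SBT_ONE_SEC;
		us %= US_PER_SEC;
	}
	sb += (sbt_t)((((uint64_t)us * US_SCALE) + 0x7fffffff) >> 31);
	return sb;
}
```

In this sketch the two functions agree for sub-second inputs, while for an input of two seconds or more only the split version returns the expected multiple of SBT_ONE_SEC. For the real change, see the bug report linked in the quoted message.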