Re: git: 705a6ee2b611 - main - zfs: Fix a deadlock between page busy and the teardown lock
Date: Sat, 20 Nov 2021 16:37:45 UTC
On Sat, Nov 20, 2021 at 04:36:47PM +0000, Mark Johnston wrote: > The branch main has been updated by markj: > > URL: https://cgit.FreeBSD.org/src/commit/?id=705a6ee2b6112c3a653b2bd68f961a8b5b8071a4 > > commit 705a6ee2b6112c3a653b2bd68f961a8b5b8071a4 > Author: Mark Johnston <markj@FreeBSD.org> > AuthorDate: 2021-11-20 16:21:25 +0000 > Commit: Mark Johnston <markj@FreeBSD.org> > CommitDate: 2021-11-20 16:21:25 +0000 > > zfs: Fix a deadlock between page busy and the teardown lock > > When rolling back a dataset, ZFS has to purge file data resident in the > system page cache. To do this, it loops over all vnodes for the > mountpoint and calls vn_pages_remove() to purge pages associated with > the vnode's VM object. Each page is thus exclusively busied while the > dataset's teardown write lock is held. > > When handling a page fault on a mapped ZFS file, FreeBSD's page fault > handler busies newly allocated pages and then uses VOP_GETPAGES to fill > them. The ZFS getpages VOP acquires the teardown read lock with vnode > pages already busied. This represents a lock order reversal which can > lead to deadlock. > > To break the deadlock, observe that zfs_rezget() need only purge those > pages marked valid, and that pages busied by the page fault handler are, > by definition, invalid. Furthermore, ZFS pages always transition from > invalid to valid with the teardown lock held, and ZFS never creates > partially valid pages. Thus, zfs_rezget() can use the new > vn_pages_remove_valid() to skip over pages busied by the fault handler. I'll get this merged upstream once the new KPI is available in stable/13, but didn't see a reason to hold off committing directly.