From nobody Fri Jan 28 07:03:41 2022 X-Original-To: dev-commits-src-all@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 670E3198A3A0; Fri, 28 Jan 2022 07:03:41 +0000 (UTC) (envelope-from git@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "mxrelay.nyi.freebsd.org", Issuer "R3" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4JlT2P2Rjpz4lqf; Fri, 28 Jan 2022 07:03:41 +0000 (UTC) (envelope-from git@FreeBSD.org) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1643353421; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding; bh=tpXMJVJXNhIOP50cwFGmoDByqfN/N+iAT5kv2GIwo/M=; b=xNEKBR170G9lLnmHVixkCCmjgZmmaIQov2CEkmj6Ce9bw4jeITdMUk2B0zgjKiqHdOR/2V vqJnzSGRVtgbTOvtOc2Vu5/84u/pVlI7XGlep8NFdHXqzni5OmgiPdc1h63WIssRrHjjXw MxsqjTXyJda6Od21izQqmreFQ4JrsbWxWUfsg2VOuM74WPXFKv5cihSuKh8e8Y+sqfhLKF rLiBqSsinvhPYABG3Y7dVt64MDGEAcicYQQXvAH3IRS/NK/SZo4FCHTvBpT7+5aZ2Oo4p/ UuhmWKfb+G1lHjUCT1fzZFBZrEmqAJZBvTL+BnOE2i+FQaxffTy67C5CkbSXYg== Received: from gitrepo.freebsd.org (gitrepo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:5]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 3389E25D2C; Fri, 28 Jan 2022 07:03:41 +0000 (UTC) (envelope-from git@FreeBSD.org) Received: from gitrepo.freebsd.org ([127.0.1.44]) by gitrepo.freebsd.org (8.16.1/8.16.1) with ESMTP id 20S73fPW078797; Fri, 28 Jan 2022 07:03:41 GMT (envelope-from git@gitrepo.freebsd.org) Received: (from git@localhost) by gitrepo.freebsd.org (8.16.1/8.16.1/Submit) id 20S73f7j078796; Fri, 28 Jan 2022 07:03:41 GMT (envelope-from git) Date: Fri, 28 Jan 2022 07:03:41 GMT Message-Id: <202201280703.20S73f7j078796@gitrepo.freebsd.org> To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-main@FreeBSD.org From: Kirk McKusick Subject: git: ddf162d1d15f - main - ufs: handle LoR between snap lock and vnode lock List-Id: Commit messages for all branches of the src repository List-Archive: https://lists.freebsd.org/archives/dev-commits-src-all List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-dev-commits-src-all@freebsd.org X-BeenThere: dev-commits-src-all@freebsd.org MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-Git-Committer: mckusick X-Git-Repository: src X-Git-Refname: refs/heads/main X-Git-Reftype: branch X-Git-Commit: ddf162d1d15f63e871fa1e44334c9461772b7f7a Auto-Submitted: auto-generated ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1643353421; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding; bh=tpXMJVJXNhIOP50cwFGmoDByqfN/N+iAT5kv2GIwo/M=; b=qgRHvKMVSJbI8nBAmCemXACdbjy3L4kFnlAWqjvnVy6/aaQD+t0xaoHil2wmgJE9peNcvq g0It7HTEM+FjCiAF5gwSh2m2HizE5BAwuef1rKf8fWaYd1M2K6sjRg0WWXs6dslO/CkNdb fd0eGRckifZJGcpn7hW1LpeEdoPlDlhIIqtP/cTMelhNgO8kvgLcfK9nGML4rttFghLDKZ eiVlxdcAy+wvFg8NZZANut5QmpNXm3RJavjXaRnb8n+mRZC/0qFSA87LNlE16GnVd9wHMj R0YUIZBIefwBe8kJwCPxqRdfBAwwAS9jk733SfYaGr2jAFHP8045C8Pwc7h66w== ARC-Seal: i=1; s=dkim; d=freebsd.org; t=1643353421; a=rsa-sha256; cv=none; b=Dg1TCUq2TEzjWlKlYvjfuAEyRs7BwelDPmeKJNII3ckBx3FdB+SuHf8POQobQ617VqBukH 1w/+0dIPkuD0byIFeXbLbaxeSZ9fcw6edqwpmTBc5sfSrLY1rpAYeY2kRYKqE/jmfwuW6Q FAuqmXonnpgoGB/ZLWs2/jPNQGac1tozG+KSbUbd1oM3T5veoOfl08jNvnL/DyvmGIVCQP ZpPdLL2y3zAEF9KlpnWA1qZY2KX5ZdbWts2BSfdjT5rpLU3IO+KG5rr/aXRGpyz0w1GsYF 3qCI97ctAIaZjPoQ42qEaimmweiBy1u7y2V3b82DBL5/NC1ctdVfCdkIK3Obeg== ARC-Authentication-Results: i=1; mx1.freebsd.org; none X-ThisMailContainsUnwantedMimeParts: N The branch main has been updated by mckusick: URL: https://cgit.FreeBSD.org/src/commit/?id=ddf162d1d15f63e871fa1e44334c9461772b7f7a commit ddf162d1d15f63e871fa1e44334c9461772b7f7a Author: Kirk McKusick AuthorDate: 2022-01-28 07:00:51 +0000 Commit: Kirk McKusick CommitDate: 2022-01-28 07:03:35 +0000 ufs: handle LoR between snap lock and vnode lock When a filesystem is mounted all of its associated snapshots must be activated. It first allocates a snapshot lock (snaplk) that will be shared by all the snapshot vnodes associated with the filesystem. As part of each snapshot file activation, it must replace its own ufs vnode lock with the snaplk. In this way acquiring the snaplk gives exclusive access to all the snapshots for the filesystem. A write to a ufs vnode first acquires the ufs vnode lock for the file to be written then acquires the snaplk. Once it has the snaplk, it can check all the snapshots to see if any of them needs to make a copy of the block that is about to be written. This ffs_copyonwrite() code path establishes the ufs vnode followed by snaplk locking order. When a filesystem is unmounted it has to release all of its snapshot vnodes. Part of doing the release is to revert the snapshot vnode from using the snaplk to using its original vnode lock. While holding the snaplk, the vnode lock has to be acquired, the vnode updated to reference it, then the snaplk released. Acquiring the vnode lock while holding the snaplk violates the ufs vnode then snaplk order. Because the vnode lock is unused, using LK_EXCLUSIVE | LK_NOWAIT to acquire it will always succeed and the LK_NOWAIT prevents the reverse lock order from being recorded. This change was made in January 2021 (173779b98f) to avoid an LOR violation in ffs_snapshot_unmount(). The same LOR issue was recently found again when removing a snapshot in ffs_snapremove() which must also revert the snaplk to the original vnode lock as part of freeing it. The unwind in ffs_snapremove() deals with the case in which the snaplk is held as a recursive lock holding multiple references. Specifically an equal number of references are made on the vnode lock. This change factors out the lock reversion operations into a new function revert_snaplock() which handles both the recursive locks and avoids the LOR. The new revert_snaplock() function is then used in both ffs_snapshot_unmount() and in ffs_snapremove(). Reviewed by: kib Tested by: Peter Holm MFC after: 2 weeks Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D33946 --- sys/ufs/ffs/ffs_snapshot.c | 86 ++++++++++++++++++++++++++++------------------ 1 file changed, 53 insertions(+), 33 deletions(-) diff --git a/sys/ufs/ffs/ffs_snapshot.c b/sys/ufs/ffs/ffs_snapshot.c index c5063bf5f747..1257ee8f8f4f 100644 --- a/sys/ufs/ffs/ffs_snapshot.c +++ b/sys/ufs/ffs/ffs_snapshot.c @@ -175,6 +175,7 @@ static int mapacct_ufs2(struct vnode *, ufs2_daddr_t *, ufs2_daddr_t *, struct fs *, ufs_lbn_t, int); static int readblock(struct vnode *vp, struct buf *, ufs2_daddr_t); static void try_free_snapdata(struct vnode *devvp); +static void revert_snaplock(struct vnode *, struct vnode *, struct snapdata *); static struct snapdata *ffs_snapdata_acquire(struct vnode *devvp); static int ffs_bp_snapblk(struct vnode *, struct buf *); @@ -1651,7 +1652,7 @@ ffs_snapremove(vp) struct buf *ibp; struct fs *fs; ufs2_daddr_t numblks, blkno, dblk; - int error, i, last, loc; + int error, last, loc; struct snapdata *sn; ip = VTOI(vp); @@ -1669,20 +1670,10 @@ ffs_snapremove(vp) sn = devvp->v_rdev->si_snapdata; TAILQ_REMOVE(&sn->sn_head, ip, i_nextsnap); ip->i_nextsnap.tqe_prev = 0; - VI_UNLOCK(devvp); - lockmgr(&vp->v_lock, LK_EXCLUSIVE, NULL); - for (i = 0; i < sn->sn_lock.lk_recurse; i++) - lockmgr(&vp->v_lock, LK_EXCLUSIVE, NULL); - KASSERT(vp->v_vnlock == &sn->sn_lock, - ("ffs_snapremove: lost lock mutation")); - vp->v_vnlock = &vp->v_lock; - VI_LOCK(devvp); - while (sn->sn_lock.lk_recurse > 0) - lockmgr(&sn->sn_lock, LK_RELEASE, NULL); - lockmgr(&sn->sn_lock, LK_RELEASE, NULL); + revert_snaplock(vp, devvp, sn); try_free_snapdata(devvp); - } else - VI_UNLOCK(devvp); + } + VI_UNLOCK(devvp); /* * Clear all BLK_NOCOPY fields. Pass any block claims to other * snapshots that want them (see ffs_snapblkfree below). @@ -2152,27 +2143,18 @@ ffs_snapshot_unmount(mp) xp->i_nextsnap.tqe_prev = 0; lockmgr(&sn->sn_lock, LK_INTERLOCK | LK_EXCLUSIVE, VI_MTX(devvp)); - /* - * Avoid LOR with above snapshot lock. The LK_NOWAIT should - * never fail as the lock is currently unused. Rather than - * panic, we recover by doing the blocking lock. - */ - if (lockmgr(&vp->v_lock, LK_EXCLUSIVE | LK_NOWAIT, NULL) != 0) { - printf("ffs_snapshot_unmount: Unexpected LK_NOWAIT " - "failure\n"); - lockmgr(&vp->v_lock, LK_EXCLUSIVE, NULL); - } - KASSERT(vp->v_vnlock == &sn->sn_lock, - ("ffs_snapshot_unmount: lost lock mutation")); - vp->v_vnlock = &vp->v_lock; + VI_LOCK(devvp); + revert_snaplock(vp, devvp, sn); lockmgr(&vp->v_lock, LK_RELEASE, NULL); - lockmgr(&sn->sn_lock, LK_RELEASE, NULL); - if (xp->i_effnlink > 0) + if (xp->i_effnlink > 0) { + VI_UNLOCK(devvp); vrele(vp); - VI_LOCK(devvp); + VI_LOCK(devvp); + } sn = devvp->v_rdev->si_snapdata; } try_free_snapdata(devvp); + VI_UNLOCK(devvp); } /* @@ -2676,10 +2658,8 @@ try_free_snapdata(struct vnode *devvp) sn = devvp->v_rdev->si_snapdata; if (sn == NULL || TAILQ_FIRST(&sn->sn_head) != NULL || - (devvp->v_vflag & VV_COPYONWRITE) == 0) { - VI_UNLOCK(devvp); + (devvp->v_vflag & VV_COPYONWRITE) == 0) return; - } devvp->v_rdev->si_snapdata = NULL; devvp->v_vflag &= ~VV_COPYONWRITE; @@ -2691,6 +2671,46 @@ try_free_snapdata(struct vnode *devvp) if (snapblklist != NULL) free(snapblklist, M_UFSMNT); ffs_snapdata_free(sn); + VI_LOCK(devvp); +} + +/* + * Revert a vnode lock from using the snapshot lock back to its own lock. + * + * Aquire a lock on the vnode's own lock and release the lock on the + * snapshot lock. If there are any recursions on the snapshot lock + * get the same number of recursions on the vnode's own lock. + */ +static void +revert_snaplock(vp, devvp, sn) + struct vnode *vp; + struct vnode *devvp; + struct snapdata *sn; +{ + int i; + + ASSERT_VI_LOCKED(devvp, "revert_snaplock"); + /* + * Avoid LOR with snapshot lock. The LK_NOWAIT should + * never fail as the lock is currently unused. Rather than + * panic, we recover by doing the blocking lock. + */ + for (i = 0; i <= sn->sn_lock.lk_recurse; i++) { + if (lockmgr(&vp->v_lock, LK_EXCLUSIVE | LK_NOWAIT | + LK_INTERLOCK, VI_MTX(devvp)) != 0) { + printf("revert_snaplock: Unexpected LK_NOWAIT " + "failure\n"); + lockmgr(&vp->v_lock, LK_EXCLUSIVE | LK_INTERLOCK, + VI_MTX(devvp)); + } + VI_LOCK(devvp); + } + KASSERT(vp->v_vnlock == &sn->sn_lock, + ("revert_snaplock: lost lock mutation")); + vp->v_vnlock = &vp->v_lock; + while (sn->sn_lock.lk_recurse > 0) + lockmgr(&sn->sn_lock, LK_RELEASE, NULL); + lockmgr(&sn->sn_lock, LK_RELEASE, NULL); } static struct snapdata *