From nobody Thu Oct 28 16:03:38 2021
X-Original-To: fs@mlmmj.nyi.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1])
	by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 39364181EB39
	for <fs@mlmmj.nyi.freebsd.org>; Thu, 28 Oct 2021 16:03:38 +0000 (UTC)
	(envelope-from bugzilla-noreply@freebsd.org)
Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	 key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256
	 client-signature RSA-PSS (4096 bits) client-digest SHA256)
	(Client CN "mxrelay.nyi.freebsd.org", Issuer "R3" (verified OK))
	by mx1.freebsd.org (Postfix) with ESMTPS id 4Hg9Mt138jz4pP8
	for <fs@FreeBSD.org>; Thu, 28 Oct 2021 16:03:38 +0000 (UTC)
	(envelope-from bugzilla-noreply@freebsd.org)
Received: from kenobi.freebsd.org (kenobi.freebsd.org [IPv6:2610:1c1:1:606c::50:1d])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	 key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256)
	(Client did not present a certificate)
	by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id EDD8D26A04
	for <fs@FreeBSD.org>; Thu, 28 Oct 2021 16:03:37 +0000 (UTC)
	(envelope-from bugzilla-noreply@freebsd.org)
Received: from kenobi.freebsd.org ([127.0.1.5])
	by kenobi.freebsd.org (8.15.2/8.15.2) with ESMTP id 19SG3bUi022918
	for <fs@FreeBSD.org>; Thu, 28 Oct 2021 16:03:37 GMT
	(envelope-from bugzilla-noreply@freebsd.org)
Received: (from www@localhost)
	by kenobi.freebsd.org (8.15.2/8.15.2/Submit) id 19SG3b1v022917
	for fs@FreeBSD.org; Thu, 28 Oct 2021 16:03:37 GMT
	(envelope-from bugzilla-noreply@freebsd.org)
X-Authentication-Warning: kenobi.freebsd.org: www set sender to bugzilla-noreply@freebsd.org using -f
From: bugzilla-noreply@freebsd.org
To: fs@FreeBSD.org
Subject: [Bug 258208] [zfs] locks up when using rollback or destroy on both
 13.0-RELEASE & sysutils/openzfs port
Date: Thu, 28 Oct 2021 16:03:38 +0000
X-Bugzilla-Reason: AssignedTo
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: Base System
X-Bugzilla-Component: bin
X-Bugzilla-Version: 13.0-RELEASE
X-Bugzilla-Keywords: 
X-Bugzilla-Severity: Affects Only Me
X-Bugzilla-Who: markj@FreeBSD.org
X-Bugzilla-Status: Open
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: ---
X-Bugzilla-Assigned-To: fs@FreeBSD.org
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: 
Message-ID: <bug-258208-3630-E1upywKjtH@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-258208-3630@https.bugs.freebsd.org/bugzilla/>
References: <bug-258208-3630@https.bugs.freebsd.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: https://bugs.freebsd.org/bugzilla/
Auto-Submitted: auto-generated
List-Id: Filesystems <freebsd-fs.freebsd.org>
List-Archive: https://lists.freebsd.org/archives/freebsd-fs
List-Help: <mailto:freebsd-fs+help@freebsd.org>
List-Post: <mailto:freebsd-fs@freebsd.org>
List-Subscribe: <mailto:freebsd-fs+subscribe@freebsd.org>
List-Unsubscribe: <mailto:freebsd-fs+unsubscribe@freebsd.org>
Sender: owner-freebsd-fs@freebsd.org
MIME-Version: 1.0
X-ThisMailContainsUnwantedMimeParts: N

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=258208

--- Comment #20 from Mark Johnston <markj@FreeBSD.org> ---
(In reply to Andriy Gapon from comment #19)
Yeah, I was also wondering about changing the lock order.  I think that would
fix the deadlock but this is getting kind of hairy.  Maybe we could busy the
mountpoint while sleeping on the teardown lock, but I'm not sure.

Taking a step back: zfs_rezget() is triggering the deadlock by busying page
cache pages, which it does so that it can purge cached data which would
otherwise become stale when the dataset is resumed.  But it really only needs
to purge valid pages: invalid pages won't be mapped, and ZFS marks file pages
as valid while holding the teardown lock.  The deadlock happens when
zfs_rezget() is purging _invalid_ pages that getpages is supposed to fill.  So
perhaps zfs_rezget() can simply ignore invalid pages.
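
To make the idea concrete, here is a rough userland sketch in C.  It is only a
toy model, not the actual patch: struct toy_page, toy_purge() and the
skip_invalid flag are all made-up names, and "would block" stands in for
sleeping on a page's busy lock.  It just shows that a purge which skips invalid
pages never has to wait for a busy invalid page that getpages is still filling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a page cache page; NOT the kernel's struct vm_page. */
struct toy_page {
	bool valid;	/* contents have been filled in */
	bool busy;	/* busied by someone else, e.g. a pending getpages */
	bool resident;	/* still present in the (toy) object */
};

enum purge_result { PURGE_DONE, PURGE_WOULD_BLOCK };

/*
 * Remove cached pages.  A busy page that we want to remove makes us stop
 * (the real code would sleep here).  With skip_invalid set, invalid pages
 * are left alone, so a busy invalid page no longer blocks the purge.
 */
static enum purge_result
toy_purge(struct toy_page *pages, size_t n, bool skip_invalid, size_t *purged)
{
	assert(pages != NULL && purged != NULL);

	*purged = 0;
	for (size_t i = 0; i < n; i++) {
		struct toy_page *p = &pages[i];

		if (!p->resident)
			continue;
		if (skip_invalid && !p->valid)
			continue;	/* getpages will fill it later */
		if (p->busy)
			return (PURGE_WOULD_BLOCK);
		p->resident = false;
		(*purged)++;
	}
	return (PURGE_DONE);
}
```

With one busy invalid page sitting between two valid pages, the purge stops at
the busy page when skip_invalid is false, and removes both valid pages when it
is true.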

I tried implementing this and it fixes the deadlock in my stress test, which
simply runs buildkernel in a loop and simultaneously rolls back the dataset in
a loop.  This test also uncovered some UAFs, btw:

It's maybe a bit too hacky, since it means that we check the valid state of a
page without busying it, and only the VM object lock is held.  This is ok for
now at least: to mark a page valid, the page must be busied, but the object
lock _and_ busy lock are needed to mark a page invalid.  So if
vm_object_page_remove() encounters an invalid page, there is no guarantee that
it won't later become valid.  For ZFS I think it's safe to assume that vnode
pages only transition invalid->valid under the teardown lock, but that seems
like a delicate assumption...

The only other solution I can see is to add a new VOP to lock a vnode in
preparation for a getpages call.  This VOP could acquire the teardown lock, so
we get a consistent lock order vnode->teardown->busy, and then we don't need
to deal with recursion.  It's not just the page fault handler which needs it
though, at least sendfile and the image activator need it as well.

-- 
You are receiving this mail because:
You are the assignee for the bug.