From: bugzilla-noreply@freebsd.org
To: fs@FreeBSD.org
Subject: [Bug 258208] [zfs] locks up when using rollback or destroy on both 13.0-RELEASE & sysutils/openzfs port
Date: Thu, 28 Oct 2021 16:03:38 +0000

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=258208

--- Comment #20 from Mark Johnston ---
(In reply to Andriy Gapon from comment #19)
Yeah, I was also wondering about changing the lock order. I think that would fix the deadlock, but this is getting kind of hairy. Maybe we could busy the mountpoint while sleeping on the teardown lock, but I'm not sure.

Taking a step back: zfs_rezget() triggers the deadlock by busying page cache pages, which it does so that it can purge cached data that would otherwise become stale when the dataset is resumed. But it really only needs to purge valid pages: invalid pages won't be mapped, and ZFS marks file pages as valid while holding the teardown lock. The deadlock happens when zfs_rezget() is purging _invalid_ pages that getpages is supposed to fill. So perhaps zfs_rezget() can simply ignore invalid pages.

I tried implementing this and it fixes the deadlock in my stress test, which simply runs buildkernel in a loop and simultaneously rolls back the dataset in a loop.
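
To make the "ignore invalid pages" idea concrete, here is a minimal userland sketch. It is not the kernel code and not the actual patch: struct toy_page, toy_purge() and the skip_invalid flag are all hypothetical names, and "would block" is modelled as returning -1 rather than sleeping on a busy lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a page cache page; NOT the kernel's struct vm_page. */
struct toy_page {
	bool valid;	/* contents are up to date and may be mapped */
	bool busy;	/* some other thread (e.g. getpages) holds it busy */
	bool present;	/* page still belongs to the object */
};

/*
 * Hypothetical purge loop. Returns the number of pages removed, or -1 if
 * the purge would have had to wait for a busy page, which is the point
 * where the real code could deadlock against a thread that holds the page
 * busy while waiting for the teardown lock.
 */
static int
toy_purge(struct toy_page *pages, size_t n, bool skip_invalid)
{
	int removed = 0;

	for (size_t i = 0; i < n; i++) {
		struct toy_page *p = &pages[i];

		if (!p->present)
			continue;
		if (skip_invalid && !p->valid)
			continue;	/* never mapped, so nothing stale to purge */
		if (p->busy)
			return (-1);	/* would sleep on the busy lock */
		p->present = false;
		removed++;
	}
	return (removed);
}
```

With skip_invalid set, a busy page that is still invalid (one that is waiting to be filled) is left alone, so the purge never waits on it. A busy valid page would still make the toy return -1; the sketch does not try to model that case any further.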
The stress test also uncovered some UAFs, btw:

It's maybe a bit too hacky, since it means that we check the valid state of a page without busying it, with only the VM object lock held. This is ok for now at least: to mark a page valid, the page must be busied, but the object lock _and_ busy lock are needed to mark a page invalid. So if vm_object_page_remove() encounters an invalid page, there is no guarantee that it won't later become valid. For ZFS I think it's safe to assume that vnode pages only transition invalid->valid under the teardown lock, but that seems like a delicate assumption...

The only other solution I can see is to add a new VOP to lock a vnode in preparation for a getpages call. This VOP could acquire the teardown lock, so we get a consistent lock order vnode->teardown->busy, and then we don't need to deal with recursion. It's not just the page fault handler which needs it, though: at least sendfile and the image activator need it as well.
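
As a rough illustration of what a consistent vnode->teardown->busy order buys, the following userland sketch checks lock order by rank. Everything in it is hypothetical (enum toy_lock, struct toy_thread, toy_acquire(), toy_release()); it models neither the proposed VOP nor any real kernel locking primitive, only the ordering rule itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Ranks follow the order named in the text: vnode -> teardown -> busy. */
enum toy_lock { TOY_VNODE = 0, TOY_TEARDOWN = 1, TOY_BUSY = 2, TOY_NLOCKS = 3 };

/* Per-thread record of which toy locks are currently held. */
struct toy_thread {
	bool held[TOY_NLOCKS];
};

/*
 * Take lock `l` only if no lock of equal or higher rank is already held.
 * Rejecting equal rank also rules out recursion on the same lock.
 * Returns true on success, false on an order violation.
 */
static bool
toy_acquire(struct toy_thread *t, enum toy_lock l)
{
	for (int i = l; i < TOY_NLOCKS; i++)
		if (t->held[i])
			return (false);
	t->held[l] = true;
	return (true);
}

/* Drop a lock that the thread must currently hold. */
static void
toy_release(struct toy_thread *t, enum toy_lock l)
{
	assert(t->held[l]);
	t->held[l] = false;
}
```

In this model, a path that takes vnode, then teardown, then busy always succeeds, while a path that busies a page first and only afterwards asks for the teardown lock is rejected. The latter is the kind of inversion that, as I read the description above, the fault path currently runs into.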