From nobody Wed Jul 10 22:07:06 2024
Date: Wed, 10 Jul 2024 22:07:06 GMT
Message-Id: <202407102207.46AM76XD080236@gitrepo.freebsd.org>
To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-branches@FreeBSD.org
From: Mateusz Guzik
Subject: git: 1f0f120183db - stable/13 - vfs: make skipping LRU requeue optional
List-Id: Commit messages for all branches of the src repository
List-Archive: https://lists.freebsd.org/archives/dev-commits-src-all
Sender: owner-dev-commits-src-all@FreeBSD.org
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
X-Git-Committer: mjg
X-Git-Repository: src
X-Git-Refname: refs/heads/stable/13
X-Git-Reftype: branch
X-Git-Commit: 1f0f120183db12e680107a0553a8de2d854aa757
Auto-Submitted: auto-generated

The branch stable/13 has been updated by mjg:

URL: https://cgit.FreeBSD.org/src/commit/?id=1f0f120183db12e680107a0553a8de2d854aa757

commit 1f0f120183db12e680107a0553a8de2d854aa757
Author:     Mateusz Guzik
AuthorDate: 2024-07-08 12:24:41 +0000
Commit:     Mateusz Guzik
CommitDate: 2024-07-10 22:06:15 +0000

    vfs: make skipping LRU requeue optional

    As explained in the comment in the code it is a bottleneck in certain
    workloads. On the other hand it does not need to be skipped in most
    cases, while transiently running into the lock being contended happens
    a lot.

    (cherry picked from commit 0a9aa6fdf58468945240e86bf16c268acc8c1776)
---
 sys/kern/vfs_subr.c | 54 +++++++++++++++++++++++++++++++++--------------------
 1 file changed, 34 insertions(+), 20 deletions(-)

diff --git a/sys/kern/vfs_subr.c b/sys/kern/vfs_subr.c
index 33232987705e..a1b4779b6d3f 100644
--- a/sys/kern/vfs_subr.c
+++ b/sys/kern/vfs_subr.c
@@ -223,6 +223,10 @@ static counter_u64_t vnode_skipped_requeues;
 SYSCTL_COUNTER_U64(_vfs_vnode_stats, OID_AUTO, skipped_requeues, CTLFLAG_RD, &vnode_skipped_requeues,
     "Number of times LRU requeue was skipped due to lock contention");
 
+static __read_mostly bool vnode_can_skip_requeue;
+SYSCTL_BOOL(_vfs_vnode_param, OID_AUTO, can_skip_requeue, CTLFLAG_RW,
+    &vnode_can_skip_requeue, 0, "Is LRU requeue skippable");
+
 static u_long deferred_inact;
 SYSCTL_ULONG(_vfs, OID_AUTO, deferred_inact, CTLFLAG_RD,
     &deferred_inact, 0, "Number of times inactive processing was deferred");
@@ -3795,31 +3799,41 @@ vdbatch_process(struct vdbatch *vd)
 	 * lock contention, where vnode_list_mtx becomes the primary bottleneck
 	 * if multiple CPUs get here (one real-world example is highly parallel
 	 * do-nothing make , which will stat *tons* of vnodes). Since it is
-	 * quasi-LRU (read: not that great even if fully honoured) just dodge
-	 * the problem. Parties which don't like it are welcome to implement
-	 * something better.
+	 * quasi-LRU (read: not that great even if fully honoured) provide an
+	 * option to just dodge the problem. Parties which don't like it are
+	 * welcome to implement something better.
 	 */
-	critical_enter();
-	if (mtx_trylock(&vnode_list_mtx)) {
-		for (i = 0; i < VDBATCH_SIZE; i++) {
-			vp = vd->tab[i];
-			vd->tab[i] = NULL;
-			TAILQ_REMOVE(&vnode_list, vp, v_vnodelist);
-			TAILQ_INSERT_TAIL(&vnode_list, vp, v_vnodelist);
-			MPASS(vp->v_dbatchcpu != NOCPU);
-			vp->v_dbatchcpu = NOCPU;
+	if (vnode_can_skip_requeue) {
+		if (!mtx_trylock(&vnode_list_mtx)) {
+			counter_u64_add(vnode_skipped_requeues, 1);
+			critical_enter();
+			for (i = 0; i < VDBATCH_SIZE; i++) {
+				vp = vd->tab[i];
+				vd->tab[i] = NULL;
+				MPASS(vp->v_dbatchcpu != NOCPU);
+				vp->v_dbatchcpu = NOCPU;
+			}
+			vd->index = 0;
+			critical_exit();
+			return;
+
 		}
-		mtx_unlock(&vnode_list_mtx);
+		/* fallthrough to locked processing */
 	} else {
-		counter_u64_add(vnode_skipped_requeues, 1);
+		mtx_lock(&vnode_list_mtx);
+	}
 
-		for (i = 0; i < VDBATCH_SIZE; i++) {
-			vp = vd->tab[i];
-			vd->tab[i] = NULL;
-			MPASS(vp->v_dbatchcpu != NOCPU);
-			vp->v_dbatchcpu = NOCPU;
-		}
+	mtx_assert(&vnode_list_mtx, MA_OWNED);
+	critical_enter();
+	for (i = 0; i < VDBATCH_SIZE; i++) {
+		vp = vd->tab[i];
+		vd->tab[i] = NULL;
+		TAILQ_REMOVE(&vnode_list, vp, v_vnodelist);
+		TAILQ_INSERT_TAIL(&vnode_list, vp, v_vnodelist);
+		MPASS(vp->v_dbatchcpu != NOCPU);
+		vp->v_dbatchcpu = NOCPU;
 	}
+	mtx_unlock(&vnode_list_mtx);
 	vd->index = 0;
 	critical_exit();
 }
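
For readers who want to poke at the control flow outside the kernel, the
following is a rough userland sketch of the same pattern and not the kernel
code itself. It assumes a C11 atomic_flag spinlock as a stand-in for
vnode_list_mtx and a plain int array as a stand-in for the vnode batch. Every
name in it (list_lock, batch_process, BATCH_SIZE, requeued_batches and so on)
is hypothetical and chosen only for illustration.

```c
/*
 * Userland sketch only, not kernel code. An atomic_flag spinlock stands in
 * for vnode_list_mtx; all names below are hypothetical.
 */
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define BATCH_SIZE 8			/* stand-in for VDBATCH_SIZE */

static atomic_flag list_lock = ATOMIC_FLAG_INIT;
static bool can_skip_requeue;		/* mirrors the new knob; off by default */
static atomic_ulong skipped_requeues;	/* mirrors the existing counter */
static unsigned long requeued_batches;	/* only modified with list_lock held */

static bool
list_trylock(void)
{
	/* test_and_set returns the previous value: false means we got it. */
	return (!atomic_flag_test_and_set_explicit(&list_lock,
	    memory_order_acquire));
}

static void
list_lock_blocking(void)
{
	while (!list_trylock())
		;	/* spin; a real mutex would block instead */
}

static void
list_unlock(void)
{
	atomic_flag_clear_explicit(&list_lock, memory_order_release);
}

/* Empty the batch; return true if it was requeued, false if skipped. */
static bool
batch_process(int *batch)
{
	size_t i;

	if (can_skip_requeue) {
		if (!list_trylock()) {
			/* Contended: count the skip and drop the batch. */
			atomic_fetch_add(&skipped_requeues, 1);
			for (i = 0; i < BATCH_SIZE; i++)
				batch[i] = 0;
			return (false);
		}
		/* Uncontended: fall through to locked processing. */
	} else {
		list_lock_blocking();
	}

	for (i = 0; i < BATCH_SIZE; i++)
		batch[i] = 0;	/* real code also moves the vnode to the list tail */
	requeued_batches++;
	list_unlock();
	return (true);
}
```

With can_skip_requeue left false the function always waits for the lock, which
matches the new default in the diff (the SYSCTL_BOOL is registered under the
_vfs_vnode_param node and the variable is zero-initialized). Setting it to true
restores the previous give-up-on-contention behaviour.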