From nobody Mon Jul 08 12:40:44 2024
Date: Mon, 8 Jul 2024 12:40:44 GMT
Message-Id: <202407081240.468CeiqP020245@gitrepo.freebsd.org>
To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-main@FreeBSD.org
From: Mateusz Guzik
Subject: git: 0a9aa6fdf584 - main - vfs: make skipping LRU requeue optional
List-Id: Commit messages for all branches of the src repository
List-Archive: https://lists.freebsd.org/archives/dev-commits-src-all
Sender: owner-dev-commits-src-all@FreeBSD.org
MIME-Version: 1.0
Content-Type: text/plain;
 charset=utf-8
Content-Transfer-Encoding: 8bit
X-Git-Committer: mjg
X-Git-Repository: src
X-Git-Refname: refs/heads/main
X-Git-Reftype: branch
X-Git-Commit: 0a9aa6fdf58468945240e86bf16c268acc8c1776
Auto-Submitted: auto-generated

The branch main has been updated by mjg:

URL: https://cgit.FreeBSD.org/src/commit/?id=0a9aa6fdf58468945240e86bf16c268acc8c1776

commit 0a9aa6fdf58468945240e86bf16c268acc8c1776
Author:     Mateusz Guzik
AuthorDate: 2024-07-08 12:24:41 +0000
Commit:     Mateusz Guzik
CommitDate: 2024-07-08 12:40:20 +0000

    vfs: make skipping LRU requeue optional

    As explained in the comment in the code it is a bottleneck in certain
    workloads. On the other hand it does not need to be skipped in most
    cases, while transiently running into the lock being contended happens a
    lot.
---
 sys/kern/vfs_subr.c | 54 +++++++++++++++++++++++++++++++++--------------------
 1 file changed, 34 insertions(+), 20 deletions(-)

diff --git a/sys/kern/vfs_subr.c b/sys/kern/vfs_subr.c
index 52712b99abac..8012fab29081 100644
--- a/sys/kern/vfs_subr.c
+++ b/sys/kern/vfs_subr.c
@@ -222,6 +222,10 @@ static counter_u64_t vnode_skipped_requeues;
 SYSCTL_COUNTER_U64(_vfs_vnode_stats, OID_AUTO, skipped_requeues, CTLFLAG_RD, &vnode_skipped_requeues,
     "Number of times LRU requeue was skipped due to lock contention");
 
+static __read_mostly bool vnode_can_skip_requeue;
+SYSCTL_BOOL(_vfs_vnode_param, OID_AUTO, can_skip_requeue, CTLFLAG_RW,
+    &vnode_can_skip_requeue, 0, "Is LRU requeue skippable");
+
 static u_long deferred_inact;
 SYSCTL_ULONG(_vfs, OID_AUTO, deferred_inact, CTLFLAG_RD,
     &deferred_inact, 0, "Number of times inactive processing was deferred");
@@ -3835,31 +3839,41 @@ vdbatch_process(struct vdbatch *vd)
 	 * lock contention, where vnode_list_mtx becomes the primary bottleneck
 	 * if multiple CPUs get here (one real-world example is highly parallel
 	 * do-nothing make , which will stat *tons* of vnodes). Since it is
-	 * quasi-LRU (read: not that great even if fully honoured) just dodge
-	 * the problem. Parties which don't like it are welcome to implement
-	 * something better.
+	 * quasi-LRU (read: not that great even if fully honoured) provide an
+	 * option to just dodge the problem. Parties which don't like it are
+	 * welcome to implement something better.
 	 */
-	critical_enter();
-	if (mtx_trylock(&vnode_list_mtx)) {
-		for (i = 0; i < VDBATCH_SIZE; i++) {
-			vp = vd->tab[i];
-			vd->tab[i] = NULL;
-			TAILQ_REMOVE(&vnode_list, vp, v_vnodelist);
-			TAILQ_INSERT_TAIL(&vnode_list, vp, v_vnodelist);
-			MPASS(vp->v_dbatchcpu != NOCPU);
-			vp->v_dbatchcpu = NOCPU;
+	if (vnode_can_skip_requeue) {
+		if (!mtx_trylock(&vnode_list_mtx)) {
+			counter_u64_add(vnode_skipped_requeues, 1);
+			critical_enter();
+			for (i = 0; i < VDBATCH_SIZE; i++) {
+				vp = vd->tab[i];
+				vd->tab[i] = NULL;
+				MPASS(vp->v_dbatchcpu != NOCPU);
+				vp->v_dbatchcpu = NOCPU;
+			}
+			vd->index = 0;
+			critical_exit();
+			return;
+
 		}
-		mtx_unlock(&vnode_list_mtx);
+		/* fallthrough to locked processing */
 	} else {
-		counter_u64_add(vnode_skipped_requeues, 1);
+		mtx_lock(&vnode_list_mtx);
+	}
 
-		for (i = 0; i < VDBATCH_SIZE; i++) {
-			vp = vd->tab[i];
-			vd->tab[i] = NULL;
-			MPASS(vp->v_dbatchcpu != NOCPU);
-			vp->v_dbatchcpu = NOCPU;
-		}
+	mtx_assert(&vnode_list_mtx, MA_OWNED);
+	critical_enter();
+	for (i = 0; i < VDBATCH_SIZE; i++) {
+		vp = vd->tab[i];
+		vd->tab[i] = NULL;
+		TAILQ_REMOVE(&vnode_list, vp, v_vnodelist);
+		TAILQ_INSERT_TAIL(&vnode_list, vp, v_vnodelist);
+		MPASS(vp->v_dbatchcpu != NOCPU);
+		vp->v_dbatchcpu = NOCPU;
 	}
+	mtx_unlock(&vnode_list_mtx);
 	vd->index = 0;
 	critical_exit();
 }
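
For readers who want to see the control flow of the change outside the kernel, here is a minimal userland sketch of the same "optionally skip on contention" pattern. It is not the kernel code: POSIX mutexes stand in for vnode_list_mtx, plain counters stand in for counter_u64_t, and the names batch_process, list_lock, BATCH_SIZE, can_skip_requeue, skipped_requeues and requeued_items are hypothetical, chosen only for illustration. The counters are not atomic, so the sketch is meant for single-threaded experimentation.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define BATCH_SIZE 8	/* hypothetical; stands in for VDBATCH_SIZE */

/* Hypothetical stand-ins for vnode_list_mtx and the sysctl-backed knob. */
static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
static bool can_skip_requeue;		/* false by default, as in the patch */
static unsigned long skipped_requeues;	/* batches dropped due to contention */
static unsigned long requeued_items;	/* items "requeued" under the lock */

struct batch {
	int tab[BATCH_SIZE];
	int index;
};

/*
 * Process one batch. Returns true if the batch was requeued under the
 * lock, false if the requeue was skipped because the lock was contended
 * and skipping is enabled.
 */
static bool
batch_process(struct batch *b)
{
	int i;

	assert(b != NULL);

	if (can_skip_requeue) {
		if (pthread_mutex_trylock(&list_lock) != 0) {
			/* Contended: drop the batch without requeueing. */
			skipped_requeues++;
			for (i = 0; i < BATCH_SIZE; i++)
				b->tab[i] = 0;
			b->index = 0;
			return (false);
		}
		/* fall through to locked processing */
	} else {
		/* Skipping disabled: wait for the lock like any other user. */
		pthread_mutex_lock(&list_lock);
	}

	for (i = 0; i < BATCH_SIZE; i++) {
		b->tab[i] = 0;
		requeued_items++;	/* models TAILQ_REMOVE + TAILQ_INSERT_TAIL */
	}
	pthread_mutex_unlock(&list_lock);
	b->index = 0;
	return (true);
}
```

With can_skip_requeue left at false, batch_process always blocks for the lock and requeues, which matches the new default in the patch. Setting it to true restores the previous "give up easily" behaviour. Judging by the SYSCTL_BOOL declaration in the diff, the kernel knob is presumably exposed as vfs.vnode.param.can_skip_requeue, though that name is inferred here rather than stated in the commit message.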