Date: Thu, 29 Aug 2024 23:22:51 +0300
From: Konstantin Belousov
To: Mark Johnston
Cc: src-committers@freebsd.org, dev-commits-src-all@freebsd.org,
    dev-commits-src-main@freebsd.org
Subject: Re: git: ff1ae3b3639d - main - rangelocks: restore caching of the
    single rl entry in the struct thread

On Thu, Aug 29, 2024 at 04:04:10PM -0400, Mark Johnston wrote:
> On Tue, Aug 06, 2024 at 04:06:39AM +0000, Konstantin Belousov wrote:
> > The branch main has been updated by kib:
> > 
> > URL: https://cgit.FreeBSD.org/src/commit/?id=ff1ae3b3639d39a6485cfc655bf565cd04b9caa6
> > 
> > commit ff1ae3b3639d39a6485cfc655bf565cd04b9caa6
> > Author:     Konstantin Belousov
> > AuthorDate: 2023-08-23 23:29:50 +0000
> > Commit:     Konstantin Belousov
> > CommitDate: 2024-08-06 04:05:58 +0000
> > 
> >     rangelocks: restore caching of the single rl entry in the struct thread
> > 
> >     Reviewed by:    markj, Olivier Certner
> >     Tested by:      pho
> >     Sponsored by:   The FreeBSD Foundation
> >     Differential revision:  https://reviews.freebsd.org/D41787
> > ---
> >  sys/kern/kern_rangelock.c | 33 +++++++++++++++++++++++++++------
> >  sys/kern/kern_thread.c    |  1 +
> >  sys/sys/rangelock.h       |  1 +
> >  3 files changed, 29 insertions(+), 6 deletions(-)
> > 
> > diff --git a/sys/kern/kern_rangelock.c b/sys/kern/kern_rangelock.c
> > index 355b2125dd9b..2f16569b19aa 100644
> > --- a/sys/kern/kern_rangelock.c
> > +++ b/sys/kern/kern_rangelock.c
> > @@ -82,8 +82,15 @@ static struct rl_q_entry *
> >  rlqentry_alloc(vm_ooffset_t start, vm_ooffset_t end, int flags)
> >  {
> >  	struct rl_q_entry *e;
> > -
> > -	e = uma_zalloc_smr(rl_entry_zone, M_WAITOK);
> > +	struct thread *td;
> > +
> > +	td = curthread;
> > +	if (td->td_rlqe != NULL) {
> > +		e = td->td_rlqe;
> > +		td->td_rlqe = NULL;
> > +	} else {
> > +		e = uma_zalloc_smr(rl_entry_zone, M_WAITOK);
> > +	}
> >  	e->rl_q_next = NULL;
> >  	e->rl_q_free = NULL;
> >  	e->rl_q_start = start;
> > @@ -95,6 +102,12 @@ rlqentry_alloc(vm_ooffset_t start, vm_ooffset_t end, int flags)
> >  	return (e);
> >  }
> >  
> > +void
> > +rangelock_entry_free(struct rl_q_entry *e)
> > +{
> > +	uma_zfree_smr(rl_entry_zone, e);
> > +}
> > +
> >  void
> >  rangelock_init(struct rangelock *lock)
> >  {
> > @@ -106,6 +119,7 @@ void
> >  rangelock_destroy(struct rangelock *lock)
> >  {
> >  	struct rl_q_entry *e, *ep;
> > +	struct thread *td;
> >  
> >  	MPASS(!lock->sleepers);
> >  	for (e = (struct rl_q_entry *)atomic_load_ptr(&lock->head);
> > @@ -386,8 +400,10 @@ rangelock_lock_int(struct rangelock *lock, bool trylock, vm_ooffset_t start,
> >      vm_ooffset_t end, int locktype)
> >  {
> >  	struct rl_q_entry *e, *free, *x, *xp;
> > +	struct thread *td;
> >  	enum RL_INSERT_RES res;
> >  
> > +	td = curthread;
> >  	for (res = RL_LOCK_RETRY; res == RL_LOCK_RETRY;) {
> >  		free = NULL;
> >  		e = rlqentry_alloc(start, end, locktype);
> > @@ -401,10 +417,15 @@ rangelock_lock_int(struct rangelock *lock, bool trylock, vm_ooffset_t start,
> >  			e = NULL;
> >  		}
> >  		for (x = free; x != NULL; x = xp) {
> > -			MPASS(!rl_e_is_marked(x));
> > -			xp = x->rl_q_free;
> > -			MPASS(!rl_e_is_marked(xp));
> > -			uma_zfree_smr(rl_entry_zone, x);
> > +			MPASS(!rl_e_is_marked(x));
> > +			xp = x->rl_q_free;
> > +			MPASS(!rl_e_is_marked(xp));
> > +			if (td->td_rlqe == NULL) {
> > +				smr_synchronize(rl_smr);
> 
> I think I had the impression that this was a rare case, but empirically
> it is not. As far as I can see, this smr_synchronize() call happens
> every time an entry is freed, which could be very frequent.
> smr_synchronize() bumps the global sequence counter and blocks until all
> CPUs have had a chance to observe the new value, so is quite expensive.
> 
> I didn't try benchmarking yet, but pwrite3 from will-it-scale should be
> a good candidate.
> 
> Why do we maintain this per-thread cache at all? IMO it would make more
> sense to unconditionally free the entry here. Or perhaps I'm missing
> something here.

Normally threads perform i/o syscalls, and each such syscall needs just
one rangelock operation.  In the absence of races with conflicting
locks, the rangelock acquisition does not need another rl_q_entry item.
In the previous rangelock implementation this caching was a significant
improvement, especially on microbenchmarks where the rangelock overhead
is measurable, like repeated writes or reads of the same single byte.
Remembering that, I decided to keep the caching.
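
For illustration only, a minimal userspace sketch of the single-entry
per-thread cache pattern discussed above (the struct and function names
are invented for the example, and malloc()/free() merely stand in for
uma_zalloc_smr()/uma_zfree_smr(); this is not the kernel code):

#include <stdlib.h>

/* Invented types; in the kernel the spare entry lives in struct thread. */
struct entry {
	struct entry *next;
};

struct thread_cache {
	struct entry *td_spare;		/* at most one cached entry */
};

static struct entry *
entry_alloc(struct thread_cache *tc)
{
	struct entry *e;

	if (tc->td_spare != NULL) {
		/* Common case for an i/o syscall: reuse the cached entry. */
		e = tc->td_spare;
		tc->td_spare = NULL;
	} else {
		e = malloc(sizeof(*e));	/* zone allocation in the kernel */
	}
	return (e);
}

static void
entry_free(struct thread_cache *tc, struct entry *e)
{
	if (tc->td_spare == NULL) {
		/*
		 * Refill the one-entry cache.  In the SMR-based rangelock
		 * this is the step that waits for a grace period
		 * (smr_synchronize()) before the entry may be handed out
		 * again, which is the cost questioned above.
		 */
		tc->td_spare = e;
	} else {
		free(e);		/* zone free in the kernel */
	}
}

int
main(void)
{
	struct thread_cache tc = { NULL };
	struct entry *e;

	e = entry_alloc(&tc);		/* no cached entry yet: allocates */
	entry_free(&tc, e);		/* entry is stashed in the cache */
	e = entry_alloc(&tc);		/* reuses the cached entry */
	free(e);
	return (0);
}

The tradeoff under discussion is exactly the first branch of
entry_free(): caching avoids an allocator round trip on the next
syscall, but in the SMR-based rangelock that branch first requires a
synchronous grace period.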