From nobody Mon Sep 02 09:12:45 2024 X-Original-To: dev-commits-src-all@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4Wy31n5n02z5Mqlk; Mon, 02 Sep 2024 09:12:45 +0000 (UTC) (envelope-from git@FreeBSD.org) Received: from mxrelay.nyi.freebsd.org (mxrelay.nyi.freebsd.org [IPv6:2610:1c1:1:606c::19:3]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "mxrelay.nyi.freebsd.org", Issuer "R11" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4Wy31n4ZPnz4Cbb; Mon, 2 Sep 2024 09:12:45 +0000 (UTC) (envelope-from git@FreeBSD.org) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1725268365; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding; bh=V9IIdhlqY0nHVOWZ58Qqy+fW3ge1/F6ZKsBYYdpJK6g=; b=sbSrcnqbwRwlEtEF3JJzAWDg9RyT9Jh9t8fHSp2ULK03PPQs7+A7dRQ26gEInbF1Bqso4/ XwMFAyMxPIsckVcj4jOsRHMa5+tzx92hcQmqomyLHmeH0Go55iT7KE/VWfCJVDSu+nNbUp XPmzQiFdp1NINbHUk92hXQOIro+F+beuBiETonSHJ6OKFzLPStFInL5NwQfUKhkAFoBTWk hND3NpwjHnBS2oXafPIQaig6fqFgDDsm8QpSPRI7KkwAiPJYzZlEWSw8T7EUBZqLNfS06t t2OBYv2T8J1txZZXDzpFZzJGRsdrA/acbNHcv/G7VDxp5Q7Zo0G5ZPTI3GAD9w== ARC-Seal: i=1; s=dkim; d=freebsd.org; t=1725268365; a=rsa-sha256; cv=none; b=TmO092PxinnX3yfdG7tcAy0IZFgRje0XSKZsmxNTKlgvcOwCL+KyCbCwmD6/H/IavAnDNG xFgcWrQsfCM3pnZZhtBxfpmM1dvXCHgmi8YMo889DqMpqf/Pg2vOVVFIq4m/ENyBCFh7Jl vzWghYdX0h03oO3q0E8XOdl3pmmqtYvoT/aDLSV7Oe5DLmBwT8dbF7sZ2uRMIZKcYlakni fIu6sj2hzrKYqgj41O0EaSRAmJ+RzKF4bMYvq+UJYKf5t/aODJo2wRlmmMLT3GVInElaNh T+3bf6N7OB6Iw3/11u/zi88lNRyiCUc7xXqFKO/WXwFzjwoy6e+ehsvOG19kSA== ARC-Authentication-Results: i=1; mx1.freebsd.org; none ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1725268365; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding; bh=V9IIdhlqY0nHVOWZ58Qqy+fW3ge1/F6ZKsBYYdpJK6g=; b=i7s+Ihou+mWQGDcmANe7W8zYzuporAsWKSTYEQJOP6KIMpCrviX9310tIp+umCenUG5CYs udMJRG8FXZZnekTi6fToIN70W9GQptgQF/rtXVnFqy6m1gL58FRTo2M28dGKVlnXe3z+H5 wkdBJXDv51WPBjmUu0SlkUWJhSgeh/hV2Ctdc/i61tTAG/T8/d1ppWUJP8/cZJvXT2kteP uSQOM9Z7lTVLeE6+9uKCLp6Aktf1KMsww5k6EswT0SHV9Sa4z5CMQ1oL+s0+HPteUu3nC0 8ryoEYtybtXbEEsa9kQ4rrDS3mRj7ot+BxH459ldHe7u3NTOLoPQp9F6JOjPdw== Received: from gitrepo.freebsd.org (gitrepo.freebsd.org [IPv6:2610:1c1:1:6068::e6a:5]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (Client did not present a certificate) by mxrelay.nyi.freebsd.org (Postfix) with ESMTPS id 4Wy31n47mwz16Dm; Mon, 2 Sep 2024 09:12:45 +0000 (UTC) (envelope-from git@FreeBSD.org) Received: from gitrepo.freebsd.org ([127.0.1.44]) by gitrepo.freebsd.org (8.18.1/8.18.1) with ESMTP id 4829CjwN049888; Mon, 2 Sep 2024 09:12:45 GMT (envelope-from git@gitrepo.freebsd.org) Received: (from git@localhost) by gitrepo.freebsd.org (8.18.1/8.18.1/Submit) id 4829CjQf049885; Mon, 2 Sep 2024 09:12:45 GMT (envelope-from git) Date: Mon, 2 Sep 2024 09:12:45 GMT Message-Id: <202409020912.4829CjQf049885@gitrepo.freebsd.org> To: src-committers@FreeBSD.org, dev-commits-src-all@FreeBSD.org, dev-commits-src-branches@FreeBSD.org From: Andrew Turner Subject: git: 7ab139235534 - stable/13 - buf_ring: Keep the full head and tail values List-Id: Commit messages for all branches of the src repository List-Archive: https://lists.freebsd.org/archives/dev-commits-src-all List-Help: List-Post: List-Subscribe: List-Unsubscribe: X-BeenThere: dev-commits-src-all@freebsd.org Sender: owner-dev-commits-src-all@FreeBSD.org MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit X-Git-Committer: andrew X-Git-Repository: src X-Git-Refname: refs/heads/stable/13 X-Git-Reftype: branch X-Git-Commit: 7ab139235534ce9cdcafc5f33f312e3def501b29 Auto-Submitted: auto-generated The branch stable/13 has been updated by andrew: URL: https://cgit.FreeBSD.org/src/commit/?id=7ab139235534ce9cdcafc5f33f312e3def501b29 commit 7ab139235534ce9cdcafc5f33f312e3def501b29 Author: Andrew Turner AuthorDate: 2024-08-19 09:06:44 +0000 Commit: Andrew Turner CommitDate: 2024-09-02 09:11:57 +0000 buf_ring: Keep the full head and tail values If a thread reads the head but then sleeps for long enough that another thread fills the ring and leaves the new head with the expected value then the cmpset can pass when it should have failed. To work around this keep the full head and tail value and use the upper bits as a generation count. Reviewed by: kib Sponsored by: Arm Ltd Differential Revision: https://reviews.freebsd.org/D46151 (cherry picked from commit 3cc603909e09c958e20dd5a8a341f62f29e33a07) --- sys/sys/buf_ring.h | 87 +++++++++++++++++++++++++++++++++++------------------- 1 file changed, 56 insertions(+), 31 deletions(-) diff --git a/sys/sys/buf_ring.h b/sys/sys/buf_ring.h index 39c75dd5b09f..dec0f971ae44 100644 --- a/sys/sys/buf_ring.h +++ b/sys/sys/buf_ring.h @@ -40,6 +40,15 @@ #include #endif +/* + * We only apply the mask to the head and tail values when calculating the + * index into br_ring to access. This means the upper bits can be used as + * epoch to reduce the chance the atomic_cmpset succeedes when it should + * fail, e.g. when the head wraps while the CPU is in an interrupt. This + * is a probablistic fix as there is still a very unlikely chance the + * value wraps back to the expected value. + * + */ struct buf_ring { volatile uint32_t br_prod_head; volatile uint32_t br_prod_tail; @@ -63,28 +72,28 @@ struct buf_ring { static __inline int buf_ring_enqueue(struct buf_ring *br, void *buf) { - uint32_t prod_head, prod_next, cons_tail; -#ifdef DEBUG_BUFRING - int i; + uint32_t prod_head, prod_next, prod_idx; + uint32_t cons_tail, mask; + mask = br->br_prod_mask; +#ifdef DEBUG_BUFRING /* * Note: It is possible to encounter an mbuf that was removed * via drbr_peek(), and then re-added via drbr_putback() and * trigger a spurious panic. */ - for (i = br->br_cons_head; i != br->br_prod_head; - i = ((i + 1) & br->br_cons_mask)) - if(br->br_ring[i] == buf) + for (uint32_t i = br->br_cons_head; i != br->br_prod_head; i++) + if (br->br_ring[i & mask] == buf) panic("buf=%p already enqueue at %d prod=%d cons=%d", buf, i, br->br_prod_tail, br->br_cons_tail); #endif critical_enter(); do { prod_head = br->br_prod_head; - prod_next = (prod_head + 1) & br->br_prod_mask; + prod_next = prod_head + 1; cons_tail = br->br_cons_tail; - if (prod_next == cons_tail) { + if ((int32_t)(cons_tail + br->br_prod_size - prod_next) < 1) { rmb(); if (prod_head == br->br_prod_head && cons_tail == br->br_cons_tail) { @@ -95,11 +104,12 @@ buf_ring_enqueue(struct buf_ring *br, void *buf) continue; } } while (!atomic_cmpset_acq_32(&br->br_prod_head, prod_head, prod_next)); + prod_idx = prod_head & mask; #ifdef DEBUG_BUFRING - if (br->br_ring[prod_head] != NULL) + if (br->br_ring[prod_idx] != NULL) panic("dangling value in enqueue"); #endif - br->br_ring[prod_head] = buf; + br->br_ring[prod_idx] = buf; /* * If there are other enqueues in progress @@ -120,23 +130,26 @@ buf_ring_enqueue(struct buf_ring *br, void *buf) static __inline void * buf_ring_dequeue_mc(struct buf_ring *br) { - uint32_t cons_head, cons_next; + uint32_t cons_head, cons_next, cons_idx; + uint32_t mask; void *buf; critical_enter(); + mask = br->br_cons_mask; do { cons_head = br->br_cons_head; - cons_next = (cons_head + 1) & br->br_cons_mask; + cons_next = cons_head + 1; if (cons_head == br->br_prod_tail) { critical_exit(); return (NULL); } } while (!atomic_cmpset_acq_32(&br->br_cons_head, cons_head, cons_next)); + cons_idx = cons_head & mask; - buf = br->br_ring[cons_head]; + buf = br->br_ring[cons_idx]; #ifdef DEBUG_BUFRING - br->br_ring[cons_head] = NULL; + br->br_ring[cons_idx] = NULL; #endif /* * If there are other dequeues in progress @@ -160,8 +173,8 @@ buf_ring_dequeue_mc(struct buf_ring *br) static __inline void * buf_ring_dequeue_sc(struct buf_ring *br) { - uint32_t cons_head, cons_next; - uint32_t prod_tail; + uint32_t cons_head, cons_next, cons_idx; + uint32_t prod_tail, mask; void *buf; /* @@ -189,6 +202,7 @@ buf_ring_dequeue_sc(struct buf_ring *br) * * <1> Load (on core 1) from br->br_ring[cons_head] can be reordered (speculative readed) by CPU. */ + mask = br->br_cons_mask; #if defined(__arm__) || defined(__aarch64__) cons_head = atomic_load_acq_32(&br->br_cons_head); #else @@ -196,16 +210,17 @@ buf_ring_dequeue_sc(struct buf_ring *br) #endif prod_tail = atomic_load_acq_32(&br->br_prod_tail); - cons_next = (cons_head + 1) & br->br_cons_mask; + cons_next = cons_head + 1; - if (cons_head == prod_tail) + if (cons_head == prod_tail) return (NULL); + cons_idx = cons_head & mask; br->br_cons_head = cons_next; - buf = br->br_ring[cons_head]; + buf = br->br_ring[cons_idx]; #ifdef DEBUG_BUFRING - br->br_ring[cons_head] = NULL; + br->br_ring[cons_idx] = NULL; #ifdef _KERNEL if (!mtx_owned(br->br_lock)) panic("lock not held on single consumer dequeue"); @@ -226,18 +241,21 @@ buf_ring_dequeue_sc(struct buf_ring *br) static __inline void buf_ring_advance_sc(struct buf_ring *br) { - uint32_t cons_head, cons_next; - uint32_t prod_tail; + uint32_t cons_head, cons_next, prod_tail; +#ifdef DEBUG_BUFRING + uint32_t mask; + mask = br->br_cons_mask; +#endif cons_head = br->br_cons_head; prod_tail = br->br_prod_tail; - cons_next = (cons_head + 1) & br->br_cons_mask; - if (cons_head == prod_tail) + cons_next = cons_head + 1; + if (cons_head == prod_tail) return; br->br_cons_head = cons_next; #ifdef DEBUG_BUFRING - br->br_ring[cons_head] = NULL; + br->br_ring[cons_head & mask] = NULL; #endif br->br_cons_tail = cons_next; } @@ -261,9 +279,12 @@ buf_ring_advance_sc(struct buf_ring *br) static __inline void buf_ring_putback_sc(struct buf_ring *br, void *new) { - KASSERT(br->br_cons_head != br->br_prod_tail, + uint32_t mask; + + mask = br->br_cons_mask; + KASSERT((br->br_cons_head & mask) != (br->br_prod_tail & mask), ("Buf-Ring has none in putback")) ; - br->br_ring[br->br_cons_head] = new; + br->br_ring[br->br_cons_head & mask] = new; } /* @@ -274,11 +295,13 @@ buf_ring_putback_sc(struct buf_ring *br, void *new) static __inline void * buf_ring_peek(struct buf_ring *br) { + uint32_t mask; #if defined(DEBUG_BUFRING) && defined(_KERNEL) if ((br->br_lock != NULL) && !mtx_owned(br->br_lock)) panic("lock not held on single consumer dequeue"); #endif + mask = br->br_cons_mask; /* * I believe it is safe to not have a memory barrier * here because we control cons and tail is worst case @@ -288,12 +311,13 @@ buf_ring_peek(struct buf_ring *br) if (br->br_cons_head == br->br_prod_tail) return (NULL); - return (br->br_ring[br->br_cons_head]); + return (br->br_ring[br->br_cons_head & mask]); } static __inline void * buf_ring_peek_clear_sc(struct buf_ring *br) { + uint32_t mask; void *ret; #if defined(DEBUG_BUFRING) && defined(_KERNEL) @@ -301,6 +325,7 @@ buf_ring_peek_clear_sc(struct buf_ring *br) panic("lock not held on single consumer dequeue"); #endif + mask = br->br_cons_mask; if (br->br_cons_head == br->br_prod_tail) return (NULL); @@ -318,13 +343,13 @@ buf_ring_peek_clear_sc(struct buf_ring *br) atomic_thread_fence_acq(); #endif - ret = br->br_ring[br->br_cons_head]; + ret = br->br_ring[br->br_cons_head & mask]; #ifdef DEBUG_BUFRING /* * Single consumer, i.e. cons_head will not move while we are * running, so atomic_swap_ptr() is not necessary here. */ - br->br_ring[br->br_cons_head] = NULL; + br->br_ring[br->br_cons_head & mask] = NULL; #endif return (ret); } @@ -333,7 +358,7 @@ static __inline int buf_ring_full(struct buf_ring *br) { - return (((br->br_prod_head + 1) & br->br_prod_mask) == br->br_cons_tail); + return (br->br_prod_head == br->br_cons_tail + br->br_cons_size - 1); } static __inline int