From nobody Tue Nov 12 18:41:11 2024 X-Original-To: dev-commits-src-main@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4XnwGy3Xy3z5cwPJ; Tue, 12 Nov 2024 18:41:14 +0000 (UTC) (envelope-from cy.schubert@cschubert.com) Received: from omta002.cacentral1.a.cloudfilter.net (omta002.cacentral1.a.cloudfilter.net [3.97.99.33]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client CN "Client", Issuer "CA" (not verified)) by mx1.freebsd.org (Postfix) with ESMTPS id 4XnwGx6QKpz4vZb; Tue, 12 Nov 2024 18:41:13 +0000 (UTC) (envelope-from cy.schubert@cschubert.com) Authentication-Results: mx1.freebsd.org; none Received: from shw-obgw-4003a.ext.cloudfilter.net ([10.228.9.183]) by cmsmtp with ESMTPS id Aq62tmdQqMArNAvpBtub3Z; Tue, 12 Nov 2024 18:41:13 +0000 Received: from spqr.komquats.com ([70.66.152.170]) by cmsmtp with ESMTPSA id Avp9tAMkzE0IVAvpAtjnlu; Tue, 12 Nov 2024 18:41:13 +0000 X-Auth-User: cschuber X-Authority-Analysis: v=2.4 cv=cI9DsUeN c=1 sm=1 tr=0 ts=6733a149 a=y8EK/9tc/U6QY+pUhnbtgQ==:117 a=y8EK/9tc/U6QY+pUhnbtgQ==:17 a=kj9zAlcOel0A:10 a=VlfZXiiP6vEA:10 a=YxBL1-UpAAAA:8 a=6I5d2MoRAAAA:8 a=EkcXrb_YAAAA:8 a=VxmjJ2MpAAAA:8 a=VpgKYua3uFmnTG855akA:9 a=CjuIK1q_8ugA:10 a=Ia-lj3WSrqcvXOmTRaiG:22 a=LK5xJRSDVpKd5WXXoEvA:22 a=7gXAzLPJhVmCkEl4_tsf:22 Received: from slippy.cwsent.com (slippy [10.1.1.91]) by spqr.komquats.com (Postfix) with ESMTP id 31306C72; Tue, 12 Nov 2024 10:41:11 -0800 (PST) Received: by slippy.cwsent.com (Postfix, from userid 1000) id 2CCA53C2; Tue, 12 Nov 2024 10:41:11 -0800 (PST) X-Mailer: exmh version 2.9.0 11/07/2018 with nmh-1.8+dev Reply-to: Cy Schubert From: Cy Schubert X-os: FreeBSD X-Sender: cy@cwsent.com X-URL: http://www.cschubert.com/ To: jlduran@freebsd.org cc: Cy Schubert , John Baldwin , src-committers@freebsd.org, dev-commits-src-all@freebsd.org, dev-commits-src-main@freebsd.org Subject: Re: git: 1fa6daaafd74 - main - ipfilter: Avoid stopping with a lock held In-reply-to: References: <202411120347.4AC3lPOA015167@gitrepo.freebsd.org> <09923ad3-4065-4a31-b35f-74f84e09cff1@FreeBSD.org> <20241112155219.2E15A2DF@slippy.cwsent.com> <20241112160243.B372532F@slippy.cwsent.com> Comments: In-reply-to Jose Luis Duran message dated "Tue, 12 Nov 2024 15:35:55 -0300." List-Id: Commit messages for the main branch of the src repository List-Archive: https://lists.freebsd.org/archives/dev-commits-src-main List-Help: List-Post: List-Subscribe: List-Unsubscribe: X-BeenThere: dev-commits-src-main@freebsd.org Sender: owner-dev-commits-src-main@FreeBSD.org Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Tue, 12 Nov 2024 10:41:11 -0800 Message-Id: <20241112184111.2CCA53C2@slippy.cwsent.com> X-CMAE-Envelope: MS4xfEALqe26UzI6jarQWfOFMS0gQSjyLd5CrgPt8t355PNl3F5MvfQu31jTe487gC1py2ZE/GyCHWt23ZDUSC3xVY3K7MfgjYlpjJr2JDgtPV5LD7ff3znc jXP6K9U50kRzJhhL8rchZ4B+EqyAYRtJTOg/Mw/LBWD8l+ZNrxfX5/hVi37UwjDq6Cq5sexqGAu3Tb5HPZG5h4kwqj0Q8D5jsSbTs6kSriFEK1Q5Ct3w5tbn smeRGgxLeDmVovJyPM0BqYvQbytfXIR0UhirKxhwhQFpQIXO8zdoxWEWFlcxliERcwMWKHSG4lgplybGcqx1vxVetCw4e+z8boGHzaG4rfsFX8l3y+bYResN +LRzrsxt X-Rspamd-Pre-Result: action=no action; module=replies; Message is reply to one we originated X-Spamd-Result: default: False [-4.00 / 15.00]; REPLY(-4.00)[]; ASN(0.00)[asn:16509, ipnet:3.96.0.0/15, country:US] X-Rspamd-Queue-Id: 4XnwGx6QKpz4vZb X-Spamd-Bar: ---- I'm building a test patch on my sandbox machine. The patches in my git repo will, a) remove another #if 0 from 2013, b) revert this commit, c) fix the "stopping with lock held" bug Testing on the sandbox is needed before I submit a review. You can revert this if you want, or I can push my three commits when my testing is complete. -- Cheers, Cy Schubert FreeBSD UNIX: Web: https://FreeBSD.org NTP: Web: https://nwtime.org e^(i*pi)+1=0 In message , Jose Luis Duran writes: > What would be the recommended approach in this case? > Revert, and commit with the fixes, or just fix the problematic code. > > At any rate, I have created https://reviews.freebsd.org/D47530. Feel > free to commandeer it, as my testing is limited to what's currently in > the "Test Plan". I was not able to recreate the deadlock described > here, but it definitely makes sense. > > On Tue, Nov 12, 2024 at 1:02=E2=80=AFPM Cy Schubert com> wrote: > > > > In message <20241112155219.2E15A2DF@slippy.cwsent.com>, Cy Schubert write= > s: > > > In message <09923ad3-4065-4a31-b35f-74f84e09cff1@FreeBSD.org>, John Bal= > dwin > > > wri > > > tes: > > > > On 11/11/24 19:47, Jose Luis Duran wrote: > > > > > The branch main has been updated by jlduran: > > > > > > > > > > URL: https://cgit.FreeBSD.org/src/commit/?id=3D1fa6daaafd74c1a457dc= > fe26e0a5 > > > 94 > > > > 3b5441dc9d > > > > > > > > > > commit 1fa6daaafd74c1a457dcfe26e0a5943b5441dc9d > > > > > Author: Jose Luis Duran > > > > > AuthorDate: 2024-11-02 17:58:59 +0000 > > > > > Commit: Jose Luis Duran > > > > > CommitDate: 2024-11-12 03:46:15 +0000 > > > > > > > > > > ipfilter: Avoid stopping with a lock held > > > > > > > > > > Avoid calling _callout_stop_safe with a non-sleepable lock hel= > d when > > > > > detaching by initializing callout_init_rw() with CALLOUT_SHARE= > DLOCK. > > > > > > > > > > It avoids the following WITNESS warning when stopping the serv= > ice: > > > > > > > > > > # service ipfilter stop > > > > > calling _callout_stop_safe with the following non-sleepabl= > e lock > > > s > > > > held: > > > > > shared rw ipf filter load/unload mutex (ipf filter load/un= > load m > > > ut > > > > ex) r =3D 0 (0xffff0000417c7530) locked @ /usr/src/sys/netpfil/ipfilt= > er/netin > > > et > > > > /fil.c:7926 > > > > > stack backtrace: > > > > > #0 0xffff00000052d394 at witness_debugger+0x60 > > > > > #1 0xffff00000052e620 at witness_warn+0x404 > > > > > #2 0xffff0000004d4ffc at _callout_stop_safe+0x8c > > > > > #3 0xffff0000f7236674 at ipfdetach+0x3c > > > > > #4 0xffff0000f723fa4c at ipf_ipf_ioctl+0x788 > > > > > #5 0xffff0000f72367e0 at ipfioctl+0x144 > > > > > #6 0xffff00000034abd8 at devfs_ioctl+0x100 > > > > > #7 0xffff0000005c66a0 at vn_ioctl+0xbc > > > > > #8 0xffff00000034b2cc at devfs_ioctl_f+0x24 > > > > > #9 0xffff0000005331ec at kern_ioctl+0x2e0 > > > > > #10 0xffff000000532eb4 at sys_ioctl+0x140 > > > > > #11 0xffff000000880480 at do_el0_sync+0x604 > > > > > #12 0xffff0000008579ac at handle_el0_sync+0x4c > > > > > > > > > > PR: 282478 > > > > > Suggested by: markj > > > > > Reviewed by: cy > > > > > Approved by: emaste (mentor) > > > > > MFC after: 1 week > > > > > --- > > > > > sys/netpfil/ipfilter/netinet/ip_fil_freebsd.c | 4 ++-- > > > > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > > > > > > > diff --git a/sys/netpfil/ipfilter/netinet/ip_fil_freebsd.c b/sys/ne= > tpfil/ > > > ip > > > > filter/netinet/ip_fil_freebsd.c > > > > > index bcde0d2c7323..b3dea40c3d8c 100644 > > > > > --- a/sys/netpfil/ipfilter/netinet/ip_fil_freebsd.c > > > > > +++ b/sys/netpfil/ipfilter/netinet/ip_fil_freebsd.c > > > > > @@ -181,7 +181,7 @@ ipf_timer_func(void *arg) > > > > > #if 0 > > > > > softc->ipf_slow_ch =3D timeout(ipf_timer_func, so= > ftc, hz/ > > > 2); > > > > > #endif > > > > > - callout_init(&softc->ipf_slow_ch, 1); > > > > > + callout_init_rw(&softc->ipf_slow_ch, &softc->ipf_global.i= > pf_lk, > > > > CALLOUT_SHAREDLOCK); > > > > > callout_reset(&softc->ipf_slow_ch, > > > > > (hz / IPF_HZ_DIVIDE) * IPF_HZ_MULT, > > > > > ipf_timer_func, softc); > > > > > @@ -221,7 +221,7 @@ ipfattach(ipf_main_softc_t *softc) > > > > > softc->ipf_slow_ch =3D timeout(ipf_timer_func, softc, > > > > > (hz / IPF_HZ_DIVIDE) * IPF_H= > Z_MULT > > > ); > > > > > #endif > > > > > - callout_init(&softc->ipf_slow_ch, 1); > > > > > + callout_init_rw(&softc->ipf_slow_ch, &softc->ipf_global.ipf_lk, C= > ALLOUT > > > > _SHAREDLOCK); > > > > > callout_reset(&softc->ipf_slow_ch, (hz / IPF_HZ_DIVIDE) *= > IPF_H > > > Z_MULT, > > > > > ipf_timer_func, softc); > > > > > return (0); > > > > > > > > But this means the callout is now called with the lock held, and I do= > n't se > > > e > > > > any changes > > > > to ipf_timer_func. Hmm, so you now recurse on the lock? You need to= > take > > > ou > > > > t the locking > > > > in ipf_timer_func I think as it is now spurious. You can also axe th= > e #if > > > 0' > > > > d code around > > > > timeout() while you are at it. > > > > > > Reviewing all the #if 0's in ipfilter, discovering why they are there, = > and > > > either removing them or implementing what Darren had initially intended= > is > > > in my queue. The upstream git log isn't helpful for reasons I preach ab= > out > > > proper commit logs. > > > > > > Given that this one is in ip_fil_freebsd.c (specific to FreeBSD), it ca= > n be > > > removed. Others require more grokking. > > > > It can go. It was #if 0'd by ea3022cbbd3f5, IIRC in discussion with you a= > t > > the time (2013). > > > > ea3022cbbd3f5 converted timeout(9) to callout(9). This code remained as a= > n > > artifact of the conversion. It was sent by you to me following the ipfilt= > er > > upgrade from 4.1.28 to 5.1.2 earlier that year. > > > > > > -- > > Cheers, > > Cy Schubert > > FreeBSD UNIX: Web: https://FreeBSD.org > > NTP: Web: https://nwtime.org > > > > e^(i*pi)+1=3D0 > > > > > > > > > > > > > --=20 > Jose Luis Duran