From nobody Tue Jul 16 20:19:04 2024 X-Original-To: dev-commits-src-main@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4WNr4r2WtVz5Q9SS; Tue, 16 Jul 2024 20:19:08 +0000 (UTC) (envelope-from glebius@freebsd.org) Received: from smtp.freebsd.org (smtp.freebsd.org [IPv6:2610:1c1:1:606c::24b:4]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "smtp.freebsd.org", Issuer "R10" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4WNr4r209cz49X5; Tue, 16 Jul 2024 20:19:08 +0000 (UTC) (envelope-from glebius@freebsd.org) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1721161148; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=kh1a91itd20Zg/y/ZUW2AeWOcBfItErMltGS6TmU2yQ=; b=whAsJzF6FtTZDKelFXWjpHa/Ym+nIfWIvFqpvc5msxIT6gREjj8It/j/T/8Mm5XggQ/8OC 3f8GvUl/IRUc8A2kSsNLYL52uUgod16ZQ5hjEtbI0iLTV5ld+1naeVktm6jg2ivY8UPmnS bZOjtwDa6gniT8BUTn3WQzuYX8sXZZ8/uHi3dV1ibxvSlRw+ZUXhT0T9Y7qq4gTEu+yvlT 540JkWgphBQBPvfLMowGR987Y2K+F265XnWcTBauavTJubsNruUQTbqU6zVcLp90DYSnhM uXrKrGtDDSWBnLu5T55ZISPJu/xQAmZtMw6zEWn4WpyRKQ28RTrqNagq9JYUbA== ARC-Seal: i=1; s=dkim; d=freebsd.org; t=1721161148; a=rsa-sha256; cv=none; b=G7JFT1FIyibSsYlrXvSabUptsbIexXtMTNpOSPVF1V6fbjHkvbRv9bb1br+eMFOP507LbI mdff6Ipmr7qYW0oDzHqaGUqipQg337yj7VTWMv0gGaqiy/m3LtbnU1RzMxpLKEvWwoyKkk LldqT6QteriERttyqyBI/Su9M4mE4fDoVW7smEwGwy0+s8bgUYbS7dx0Mt/93OqeihV0pi 7ALrPjUze2btP1nLIwdX/1dDGh/OfYdWT0WxgPiJnbMXVqmiSPHlAIamrDDFapb2UrlI50 vpr0rqaXuiS672s8Czpo7k/V3jEqxFvqrSOhfz9/Is0K4jSHWw4DCxdixkshvg== ARC-Authentication-Results: i=1; mx1.freebsd.org; none ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=freebsd.org; s=dkim; t=1721161148; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=kh1a91itd20Zg/y/ZUW2AeWOcBfItErMltGS6TmU2yQ=; b=VWmfphwF5lNQiG2U/swQ83XK+5SVXwHI/YMZVc5JCLkYDn/6KHTu8lZNsOQfvkSBn6sX5Q xriGsqS29hSRtpuN/rchmqBVcMQW3ZChJrC59Kctjo3QB7ddiYjWqnOf04kFbYtOGQuPFc ociDnkKyPDX/ww65fE+x8z3qTBUInSGb79sBUIO88rYLf4pKud8tow0vyEZ2Rqo/y6/f8M QhVvKTuMLzP4fZemlLSX5lOFXFYqe6NhP5gsL4+7lp7YkOCdz4/56JhXXemfTwCcEKhg41 2rsCvOFbVOYpOepUWLsZkVow1Y4CneYaG9fBeCpEGur5NIE4hvg/I+F7hmdhmg== Received: from cell.glebi.us (glebi.us [162.251.186.162]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (Client did not present a certificate) (Authenticated sender: glebius) by smtp.freebsd.org (Postfix) with ESMTPSA id 4WNr4q2mn8zjJp; Tue, 16 Jul 2024 20:19:07 +0000 (UTC) (envelope-from glebius@freebsd.org) Date: Tue, 16 Jul 2024 13:19:04 -0700 From: Gleb Smirnoff To: Mateusz Guzik Cc: John Baldwin , Mateusz Guzik , src-committers@freebsd.org, dev-commits-src-all@freebsd.org, dev-commits-src-main@freebsd.org Subject: Re: git: 87ee63bac69d - main - locks: add a runtime check for missing turnstile Message-ID: References: <202407111107.46BB7uSb007077@gitrepo.freebsd.org> <81cfe7ad-cbee-4122-abef-e47ce2b34f05@FreeBSD.org> List-Id: Commit messages for the main branch of the src repository List-Archive: https://lists.freebsd.org/archives/dev-commits-src-main List-Help: List-Post: List-Subscribe: List-Unsubscribe: X-BeenThere: dev-commits-src-main@freebsd.org Sender: owner-dev-commits-src-main@FreeBSD.org MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: Mateusz, On Mon, Jul 15, 2024 at 07:59:42PM +0200, Mateusz Guzik wrote: M> > > locks: add a runtime check for missing turnstile M> > > M> > > There are sometimes bugs which result in the unlock fast path failing, M> > > which in turns causes a not-helpful crash report when dereferencing a M> > > NULL turnstile. Help debugging such cases by pointing out what happened M> > > along with some debug. M> > > M> > > Sponsored by: Rubicon Communications, LLC ("Netgate") I am quite surprised to see such commit from you. You are known to cut a few instruction branches even pessimizing other people setups (yeah, talking about commits like 5091ca26507b), and now you are adding an absolutely useless branch for literally everyone. The fingerprint of the non-INVARIANTS panic for a destroyed or already unlocked mutex/rwlock is well known to anybody who has been hacking on the FreeBSD kernel for the last 20 years. I am very much surprised it is not familiar to you. The sample panic quoted below just immediately rings the bell for me, and giving feedback on this commit for many people, too. Please revert this. Here is what we have at Netflix and I would suggest you to use same practice at Netgate. If there is a panic/issue that we anticipate at our setup and we want extra debugging to be enabled without INVARIANTS, but we don't want to pollute the open source FreeBSD with our own hacks, we provided NASSERT macro in kassert.h, that works exactly same as KASSERT, but is enabled always. "N" stands for Netflix, but it would equally well work for Netgate ;) So, for a KASSERT that we want to be always enabled, we just flip one character K -> N and of course this diff is super easy to resolve during merges. M> Use-after-free, overflow, underflow, bitflip or what have you all can M> fail the fast path. M> M> Once that happens and the kernel crashes with a null pointer deref, M> here is a crash at netgate which prodded this: M> calltrap() at calltrap+0x8/frame 0xfffffe0106720920 M> --- trap 0xc, rip = 0xffffffff80d5ab70, rsp = 0xfffffe01067209f0, rbp M> = 0xfffffe0106720a00 --- M> turnstile_broadcast() at turnstile_broadcast+0x40/frame 0xfffffe0106720a00 M> __rw_wunlock_hard() at __rw_wunlock_hard+0x9e/frame 0xfffffe0106720a30 M> nd6_resolve_slow() at nd6_resolve_slow+0x2d7/frame 0xfffffe0106720aa0 M> nd6_resolve() at nd6_resolve+0x125/frame 0xfffffe0106720b10 M> ether_output() at ether_output+0x4e7/frame 0xfffffe0106720ba0 M> ip_output_send() at ip_output_send+0xdc/frame 0xfffffe0106720be0 M> ip_output() at ip_output+0x1295/frame 0xfffffe0106720ce0 M> ip_forward() at ip_forward+0x3c2/frame 0xfffffe0106720d90 M> ip_input() at ip_input+0x705/frame 0xfffffe0106720df0 M> swi_net() at swi_net+0x138/frame 0xfffffe0106720e60 M> ithread_loop() at ithread_loop+0x257/frame 0xfffffe0106720ef0 M> fork_exit() at fork_exit+0x7f/frame 0xfffffe0106720f30 M> fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0106720f30 M> --- trap 0, rip = 0, rsp = 0, rbp = 0 --- M> M> Neither the register dump nor anything in the backtrace indicate what happened. M> M> Since the kernel is going down anyway, one may as well get some debug from it. -- Gleb Smirnoff