From nobody Fri Dec 01 01:14:21 2023 X-Original-To: freebsd-wireless@mlmmj.nyi.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1]) by mlmmj.nyi.freebsd.org (Postfix) with ESMTP id 4ShFTX5qWQz533pX for ; Fri, 1 Dec 2023 01:14:40 +0000 (UTC) (envelope-from adrian.chadd@gmail.com) Received: from mail-lf1-x12e.google.com (mail-lf1-x12e.google.com [IPv6:2a00:1450:4864:20::12e]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (2048 bits) client-digest SHA256) (Client CN "smtp.gmail.com", Issuer "GTS CA 1D4" (verified OK)) by mx1.freebsd.org (Postfix) with ESMTPS id 4ShFTX480Jz4SyS for ; Fri, 1 Dec 2023 01:14:40 +0000 (UTC) (envelope-from adrian.chadd@gmail.com) Authentication-Results: mx1.freebsd.org; none Received: by mail-lf1-x12e.google.com with SMTP id 2adb3069b0e04-50bc57d81f4so2320229e87.2 for ; Thu, 30 Nov 2023 17:14:40 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1701393276; x=1701998076; darn=freebsd.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=A0vJEaLzZIWbUgIp6qaxp3ddTQ7GmdruHcbAWgqR15w=; b=JmFr8bulvHFg3WDCN4HIlrz/MItE4oh5KbSKayAcv/Wt6m23JgXqp883Jptxs69UZ4 i5jTn3Vcc7Wkd2zLK2VSp3ngFJAbnj1QbB+ZUgAkXRKzDWJbaW1lGJxguroHtTaWCL5l gNQJM9oaLWWLmaO+UcgwS1mV6lrbbpTWXnrmSOj26EPjraJrPMtufaO9mBBRAgxufxnl gDotTYJsvZzTrbSkJXrEnKAGavJQ+nvtIxN4GMevXOYlK026+qQ9wXaRZTz7FpVE5Lgj l2NEi2RYqZ8z0muRapN+7M4V92+P2SZ1OWb5LDfkrn9BS3kgzTQHg2BiachOGAokaYRC hhDQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1701393276; x=1701998076; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=A0vJEaLzZIWbUgIp6qaxp3ddTQ7GmdruHcbAWgqR15w=; b=BGxjRdk3/Qe3MDaiDRBfK/kNuBeNCg2yoyQ99+JBKU09BQY2QT/sRo73si25PMSR9/ crgrAHCVVU74OtXOO7xM8kssJHDGsWL6J56cCXN75eySnelFAftKDddhgE5Jd+F6L0+q 170Gv5TIoECJ6ecU2EXyRo8z7lQ3iv7/WCmiYtBcZEx4er1bhTosk+VI+QdhJxFdyUyW 4zd2k4p5sYJVN3cAvu/JANuP97Sl+5qbhIXdsxv68VcotLV7KdCkfHtgw9pMeKm3Lbtf LssXXZRjtboXpIJfMRW288mzc7CT6GBYnwb0DlR8elnsHBYNPT+yHVVYAd45CIss2boK XAnA== X-Gm-Message-State: AOJu0YwzctKeWHVAB9bEa5ymsbhy3FjhJHhyYHaN9zl+mgDQf67SIjFi 7lI5lmLX10YkxilLQxbZWhq+MlZijprjWTFb7U71IPPn X-Google-Smtp-Source: AGHT+IESRvqZWI2rBLZmGOk696HsyWzAvLiPYecfCnUcquvuIwTycV9VdHA16XgIi56WQKWy+EItBerWaWU3WgOf4HM= X-Received: by 2002:ac2:5f62:0:b0:50b:c9ab:a41c with SMTP id c2-20020ac25f62000000b0050bc9aba41cmr147119lfc.56.1701393275797; Thu, 30 Nov 2023 17:14:35 -0800 (PST) List-Id: Discussions List-Archive: https://lists.freebsd.org/archives/freebsd-wireless List-Help: List-Post: List-Subscribe: List-Unsubscribe: Sender: owner-freebsd-wireless@freebsd.org X-BeenThere: freebsd-wireless@freebsd.org MIME-Version: 1.0 References: <629e3534-705a-4dcc-ad16-edba170da251@app.fastmail.com> In-Reply-To: From: Adrian Chadd Date: Thu, 30 Nov 2023 17:14:21 -0800 Message-ID: Subject: Re: Why newstate handler runs IEEE80211_LOCK/UNLOCK? To: Farhan Khan Cc: freebsd-wireless Content-Type: multipart/alternative; boundary="000000000000d3c6fa060b687d04" X-Spamd-Bar: ---- X-Rspamd-Pre-Result: action=no action; module=replies; Message is reply to one we originated X-Spamd-Result: default: False [-4.00 / 15.00]; REPLY(-4.00)[]; TAGGED_FROM(0.00)[]; ASN(0.00)[asn:15169, ipnet:2a00:1450::/32, country:US] X-Rspamd-Queue-Id: 4ShFTX480Jz4SyS --000000000000d3c6fa060b687d04 Content-Type: text/plain; charset="UTF-8" On Thu, 30 Nov 2023 at 08:10, Farhan Khan wrote: > On Thu, Nov 30, 2023, at 1:24 AM, Adrian Chadd wrote: > > On Wed, 29 Nov 2023 at 22:12, Farhan Khan wrote: > >> Hi all, > >> > >> I'm studying the implementations of net80211 and noticed that in all > newstate handlers the code begins by running IEEE80211_UNLOCK(ic) and ends > with IEEE80211_LOCK(ic). I was not clear on why this was, I would have > expected the opposite order. Also, why not just use the softc device-wide > mutex over one for ieee80211com. Overall, I do not seem to understand the > intent of the unlock and am seeking clarification. > > > > That part of the net80211 locking handling is ... unfortunately unfun. > > Without doing that, there'd be lots of lock order inversion issues and > > sleeping whilst lock held issues (since it's a mutex, not a sleep lock.) > > The newstate code in net80211 at least (now?) runs in a taskqueue, so > > whenever something changes state, it isn't happening in a random > > drivers rx/tx/ioctl path. That way newstate transitions are at > > hopefully serialised and not running in overlapping/concurrent threads. > > I'm still a little unclear here. Why does it inverted UNLOCK first? > Wouldn't that mean the state /can/ change until still be a LOCK first? And, > why not just do a softc-wide lock, why IEEE80211's lock function? But then > there is also a driver softc lock, which confuses me. I'm also not > understanding the double lock mechanism. > > (bz@ please correct me if things have changed :-) ) The only thing that TECHNICALLY should be driving state changes are calls to ieee80211_newstate() which will eventually schedule a newstate taskqueue item or a callout. I forget. Thus all of the state changes are serialised by that single newstate. It's one of tehe reasons why there's those occasional "state change lost" messages - there's no queue of state changes. Just the state change call, some currenat and new state, and then that callout. Changing the upcoming state is protected by the IEEE80211_LOCK mutex. But, it does lock, vap->iv_newstate or whichever method it is, then lock because you can't sleep whilst holding a mutex (eg what happens when you do a firmware command send in the intel drivers that completes via an interrupt / another kernel thread.) Ideally the net80211 lock would just be held across the whole thing - and for the driver net80211 was written for - if_ath - that would've been possible! But the moment sam brought in other drivers for firmware chipsets, this model broke. Lots of drivers, wifi and otherwise do this, especially in their receive packet path. It's quite frankly a terrible code pattern. One that I've had to carefully use. :-) > > > However, since drivers do a /lot/ of potentially sleeping work in the > > newstate path - think all the sleeping that goes on when things wait > > for firmware commands to complete - you can't hold a mutex across those. > > This seems relevant but I did not understand. :/ > You can't hold a mutex and sleep. You need to use a sleep lock (man sx, or man sx_lock, I forget.) If you're still having trouble understanding the freebsd locks, mutexes and net epoch stuff then it may be worthwhile for us to post a quick overview of it. ;) -a --000000000000d3c6fa060b687d04 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable


=
On Thu, 30 Nov 2023 at 08:10, Farhan = Khan <farhan@farhan.codes> wrote:
On Thu, Nov 30, 2023, at 1:24 AM, Adrian Chadd wrot= e:
> On Wed, 29 Nov 2023 at 22:12, Farhan Khan <farhan@farhan.codes> = wrote:
>> Hi all,
>>
>> I'm studying the implementations of net80211 and noticed that = in all newstate handlers the code begins by running IEEE80211_UNLOCK(ic) an= d ends with IEEE80211_LOCK(ic). I was not clear on why this was, I would ha= ve expected the opposite order. Also, why not just use the softc device-wid= e mutex over one for ieee80211com. Overall, I do not seem to understand the= intent of the unlock and am seeking clarification.
>
> That part of the net80211 locking handling is ... unfortunately unfun.=
> Without doing that, there'd be lots of lock order inversion issues= and
> sleeping whilst lock held issues (since it's a mutex, not a sleep = lock.)
> The newstate code in net80211 at least (now?) runs in a taskqueue, so =
> whenever something changes state, it isn't happening in a random <= br> > drivers rx/tx/ioctl path. That way newstate transitions are at
> hopefully serialised and not running in overlapping/concurrent threads= .

I'm still a little unclear here. Why does it inverted UNLOCK first? Wou= ldn't that mean the state /can/ change until still be a LOCK first? And= , why not just do a softc-wide lock, why IEEE80211's lock function? But= then there is also a driver softc lock, which confuses me. I'm also no= t understanding the double lock mechanism.


(bz@ please correct me if things have = changed :-) )=C2=A0

The only thing that TECHNICALL= Y should be driving=C2=A0 state changes are calls to ieee80211_newstate() w= hich will eventually schedule a newstate taskqueue item or a callout. I for= get.
Thus all of the state changes are serialised by that single = newstate. It's one of tehe reasons why there's those occasional &qu= ot;state change lost" messages - there's no queue of state changes= . Just the state change call, some currenat and new state, and then that ca= llout.

=C2=A0Changing the upcoming state is protec= ted by the IEEE80211_LOCK mutex.

But, it does lock= , vap->iv_newstate or whichever method it is, then lock because you can&= #39;t sleep whilst holding a mutex (eg what happens when you do a firmware = command send in the intel drivers that completes via an interrupt / another= kernel thread.) Ideally the net80211 lock would just be held across the wh= ole thing - and for the driver net80211 was written for - if_ath - that wou= ld've been possible! But the moment sam brought in other drivers for fi= rmware chipsets, this model broke. Lots of drivers, wifi and otherwise do t= his, especially in their receive packet path. It's quite frankly a terr= ible code pattern. One that I've had to carefully use. :-)

>
> However, since drivers do a /lot/ of potentially sleeping work in the =
> newstate path - think all the sleeping that goes on when things wait <= br> > for firmware commands to complete - you can't hold a mutex across = those.

This seems relevant but I did not understand. :/

<= /div>
You can't hold a mutex and sleep. You need to use a sleep loc= k (man sx, or man sx_lock, I forget.)

If you'r= e still having trouble understanding the freebsd locks, mutexes and net epo= ch stuff then it may be worthwhile for us to post a quick overview of it. ;= )


-a

--000000000000d3c6fa060b687d04--