Re: Why newstate handler runs IEEE80211_LOCK/UNLOCK?

From: Adrian Chadd <adrian.chadd_at_gmail.com>
Date: Fri, 01 Dec 2023 01:14:21 UTC
On Thu, 30 Nov 2023 at 08:10, Farhan Khan <farhan@farhan.codes> wrote:

> On Thu, Nov 30, 2023, at 1:24 AM, Adrian Chadd wrote:
> > On Wed, 29 Nov 2023 at 22:12, Farhan Khan <farhan@farhan.codes> wrote:
> >> Hi all,
> >>
> >> I'm studying the implementations of net80211 and noticed that in all
> newstate handlers the code begins by running IEEE80211_UNLOCK(ic) and ends
> with IEEE80211_LOCK(ic). I was not clear on why this was, I would have
> expected the opposite order. Also, why not just use the softc device-wide
> mutex over one for ieee80211com. Overall, I do not seem to understand the
> intent of the unlock and am seeking clarification.
> >
> > That part of the net80211 locking handling is ... unfortunately unfun.
> > Without doing that, there'd be lots of lock order inversion issues and
> > sleeping whilst lock held issues (since it's a mutex, not a sleep lock.)
> > The newstate code in net80211 at least (now?) runs in a taskqueue, so
> > whenever something changes state, it isn't happening in a random
> > drivers rx/tx/ioctl path. That way newstate transitions are at
> > hopefully serialised and not running in overlapping/concurrent threads.
>
> I'm still a little unclear here. Why does it inverted UNLOCK first?
> Wouldn't that mean the state /can/ change until still be a LOCK first? And,
> why not just do a softc-wide lock, why IEEE80211's lock function? But then
> there is also a driver softc lock, which confuses me. I'm also not
> understanding the double lock mechanism.
>
>
(bz@ please correct me if things have changed :-) )

The only thing that TECHNICALLY should be driving  state changes are calls
to ieee80211_newstate() which will eventually schedule a newstate taskqueue
item or a callout. I forget.
Thus all of the state changes are serialised by that single newstate. It's
one of tehe reasons why there's those occasional "state change lost"
messages - there's no queue of state changes. Just the state change call,
some currenat and new state, and then that callout.

 Changing the upcoming state is protected by the IEEE80211_LOCK mutex.

But, it does lock, vap->iv_newstate or whichever method it is, then lock
because you can't sleep whilst holding a mutex (eg what happens when you do
a firmware command send in the intel drivers that completes via an
interrupt / another kernel thread.) Ideally the net80211 lock would just be
held across the whole thing - and for the driver net80211 was written for -
if_ath - that would've been possible! But the moment sam brought in other
drivers for firmware chipsets, this model broke. Lots of drivers, wifi and
otherwise do this, especially in their receive packet path. It's quite
frankly a terrible code pattern. One that I've had to carefully use. :-)


>
> > However, since drivers do a /lot/ of potentially sleeping work in the
> > newstate path - think all the sleeping that goes on when things wait
> > for firmware commands to complete - you can't hold a mutex across those.
>
> This seems relevant but I did not understand. :/
>

You can't hold a mutex and sleep. You need to use a sleep lock (man sx, or
man sx_lock, I forget.)

If you're still having trouble understanding the freebsd locks, mutexes and
net epoch stuff then it may be worthwhile for us to post a quick overview
of it. ;)


-a