git: e8de31caceaa - main - net80211: Fix traffic hang on STA/AP VAPs on a multi-VAP interface

From: Adrian Chadd <adrian_at_FreeBSD.org>
Date: Fri, 22 Apr 2022 05:49:14 UTC
The branch main has been updated by adrian:

URL: https://cgit.FreeBSD.org/src/commit/?id=e8de31caceaa36caf5d7b4355072f148e2433b82

commit e8de31caceaa36caf5d7b4355072f148e2433b82
Author:     Adrian Chadd <adrian@FreeBSD.org>
AuthorDate: 2022-04-12 20:20:28 +0000
Commit:     Adrian Chadd <adrian@FreeBSD.org>
CommitDate: 2022-04-22 05:49:01 +0000

    net80211: Fix traffic hang on STA/AP VAPs on a multi-VAP interface
    
    This took an embarrasingly long time to find.
    
    The state changes for a radio with a STA /and/ AP VAP gets a bit messy.
    The AP maps are marked as waiting, waiting for the STA AP to find a
    channel to use before the AP VAPs become active.
    
    However, the code path that clears the OACTIVE flag on a VAP only runs
    during a successful run of ieee80211_newstate_cb().
    
    So here is how it goes:
    
    * the STA VAP goes down and needs to scan;
    * the AP vap goes RUN->INIT; but it doesn't YET call ieee80211_newstate_cb();
    * meanwhile - a send on the AP VAP causes the VAP to set the OACTIVE flag here;
    * then the STA VAP finishes scan and goes to RUN;
    * which will call wakeupwaiting() as part of the STA VAP transition to RUN;
    * .. then the AP VAP goes INIT->RUN directly via a call to hostap_newstate
      in wakeupwaiting rather than it being through the deferred path;
    * /then/ the ieee80211_newstate_cb() is called, but it sees the state go
      RUN->RUN;
    * .. which results in the OACTIVE flag never being cleared.
    
    This clears the OACTIVE flag when a VAP transitions RUN->RUN; the
    driver layer or net80211 layer can set it if required in a subsequent
    transmit.
    
    Differential Revision: https://reviews.freebsd.org/D34920
    
    Reviewed by: bz
---
 sys/net80211/ieee80211_proto.c | 47 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 47 insertions(+)

diff --git a/sys/net80211/ieee80211_proto.c b/sys/net80211/ieee80211_proto.c
index 2228983050a2..d2bde99ce79c 100644
--- a/sys/net80211/ieee80211_proto.c
+++ b/sys/net80211/ieee80211_proto.c
@@ -2469,6 +2469,29 @@ wakeupwaiting(struct ieee80211vap *vap0)
 			vap->iv_flags_ext &= ~IEEE80211_FEXT_SCANWAIT;
 			/* NB: sta's cannot go INIT->RUN */
 			/* NB: iv_newstate may drop the lock */
+
+			/*
+			 * This is problematic if the interface has OACTIVE
+			 * set.  Only the deferred ieee80211_newstate_cb()
+			 * will end up actually /clearing/ the OACTIVE
+			 * flag on a state transition to RUN from a non-RUN
+			 * state.
+			 *
+			 * But, we're not actually deferring this callback;
+			 * and when the deferred call occurs it shows up as
+			 * a RUN->RUN transition!  So the flag isn't/wasn't
+			 * cleared!
+			 *
+			 * I'm also not sure if it's correct to actually
+			 * do the transitions here fully through the deferred
+			 * paths either as other things can be invoked as
+			 * part of that state machine.
+			 *
+			 * So just keep this in mind when looking at what
+			 * the markwaiting/wakeupwaiting routines are doing
+			 * and how they invoke vap state changes.
+			 */
+
 			vap->iv_newstate(vap,
 			    vap->iv_opmode == IEEE80211_M_STA ?
 			        IEEE80211_S_SCAN : IEEE80211_S_RUN, 0);
@@ -2543,6 +2566,30 @@ ieee80211_newstate_cb(void *xvap, int npending)
 		goto done;
 	}
 
+	/*
+	 * Handle the case of a RUN->RUN transition occuring when STA + AP
+	 * VAPs occur on the same radio.
+	 *
+	 * The mark and wakeup waiting routines call iv_newstate() directly,
+	 * but they do not end up deferring state changes here.
+	 * Thus, although the VAP newstate method sees a transition
+	 * of RUN->INIT->RUN, the deferred path here only sees a RUN->RUN
+	 * transition.  If OACTIVE is set then it is never cleared.
+	 *
+	 * So, if we're here and the state is RUN, just clear OACTIVE.
+	 * At some point if the markwaiting/wakeupwaiting paths end up
+	 * also invoking the deferred state updates then this will
+	 * be no-op code - and also if OACTIVE is finally retired, it'll
+	 * also be no-op code.
+	 */
+	if (nstate == IEEE80211_S_RUN) {
+		/*
+		 * Unblock the VAP queue; a RUN->RUN state can happen
+		 * on a STA+AP setup on the AP vap.  See wakeupwaiting().
+		 */
+		vap->iv_ifp->if_drv_flags &= ~IFF_DRV_OACTIVE;
+	}
+
 	/* No actual transition, skip post processing */
 	if (ostate == nstate)
 		goto done;