[Bug 271979] bsdinstall(8): iwlwifi(4): system crash when authenticating for Wi-Fi: panic: lkpi_sta_auth_to_scan: lsta 0x... state not NONE: 0, nstate 1 arg 1
Date: Thu, 09 Nov 2023 23:45:29 UTC
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=271979

--- Comment #29 from Bjoern A. Zeeb <bz@FreeBSD.org> ---
(In reply to Cheng Cui from comment #28)

Ok, based on the wlandebug +state and the additional logging from [1] here's
the race in net80211 affecting possibly all drivers:

[1] https://people.freebsd.org/~bz/wireless/20231109-01-net80211-newstate-logging.diff

>>>> ANNOTATED OUTPUT from Comment 28
[https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=271979#c28];
hope we can confirm this with time stamps on.

wlan0: sta_newstate: INIT -> SCAN (0)
Starting dhclient.
wlan0: no link .....

[1] wlan0: ieee80211_new_state_locked:2746: starting state update SCAN -> SCAN (AUTH)
[1] wlan0: ieee80211_new_state_locked: SCAN -> AUTH (arg 192) (nrunning 0 nscanning 0)
[1] wlan0: ieee80211_newstate_cb:2517: running state update SCAN -> AUTH (1)
[1] wlan0: ieee80211_newstate_cb: SCAN -> AUTH arg 192

LinuxKPI running lkpi_sta_scan_to_auth() around here...

wlan0: [f4:69:42:57:3f:0e] station assoc via MLME

ioctl logging, triggering ieee80211_sta_join -> ieee80211_sta_join1 ->
ieee80211_new_state(vap, AUTH, IEEE80211_FC0_SUBTYPE_DEAUTH=192) -> [2]

ieee80211_sta_join() would allocate a new node (ni) and lsta in LinuxKPI.
ieee80211_sta_join1() would then call iv_update_bss() and that would swap
nodes. That explains the previous error Colin saw with the queue not having
the valid node anymore and also explains why we later panic as the state is
not correct anymore.

If the assumption is correct a KASSERT in iv_update_bss() could probably
catch this. I'll post a patch for that as well. I have a big XXX in that
code anyway because of this.
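To make the ordering concrete, here is a minimal user-space sketch of the
suspected interleaving. It is a toy model, not net80211 code: struct toy_vap,
request_state(), callback_begin(), callback_finish() and replay_race() are
hypothetical names invented for illustration. It assumes that the deferred
callback samples the current state when it runs rather than when the request
was made, which is what the "starting state update" and "running state
update" log lines suggest.

```c
#include <assert.h>
#include <stdio.h>

/* Toy model only: the names mirror the log output, not the net80211 API. */
enum state { S_INIT, S_SCAN, S_AUTH };

static const char *
state_name(enum state s)
{
	switch (s) {
	case S_INIT: return "INIT";
	case S_SCAN: return "SCAN";
	case S_AUTH: return "AUTH";
	}
	return "?";
}

struct toy_vap {
	enum state cur;		/* stands in for iv_state */
	enum state pending;	/* target of the queued request */
};

/* A request is made: log and return the state the requester observed. */
static enum state
request_state(struct toy_vap *v, enum state nstate)
{
	v->pending = nstate;
	printf("starting state update %s -> %s\n",
	    state_name(v->cur), state_name(nstate));
	return v->cur;
}

/* The deferred callback starts: the old state is sampled only now. */
static enum state
callback_begin(struct toy_vap *v, enum state *nstate)
{
	*nstate = v->pending;
	printf("running state update %s -> %s\n",
	    state_name(v->cur), state_name(*nstate));
	return v->cur;
}

/* The driver handler has returned; the state is committed. */
static void
callback_finish(struct toy_vap *v, enum state nstate)
{
	v->cur = nstate;
}

/*
 * Replay the suspected ordering. Returns 1 if request [2] ran from a
 * different old state than the one it observed when it was made.
 */
static int
replay_race(enum state *requested_from, enum state *ran_from)
{
	struct toy_vap v = { S_SCAN, S_SCAN };
	enum state n1, n2;

	request_state(&v, S_AUTH);			/* [1] queued */
	callback_begin(&v, &n1);			/* [1] running, driver busy */
	*requested_from = request_state(&v, S_AUTH);	/* [2] arrives, still SCAN */
	callback_finish(&v, n1);			/* [1] done, state is AUTH */
	*ran_from = callback_begin(&v, &n2);		/* [2] runs as AUTH -> AUTH */
	callback_finish(&v, n2);
	return (*requested_from != *ran_from);
}
```

Calling replay_race() from a small main() shows request [2] being made as
SCAN -> AUTH but executed as AUTH -> AUTH. The model deliberately leaves out
the node swap in iv_update_bss() and all locking.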
[2] wlan0: ieee80211_new_state_locked:2731: pending SCAN -> AUTH (now to AUTH) transition lost
[2] wlan0: ieee80211_new_state_locked:2746: starting state update SCAN -> AUTH (AUTH)
[2] wlan0: ieee80211_new_state_locked: SCAN -> AUTH (arg 192) (nrunning 0 nscanning 0)

LinuxKPI calls into the original handler for [1] which means
lkpi_sta_scan_to_auth() is done:

[1] wlan0: sta_newstate: SCAN -> AUTH (192)

here iv_state gets updated from SCAN to AUTH,

[2] wlan0: ieee80211_newstate_cb:2517: running state update AUTH -> AUTH (1)
[2] wlan0: ieee80211_newstate_cb: AUTH -> AUTH arg 192
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The [2] SCAN -> AUTH turned into AUTH -> AUTH; a further (LinuxKPI runs
lkpi_sta_a_to_a() possibly mgmt protection problem given our sta_to_auth has
not finished yet -- if we had a reply and moved to assoc; cc@ to test, which
would explain the PR in the next line):

Invalid TXQ id
iwl_mvm_tx_mpdu:1204: fc 0x00b0 tid 8 txq_id 65535 mvm 0xfffffe00b1250408 skb 0xfffff80007865800 { len 30 } info 0xfffffe00745dcce8 sta 0xfffff80005760880 (if you see this please report to PR 274382)
wlan0: ni 0xfffffe00b15bf000 vap 0xfffffe00b12e0010 mode STA state AUTH m 0xfffff800078b1b00 status 4543576
wlan0: ni 0xfffffe00b15bf000 mode STA state AUTH arg 0x2 status 4543576

[2] wlan0: sta_newstate: AUTH -> AUTH (192)

should call sta_authretry(, with 192 >> 8 == 0 == IEEE80211_STATUS_SUCCESS)
-> Sends another b0 (authentication).

wlan0: ni 0xfffffe00b15bf000 vap 0xfffffe00b12e0010 mode STA state AUTH m 0xfffff8000773cb00 status 1

-- 
You are receiving this mail because:
You are on the CC list for the bug.