ieee80211_handle_wake_tx_queue and dynamic ps regression

From: Bryan O'Donoghue
Date: Tue Jan 10 2023 - 07:35:42 EST


commit a790cc3a4fad75048295571a350b95b87e022a5a (wake_tx_queue-broken-23-08-01)
Author: Alexander Wetzel <alexander@xxxxxxxxxxxxxx>
Date: Sun Oct 9 18:30:39 2022 +0200

wifi: mac80211: add wake_tx_queue callback to drivers

is causing a regression with the following combination:

- CONF_PS = 1
- CONF_DYNAMIC_PS = 0
- driver using ieee80211_handle_wake_tx_queue as its wake_tx_queue callback

In this case we get stuck in a loop similar to this:

// IEEE80211_CONF_CHANGE_PS
[ 17.255480] wcn36xx: wcn36xx_change_ps/312 enable
[ 18.088835] ieee80211_tx_h_dynamic_ps/263 setting IEEE80211_QUEUE_STOP_REASON_PS
[ 18.088906] ieee80211_handle_wake_tx_queue/334 entry
[ 18.091505] ieee80211_dynamic_ps_disable_work/2250 calling ieee80211_hw_config()
[ 18.095370] ieee80211_handle_wake_tx_queue/338 wake_tx_push_queue

// IEEE80211_CONF_CHANGE_PS
[ 18.102625] wcn36xx: wcn36xx_change_ps/312 disable
[ 18.107643] wake_tx_push_queue/303 entry

// txq is stopped here reason == IEEE80211_QUEUE_STOP_REASON_PS
[ 18.107654] wake_tx_push_queue/311 q_stopped bitmask 0x00000002 IEEE80211_QUEUE_STOP_REASON_PS true
[ 18.107661] wake_tx_push_queue/324 exit
[ 18.107667] ieee80211_handle_wake_tx_queue/342 exit
[ 18.115560] ieee80211_handle_wake_tx_queue/334 entry
[ 18.139937] ieee80211_handle_wake_tx_queue/338 wake_tx_push_queue
[ 18.145163] wake_tx_push_queue/303 entry
[ 18.150016] ieee80211_dynamic_ps_disable_work/2252 completed ieee80211_hw_config()

// now we unset IEEE80211_QUEUE_STOP_REASON_PS but too late
[ 18.151145] wake_tx_push_queue/311 q_stopped bitmask 0x00000002 IEEE80211_QUEUE_STOP_REASON_PS true
[ 18.155263] ieee80211_dynamic_ps_disable_work/2254 clearing IEEE80211_QUEUE_STOP_REASON_PS
[ 18.162531] wake_tx_push_queue/324 exit
[ 18.162548] ieee80211_handle_wake_tx_queue/342 exit
[ 18.183639] ieee80211_dynamic_ps_disable_work/2259 cleared IEEE80211_QUEUE_STOP_REASON_PS

// IEEE80211_CONF_CHANGE_PS runs again
[ 18.215487] wcn36xx: wcn36xx_change_ps/312 enable

We get stuck in that loop. Packets are only rarely transmitted; most are dropped.
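
To make the ordering in the log easier to follow, here is a toy userspace model of it. This is not mac80211 code: every toy_* name is invented for illustration, and the only detail taken from the log is that the PS stop reason shows up as bit 1 (0x00000002) in the q_stopped bitmask. It assumes the queue has no other stop reason set.

```c
#include <assert.h>

/* Hypothetical stand-in for the PS stop reason; the log prints it as 0x2. */
enum toy_stop_reason { TOY_STOP_REASON_PS = 1 };

struct toy_queue {
	unsigned long stop_reasons;	/* one bit per active stop reason */
	int pending;			/* frames waiting in the txq */
	int sent;			/* frames handed to the driver */
};

static void toy_stop(struct toy_queue *q, enum toy_stop_reason r)
{
	q->stop_reasons |= 1UL << r;
}

static void toy_clear(struct toy_queue *q, enum toy_stop_reason r)
{
	q->stop_reasons &= ~(1UL << r);
}

/* Push all pending frames unless a stop reason is set; returns frames pushed. */
static int toy_push(struct toy_queue *q)
{
	int n;

	if (q->stop_reasons)
		return 0;	/* queue stopped: exit without sending */

	n = q->pending;
	q->sent += n;
	q->pending = 0;
	return n;
}

/*
 * Replays the order of events seen between 18.088835 and 18.183639:
 * the PS stop reason is set, two push attempts run while it is still
 * set, and only then is the reason cleared.
 */
static int toy_replay(struct toy_queue *q)
{
	int pushed = 0;

	toy_stop(q, TOY_STOP_REASON_PS);
	assert(q->stop_reasons == 0x2UL);	/* matches the logged bitmask */

	pushed += toy_push(q);			/* first wake_tx_queue pass */
	pushed += toy_push(q);			/* second pass, still stopped */

	toy_clear(q, TOY_STOP_REASON_PS);	/* cleared, but too late */
	return pushed;
}
```

Starting from a queue with three pending frames, toy_replay() pushes none of them. In this model they only go out if something calls toy_push() again after the clear, and nothing in the replayed sequence does.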

I tried this as a fix:

--- a/net/mac80211/mlme.c
+++ b/net/mac80211/mlme.c
@@ -2245,15 +2245,15 @@ void ieee80211_dynamic_ps_disable_work(struct work_struct *work)
 		container_of(work, struct ieee80211_local,
 			     dynamic_ps_disable_work);
 
-	if (local->hw.conf.flags & IEEE80211_CONF_PS) {
-		local->hw.conf.flags &= ~IEEE80211_CONF_PS;
-		ieee80211_hw_config(local, IEEE80211_CONF_CHANGE_PS);
-	}
-
 	ieee80211_wake_queues_by_reason(&local->hw,
 					IEEE80211_MAX_QUEUE_MAP,
 					IEEE80211_QUEUE_STOP_REASON_PS,
 					false);
+
+	if (local->hw.conf.flags & IEEE80211_CONF_PS) {
+		local->hw.conf.flags &= ~IEEE80211_CONF_PS;
+		ieee80211_hw_config(local, IEEE80211_CONF_CHANGE_PS);
+	}
 }

but it only slightly improves the situation; the fundamental race condition is still there.

I suggest reverting the commit above and trying again.

---
bod