Re: ieee80211_handle_wake_tx_queue and dynamic ps regression

From: Bryan O'Donoghue
Date: Tue Jan 10 2023 - 07:45:00 EST


+ linux-wireless
On 10/01/2023 12:35, Bryan O'Donoghue wrote:
commit a790cc3a4fad75048295571a350b95b87e022a5a (wake_tx_queue-broken-23-08-01)
Author: Alexander Wetzel <alexander@xxxxxxxxxxxxxx>
Date:   Sun Oct 9 18:30:39 2022 +0200

    wifi: mac80211: add wake_tx_queue callback to drivers

is causing a regression with

- CONF_PS = 1
- CONF_DYNAMIC_PS = 0
- ieee80211_handle_wake_tx_queue

In this case we get stuck in a loop similar to this:

// IEEE80211_CONF_CHANGE_PS
[   17.255480] wcn36xx: wcn36xx_change_ps/312 enable
[   18.088835] ieee80211_tx_h_dynamic_ps/263 setting IEEE80211_QUEUE_STOP_REASON_PS
[   18.088906] ieee80211_handle_wake_tx_queue/334 entry
[   18.091505] ieee80211_dynamic_ps_disable_work/2250 calling ieee80211_hw_config()
[   18.095370] ieee80211_handle_wake_tx_queue/338 wake_tx_push_queue

// IEEE80211_CONF_CHANGE_PS
[   18.102625] wcn36xx: wcn36xx_change_ps/312 disable
[   18.107643] wake_tx_push_queue/303 entry

// txq is stopped here, reason == IEEE80211_QUEUE_STOP_REASON_PS
[   18.107654] wake_tx_push_queue/311 q_stopped bitmask 0x00000002 IEEE80211_QUEUE_STOP_REASON_PS true
[   18.107661] wake_tx_push_queue/324 exit
[   18.107667] ieee80211_handle_wake_tx_queue/342 exit
[   18.115560] ieee80211_handle_wake_tx_queue/334 entry
[   18.139937] ieee80211_handle_wake_tx_queue/338 wake_tx_push_queue
[   18.145163] wake_tx_push_queue/303 entry
[   18.150016] ieee80211_dynamic_ps_disable_work/2252 completed ieee80211_hw_config()

// now we unset IEEE80211_QUEUE_STOP_REASON_PS but too late
[   18.151145] wake_tx_push_queue/311 q_stopped bitmask 0x00000002 IEEE80211_QUEUE_STOP_REASON_PS true
[   18.155263] ieee80211_dynamic_ps_disable_work/2254 clearing IEEE80211_QUEUE_STOP_REASON_PS
[   18.162531] wake_tx_push_queue/324 exit
[   18.162548] ieee80211_handle_wake_tx_queue/342 exit
[   18.183639] ieee80211_dynamic_ps_disable_work/2259 cleared IEEE80211_QUEUE_STOP_REASON_PS

// IEEE80211_CONF_CHANGE_PS runs again
[   18.215487] wcn36xx: wcn36xx_change_ps/312 enable

We get stuck in that loop. A packet actually getting transmitted is a rare event; most are dropped.

I tried this as a fix:

--- a/net/mac80211/mlme.c
+++ b/net/mac80211/mlme.c
@@ -2245,15 +2245,15 @@ void ieee80211_dynamic_ps_disable_work(struct work_struct *work)
                container_of(work, struct ieee80211_local,
                             dynamic_ps_disable_work);

-       if (local->hw.conf.flags & IEEE80211_CONF_PS) {
-               local->hw.conf.flags &= ~IEEE80211_CONF_PS;
-               ieee80211_hw_config(local, IEEE80211_CONF_CHANGE_PS);
-       }
-
        ieee80211_wake_queues_by_reason(&local->hw,
                                        IEEE80211_MAX_QUEUE_MAP,
                                        IEEE80211_QUEUE_STOP_REASON_PS,
                                        false);
+
+       if (local->hw.conf.flags & IEEE80211_CONF_PS) {
+               local->hw.conf.flags &= ~IEEE80211_CONF_PS;
+               ieee80211_hw_config(local, IEEE80211_CONF_CHANGE_PS);
+       }
 }

but it only "slightly improves" the situation; the fundamental race condition is still there.

I suggest reverting a790cc3a4fad and trying again.

---
bod