[PATCH net v6 3/3] net: sched: fix tx action reschedule issue with stopped queue

From: Yunsheng Lin
Date: Sun May 09 2021 - 21:42:52 EST


The netdev qeueue might be stopped when byte queue limit has
reached or tx hw ring is full, net_tx_action() may still be
rescheduled endlessly if STATE_MISSED is set, which consumes
a lot of cpu without dequeuing and transmiting any skb because
the netdev queue is stopped, see qdisc_run_end().

This patch fixes it by checking the netdev queue state before
calling qdisc_run() and clearing STATE_MISSED if netdev queue is
stopped during qdisc_run(), the net_tx_action() is recheduled
again when netdev qeueue is restarted, see netif_tx_wake_queue().

Fixes: 6b3ba9146fe6 ("net: sched: allow qdiscs to handle locking")
Reported-by: Michal Kubecek <mkubecek@xxxxxxx>
Signed-off-by: Yunsheng Lin <linyunsheng@xxxxxxxxxx>
---
V6: Drop NET_XMIT_DROP checking for it is not really relevant
to this patch, and it may cause performance performance
regression with multi pktgen threads on dummy netdev with
pfifo_fast qdisc case.
---
net/core/dev.c | 3 ++-
net/sched/sch_generic.c | 8 +++++++-
2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index d596cd7..ef8cf76 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3853,7 +3853,8 @@ static inline int __dev_xmit_skb(struct sk_buff *skb, struct Qdisc *q,

if (q->flags & TCQ_F_NOLOCK) {
rc = q->enqueue(skb, q, &to_free) & NET_XMIT_MASK;
- qdisc_run(q);
+ if (likely(!netif_xmit_frozen_or_stopped(txq)))
+ qdisc_run(q);

if (unlikely(to_free))
kfree_skb_list(to_free);
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index d86c4cc..37403c1 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -74,6 +74,7 @@ static inline struct sk_buff *__skb_dequeue_bad_txq(struct Qdisc *q)
}
} else {
skb = SKB_XOFF_MAGIC;
+ clear_bit(__QDISC_STATE_MISSED, &q->state);
}
}

@@ -242,6 +243,7 @@ static struct sk_buff *dequeue_skb(struct Qdisc *q, bool *validate,
}
} else {
skb = NULL;
+ clear_bit(__QDISC_STATE_MISSED, &q->state);
}
if (lock)
spin_unlock(lock);
@@ -251,8 +253,10 @@ static struct sk_buff *dequeue_skb(struct Qdisc *q, bool *validate,
*validate = true;

if ((q->flags & TCQ_F_ONETXQUEUE) &&
- netif_xmit_frozen_or_stopped(txq))
+ netif_xmit_frozen_or_stopped(txq)) {
+ clear_bit(__QDISC_STATE_MISSED, &q->state);
return skb;
+ }

skb = qdisc_dequeue_skb_bad_txq(q);
if (unlikely(skb)) {
@@ -311,6 +315,8 @@ bool sch_direct_xmit(struct sk_buff *skb, struct Qdisc *q,
HARD_TX_LOCK(dev, txq, smp_processor_id());
if (!netif_xmit_frozen_or_stopped(txq))
skb = dev_hard_start_xmit(skb, dev, txq, &ret);
+ else
+ clear_bit(__QDISC_STATE_MISSED, &q->state);

HARD_TX_UNLOCK(dev, txq);
} else {
--
2.7.4