Re: [PATCH net] net/sched: fix netdevice reference leaks in attach_one_default_qdisc()

From: wanghai (M)
Date: Thu Aug 25 2022 - 08:29:31 EST



在 2022/8/19 23:58, wanghai (M) 写道:

在 2022/8/19 1:56, Jakub Kicinski 写道:
On Wed, 17 Aug 2022 18:46:46 +0800 Wang Hai wrote:
In attach_default_qdiscs(), when attach default qdisc (fq_codel) fails
and fallback to noqueue, if the original attached qdisc is not released
and a new one is directly attached, this will cause netdevice reference
leaks.
Could you provide more details on the failure path? My preference would
be to try to clean up properly there, if possible.
Hi Jakub.

Here are the details of the failure. Do I need to do cleanup under the failed path?

If a dev has multiple queues and queue 0 fails to attach qdisc
because there is no memory in attach_one_default_qdisc(). Then
dev->qdisc will be noop_qdisc by default. But the other queues
may be able to successfully attach to default qdisc.

In this case, the fallback to noqueue process will be triggered

static void attach_default_qdiscs(struct net_device *dev)
{
    ...
    if (!netif_is_multiqueue(dev) ||
        dev->priv_flags & IFF_NO_QUEUE) {
            ...
            netdev_for_each_tx_queue(dev, attach_one_default_qdisc, NULL); // queue 0 attach failed because -ENOBUFS, but the other queues attach successfully
            qdisc = txq->qdisc_sleeping;
            rcu_assign_pointer(dev->qdisc, qdisc); // dev->qdisc = &noop_qdisc
            ...
    }
    ...
    if (qdisc == &noop_qdisc) {
        ...
        netdev_for_each_tx_queue(dev, attach_one_default_qdisc, NULL); // Re-attach, but not release the previously created qdisc
        ...
    }
}

Hi Jakub.
Do you have any other suggestions for this patch? Any replies would be appreciated.

The following is the bug log:

veth0: default qdisc (fq_codel) fail, fallback to noqueue
unregister_netdevice: waiting for veth0 to become free. Usage count = 32
leaked reference.
  qdisc_alloc+0x12e/0x210
  qdisc_create_dflt+0x62/0x140
  attach_one_default_qdisc.constprop.41+0x44/0x70
  dev_activate+0x128/0x290
  __dev_open+0x12a/0x190
  __dev_change_flags+0x1a2/0x1f0
  dev_change_flags+0x23/0x60
  do_setlink+0x332/0x1150
  __rtnl_newlink+0x52f/0x8e0
  rtnl_newlink+0x43/0x70
  rtnetlink_rcv_msg+0x140/0x3b0
  netlink_rcv_skb+0x50/0x100
  netlink_unicast+0x1bb/0x290
  netlink_sendmsg+0x37c/0x4e0
  sock_sendmsg+0x5f/0x70
  ____sys_sendmsg+0x208/0x280

In attach_one_default_qdisc(), release the old one before attaching
a new qdisc to fix this bug.

Fixes: bf6dba76d278 ("net: sched: fallback to qdisc noqueue if default qdisc setup fail")
Signed-off-by: Wang Hai <wanghai38@xxxxxxxxxx>
---
  net/sched/sch_generic.c | 5 +++++
  1 file changed, 5 insertions(+)

diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index d47b9689eba6..87b61ef14497 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -1140,6 +1140,11 @@ static void attach_one_default_qdisc(struct net_device *dev,
        if (!netif_is_multiqueue(dev))
          qdisc->flags |= TCQ_F_ONETXQUEUE | TCQ_F_NOPARENT;
+
+    if (dev_queue->qdisc_sleeping &&
+        dev_queue->qdisc_sleeping != &noop_qdisc)
+        qdisc_put(dev_queue->qdisc_sleeping);
+
      dev_queue->qdisc_sleeping = qdisc;
  }
.

--
Wang Hai