Re: [PATCH net] net/sched: fix netdevice reference leaks in attach_one_default_qdisc()

From: wanghai (M)
Date: Fri Aug 19 2022 - 12:19:44 EST



在 2022/8/19 1:56, Jakub Kicinski 写道:
On Wed, 17 Aug 2022 18:46:46 +0800 Wang Hai wrote:
In attach_default_qdiscs(), when attach default qdisc (fq_codel) fails
and fallback to noqueue, if the original attached qdisc is not released
and a new one is directly attached, this will cause netdevice reference
leaks.
Could you provide more details on the failure path? My preference would
be to try to clean up properly there, if possible.
Hi Jakub.

Here are the details of the failure. Do I need to do cleanup under the failed path?

If a dev has multiple queues and queue 0 fails to attach qdisc
because there is no memory in attach_one_default_qdisc(). Then
dev->qdisc will be noop_qdisc by default. But the other queues
may be able to successfully attach to default qdisc.

In this case, the fallback to noqueue process will be triggered

static void attach_default_qdiscs(struct net_device *dev)
{
    ...
    if (!netif_is_multiqueue(dev) ||
        dev->priv_flags & IFF_NO_QUEUE) {
            ...
            netdev_for_each_tx_queue(dev, attach_one_default_qdisc, NULL); // queue 0 attach failed because -ENOBUFS, but the other queues attach successfully
            qdisc = txq->qdisc_sleeping;
            rcu_assign_pointer(dev->qdisc, qdisc); // dev->qdisc = &noop_qdisc
            ...
    }
    ...
    if (qdisc == &noop_qdisc) {
        ...
        netdev_for_each_tx_queue(dev, attach_one_default_qdisc, NULL); // Re-attach, but not release the previously created qdisc
        ...
    }
}

The following is the bug log:

veth0: default qdisc (fq_codel) fail, fallback to noqueue
unregister_netdevice: waiting for veth0 to become free. Usage count = 32
leaked reference.
qdisc_alloc+0x12e/0x210
qdisc_create_dflt+0x62/0x140
attach_one_default_qdisc.constprop.41+0x44/0x70
dev_activate+0x128/0x290
__dev_open+0x12a/0x190
__dev_change_flags+0x1a2/0x1f0
dev_change_flags+0x23/0x60
do_setlink+0x332/0x1150
__rtnl_newlink+0x52f/0x8e0
rtnl_newlink+0x43/0x70
rtnetlink_rcv_msg+0x140/0x3b0
netlink_rcv_skb+0x50/0x100
netlink_unicast+0x1bb/0x290
netlink_sendmsg+0x37c/0x4e0
sock_sendmsg+0x5f/0x70
____sys_sendmsg+0x208/0x280

In attach_one_default_qdisc(), release the old one before attaching
a new qdisc to fix this bug.

Fixes: bf6dba76d278 ("net: sched: fallback to qdisc noqueue if default qdisc setup fail")
Signed-off-by: Wang Hai <wanghai38@xxxxxxxxxx>
---
net/sched/sch_generic.c | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index d47b9689eba6..87b61ef14497 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -1140,6 +1140,11 @@ static void attach_one_default_qdisc(struct net_device *dev,
if (!netif_is_multiqueue(dev))
qdisc->flags |= TCQ_F_ONETXQUEUE | TCQ_F_NOPARENT;
+
+ if (dev_queue->qdisc_sleeping &&
+ dev_queue->qdisc_sleeping != &noop_qdisc)
+ qdisc_put(dev_queue->qdisc_sleeping);
+
dev_queue->qdisc_sleeping = qdisc;
}
.

--
Wang Hai