Re: WARNING: ODEBUG bug in netdev_freemem (2)

From: Dmitry Vyukov
Date: Mon Jun 24 2019 - 08:22:15 EST


On Mon, Jun 24, 2019 at 2:08 PM Eric Dumazet <eric.dumazet@xxxxxxxxx> wrote:
> On 6/24/19 3:54 AM, Dmitry Vyukov wrote:
> > On Mon, Jun 24, 2019 at 11:34 AM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
> >>
> >> On Mon, 24 Jun 2019, syzbot wrote:
> >>
> >>> Hello,
> >>>
> >>> syzbot found the following crash on:
> >>>
> >>> HEAD commit: fd6b99fa Merge branch 'akpm' (patches from Andrew)
> >>> git tree: upstream
> >>> console output: https://syzkaller.appspot.com/x/log.txt?x=144de256a00000
> >>> kernel config: https://syzkaller.appspot.com/x/.config?x=fa9f7e1b6a8bb586
> >>> dashboard link: https://syzkaller.appspot.com/bug?extid=c4521ac872a4ccc3afec
> >>> compiler: gcc (GCC) 9.0.0 20181231 (experimental)
> >>>
> >>> Unfortunately, I don't have any reproducer for this crash yet.
> >>>
> >>> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> >>> Reported-by: syzbot+c4521ac872a4ccc3afec@xxxxxxxxxxxxxxxxxxxxxxxxx
> >>>
> >>> device hsr_slave_0 left promiscuous mode
> >>> team0 (unregistering): Port device team_slave_1 removed
> >>> team0 (unregistering): Port device team_slave_0 removed
> >>> bond0 (unregistering): Releasing backup interface bond_slave_1
> >>> bond0 (unregistering): Releasing backup interface bond_slave_0
> >>> bond0 (unregistering): Released all slaves
> >>> ------------[ cut here ]------------
> >>> ODEBUG: free active (active state 0) object type: timer_list hint:
> >>> delayed_work_timer_fn+0x0/0x90 arch/x86/include/asm/paravirt.h:767
> >>
> >> One of the cleaned up devices has left an active timer which belongs to a
> >> delayed work. That's all I can decode out of that splat. :(
> >
> > Hi Thomas,
> >
> > If ODEBUG would memorize full stack traces for object allocation
> > (using lib/stackdepot.c), it would make this splat actionable, right?
> > I've filed https://bugzilla.kernel.org/show_bug.cgi?id=203969 for this.
> >
>
> Not sure this would help in this case as some netdev are allocated through a generic helper.
>
> The driver specific portion might not show up in the stack trace.
>
> It would be nice here to get the work queue function pointer,
> so that it gives us a clue which driver needs a fix.

I see. But isn't the workqueue callback cleanup_net in this case, and
isn't it already in the stack?

cleanup_net+0x3fb/0x960 net/core/net_namespace.c:553
process_one_work+0x989/0x1790 kernel/workqueue.c:2269
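
To make the suggestion quoted above concrete (report the work function of
the delayed work whose timer is still active, rather than only the generic
delayed_work_timer_fn hint), here is a minimal user-space sketch. It is
not kernel code and not how lib/debugobjects.c works today. Every name in
it (fake_delayed_work, drv_a_poll, drv_b_stats, lookup_name, check_free)
is hypothetical and exists only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical user-space stand-ins; none of these are kernel APIs. */
typedef void (*work_fn_t)(void *data);

struct fake_delayed_work {
	work_fn_t func;    /* driver-specific work callback */
	int timer_active;  /* 1 while the backing timer is still armed */
};

/* Two made-up driver callbacks that share the same generic timer path. */
static void drv_a_poll(void *data)  { (void)data; }
static void drv_b_stats(void *data) { (void)data; }

/* Tiny symbol table standing in for a symbol lookup. */
static const struct {
	work_fn_t fn;
	const char *name;
} symtab[] = {
	{ drv_a_poll,  "drv_a_poll"  },
	{ drv_b_stats, "drv_b_stats" },
};

static const char *lookup_name(work_fn_t fn)
{
	size_t i;

	for (i = 0; i < sizeof(symtab) / sizeof(symtab[0]); i++)
		if (symtab[i].fn == fn)
			return symtab[i].name;
	return "unknown";
}

/*
 * Called when the object is about to be freed. Returns NULL if the timer
 * is idle, otherwise the name of the work callback that owns the timer,
 * which is the clue that would point at the driver needing a fix.
 */
static const char *check_free(const struct fake_delayed_work *w)
{
	assert(w != NULL);
	if (!w->timer_active)
		return NULL;
	return lookup_name(w->func);
}
```

With this, freeing an object whose timer is still armed yields the
driver-specific callback name instead of the shared timer function, while
an idle object reports nothing.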