Re: [PATCH net] bonding: switch bond_miimon_inspect to rtnl lock
From: Stanislav Fomichev
Date: Mon Jun 16 2025 - 19:21:27 EST
On 06/16, Jay Vosburgh wrote:
> Stanislav Fomichev <stfomichev@xxxxxxxxx> wrote:
>
> >Syzkaller reports the following issue:
> >
> > RTNL: assertion failed at ./include/net/netdev_lock.h (72)
> > WARNING: CPU: 0 PID: 1141 at ./include/net/netdev_lock.h:72 netdev_ops_assert_locked include/net/netdev_lock.h:72 [inline]
> > WARNING: CPU: 0 PID: 1141 at ./include/net/netdev_lock.h:72 __linkwatch_sync_dev+0x1ed/0x230 net/core/link_watch.c:279
> >
> > ethtool_op_get_link+0x1d/0x70 net/ethtool/ioctl.c:63
> > bond_check_dev_link+0x3f9/0x710 drivers/net/bonding/bond_main.c:863
> > bond_miimon_inspect drivers/net/bonding/bond_main.c:2745 [inline]
> > bond_mii_monitor+0x3c0/0x2dc0 drivers/net/bonding/bond_main.c:2967
> > process_one_work+0x9cf/0x1b70 kernel/workqueue.c:3238
> > process_scheduled_works kernel/workqueue.c:3321 [inline]
> > worker_thread+0x6c8/0xf10 kernel/workqueue.c:3402
> > kthread+0x3c5/0x780 kernel/kthread.c:464
> > ret_from_fork+0x5d4/0x6f0 arch/x86/kernel/process.c:148
> > ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
> >
> >As discussed in [0], the report is a bit bogus, but it exposes
> >the fact that bond_miimon_inspect might sleep while it's being
> >called under RCU read lock. Convert bond_miimon_inspect callers
> >(bond_mii_monitor) to rtnl lock.
>
> Sorry, I missed the discussion on this last week. This is on
> me; this came up last year, and the correct fix is to remove all of the
> obsolete use_carrier logic in bonding. A round trip on RTNL for every
> miimon pass is not realistic.
>
> I've got the following patch building as we speak; if it doesn't
> blow up, I'll post it for real.
>
> Actually, reading the patch now as I write, I need to tweak the
> option setting logic, it should permit setting use_carrier to "on" or 1,
> but nothing else. I had originally planned to permit setting it to
> anything and ignore the value, but decided later that turning it off
> should fail, as the behavior change implied by "off" won't happen.
That's even better, thanks, will take a look!
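
For anyone following along, the option check described above (accept "on" or 1, fail anything else) could look roughly like the sketch below. This is a standalone userspace illustration under my own assumptions, not the bonding driver's option-handling code; the function name use_carrier_validate is made up for the example, and returning -EINVAL on rejection is an assumption about how the failure would be reported.

```c
#include <errno.h>
#include <string.h>

/*
 * Hypothetical sketch, not taken from drivers/net/bonding.
 * Accept only "on" or "1" for use_carrier. Everything else is
 * rejected, including "off" and "0", because the behavior change
 * implied by "off" would no longer happen.
 */
static int use_carrier_validate(const char *value)
{
	if (value == NULL)
		return -EINVAL;

	if (strcmp(value, "on") == 0 || strcmp(value, "1") == 0)
		return 0;

	return -EINVAL;
}
```

Failing the write, rather than silently accepting and ignoring "off", means existing configurations that still try to disable use_carrier get an explicit error instead of a setting that no longer does anything.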