Re: [PATCH net-next v3 2/7] net: lan966x: Split lan966x_fdb_event_work
From: Horatiu Vultur
Date: Tue Jul 05 2022 - 17:55:35 EST
The 07/02/2022 14:08, Vladimir Oltean wrote:
> On Fri, Jul 01, 2022 at 10:52:22PM +0200, Horatiu Vultur wrote:
> > Split the function lan966x_fdb_event_work. One case for when the
> > orig_dev is a bridge and one case when orig_dev is lan966x port.
> > This is preparation for lag support. There is no functional change.
> >
> > Signed-off-by: Horatiu Vultur <horatiu.vultur@xxxxxxxxxxxxx>
> > ---
>
> > -static void lan966x_fdb_event_work(struct work_struct *work)
> > +void lan966x_fdb_flush_workqueue(struct lan966x *lan966x)
> > +{
> > + flush_workqueue(lan966x->fdb_work);
> > +}
> > +
>
> > diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_switchdev.c b/drivers/net/ethernet/microchip/lan966x/lan966x_switchdev.c
> > index df2bee678559..d9fc6a9a3da1 100644
> > --- a/drivers/net/ethernet/microchip/lan966x/lan966x_switchdev.c
> > +++ b/drivers/net/ethernet/microchip/lan966x/lan966x_switchdev.c
> > @@ -320,9 +320,10 @@ static int lan966x_port_prechangeupper(struct net_device *dev,
> > {
> > struct lan966x_port *port = netdev_priv(dev);
> >
> > - if (netif_is_bridge_master(info->upper_dev) && !info->linking)
> > - switchdev_bridge_port_unoffload(port->dev, port,
> > - NULL, NULL);
> > + if (netif_is_bridge_master(info->upper_dev) && !info->linking) {
> > + switchdev_bridge_port_unoffload(port->dev, port, NULL, NULL);
> > + lan966x_fdb_flush_workqueue(port->lan966x);
> > + }
>
> Very curious as to why you decided to stuff this change in here.
> There was no functional change in v2, now there is. And it's a change
> you might need to come back to later (probably sooner than you'd like),
> since the flushing of the workqueue is susceptible to causing deadlocks
> if done improperly - let's see how you blame a commit that was only
> supposed to move code, in that case ;)
You are right, there is a functional change here, and I forgot to update
the commit message to mention it.
>
> The deadlock that I'm talking about comes from the fact that
> lan966x_port_prechangeupper() runs with rtnl_lock() held. So the code of
> the flushed workqueue item must not hold rtnl_lock(), or any other lock
> that is blocked by the rtnl_lock(). Otherwise, the flushing will wait
> for a workqueue item to complete, that in turn waits to acquire the
> rtnl_lock, which is held by the thread waiting for the workqueue to complete.
>
> Analyzing your code, lan966x_mac_notifiers() takes rtnl_lock().
> That is taken from threaded interrupt context - lan966x_mac_irq_process(),
> but is a sub-lock of spin_lock(&lan966x->mac_lock).
>
> There are 2 problems with that already: rtnl_lock() is a mutex => can
> sleep, but &lan966x->mac_lock is a spin lock => is atomic. You can't
> take rtnl_lock() from atomic context. Lockdep and/or CONFIG_DEBUG_ATOMIC_SLEEP
> will tell you as much.
>
> The second problem is the lock ordering inversion that this causes.
> There exists a threaded IRQ which takes the locks in the order mac_lock
> -> rtnl_lock, and there exists this new fdb_flush_workqueue which takes
> the locks in the order rtnl_lock -> mac_lock. If they run at the same
> time, kaboom. Again, lockdep will tell you as much.
>
> I'm sorry, but you need to solve the existing locking problems with the
> code first.
As I see it, there are 2 'different problems' which both have the same
root cause, the usage of the lan966x->mac_lock:
1. The first is with lan966x_mac_notifiers and lan966x_mac_irq_process,
which is an existing issue on net and needs a separate patch.
2. The second is introduced by flushing the workqueue in this patch.
I am pretty sure I have run with CONFIG_DEBUG_ATOMIC_SLEEP enabled, but
I couldn't see any errors/warnings.
So let me start by fixing the first issue on net.
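One possible shape for that fix, shown only as a userspace analogy and
not as the actual patch: copy the entries out while holding mac_lock,
drop it, and only then take rtnl_lock to notify. In the sketch below,
pthread mutexes stand in for both lan966x->mac_lock and rtnl_lock, and
every name (mac_table, fake_rtnl, notify_one, mac_irq_process) is made
up for illustration rather than taken from the driver.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_EVENTS 8

/* Hypothetical stand-in for the driver state guarded by mac_lock. */
struct mac_table {
	pthread_mutex_t mac_lock;	/* plays the role of the spin lock */
	int pending[MAX_EVENTS];	/* entries found by the "IRQ" scan */
	size_t npending;
};

/* Hypothetical stand-in for rtnl_lock: a lock that may sleep. */
static pthread_mutex_t fake_rtnl = PTHREAD_MUTEX_INITIALIZER;

static int notified[MAX_EVENTS];
static size_t nnotified;

/* Expects fake_rtnl held and mac_lock NOT held by the caller. */
static void notify_one(struct mac_table *t, int entry)
{
	/* trylock succeeds only if nobody holds mac_lock at this point */
	int rc = pthread_mutex_trylock(&t->mac_lock);

	assert(rc == 0);
	(void)rc;
	pthread_mutex_unlock(&t->mac_lock);

	assert(nnotified < MAX_EVENTS);
	notified[nnotified++] = entry;
}

static void mac_irq_process(struct mac_table *t)
{
	int local[MAX_EVENTS];
	size_t i, n;

	/* Step 1: "atomic" section, only copy the entries out. */
	pthread_mutex_lock(&t->mac_lock);
	n = t->npending;
	for (i = 0; i < n; i++)
		local[i] = t->pending[i];
	t->npending = 0;
	pthread_mutex_unlock(&t->mac_lock);

	/* Step 2: sleepable section, mac_lock is no longer held. */
	pthread_mutex_lock(&fake_rtnl);
	for (i = 0; i < n; i++)
		notify_one(t, local[i]);
	pthread_mutex_unlock(&fake_rtnl);
}
```

With this ordering the two locks are never nested, so there is no
mac_lock -> rtnl_lock dependency left for the flush path to invert.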
>
> >
> > return NOTIFY_DONE;
> > }
> > --
> > 2.33.0
> >
--
/Horatiu