Re: [RFC PATCH net v2] net: phy: Fix the issue that netif always links up after resuming

From: Kunihiko Hayashi
Date: Sun Dec 02 2018 - 23:35:21 EST


Hi Florian Heiner,

On Fri, 30 Nov 2018 19:47:37 +0100 <hkallweit1@xxxxxxxxx> wrote:

> On 30.11.2018 18:46, Florian Fainelli wrote:
> >
> >
> > On 11/30/2018 1:25 AM, Kunihiko Hayashi wrote:
> >> Even though the link is down before entering hibernation,
> >> there is an issue that the network interface always links up after resuming
> >> from hibernation.
> >>
> >> The phydev->state is PHY_READY before enabling the network interface, so
> >> the link is down. After resuming from hibernation, the phydev->state is
> >> forcibly set to PHY_UP in mdio_bus_phy_restore(), and the link becomes up.
> >>
> >> This patch adds a new convenient function to check whether the PHY is in
> >> a started state, and expects to solve the issue by changing phydev->state
> >> to PHY_UP and calling phy_start_machine() only when the PHY is started.
> >>
> >> Suggested-by: Heiner Kallweit <hkallweit1@xxxxxxxxx>
> >> Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@xxxxxxxxxxxxx>
> >> ---
> >> drivers/net/phy/phy.c | 2 +-
> >> drivers/net/phy/phy_device.c | 9 +++++----
> >> include/linux/phy.h | 10 ++++++++++
> >> 3 files changed, 16 insertions(+), 5 deletions(-)
> >>
> >> diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
> >> index 1d73ac3..f484d03 100644
> >> --- a/drivers/net/phy/phy.c
> >> +++ b/drivers/net/phy/phy.c
> >> @@ -670,7 +670,7 @@ void phy_stop_machine(struct phy_device *phydev)
> >> cancel_delayed_work_sync(&phydev->state_queue);
> >>
> >> mutex_lock(&phydev->lock);
> >> - if (phydev->state > PHY_UP && phydev->state != PHY_HALTED)
> >> + if (phy_is_started(phydev))
> >> phydev->state = PHY_UP;
> >> mutex_unlock(&phydev->lock);
> >> }
> >> diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
> >> index ab33d17..2c39717 100644
> >> --- a/drivers/net/phy/phy_device.c
> >> +++ b/drivers/net/phy/phy_device.c
> >> @@ -309,10 +309,11 @@ static int mdio_bus_phy_restore(struct device *dev)
> >> return ret;
> >>
> >> /* The PHY needs to renegotiate. */
> >> - phydev->link = 0;
> >> - phydev->state = PHY_UP;
> >> -
> >> - phy_start_machine(phydev);
> >> + if (phy_is_started(phydev)) {
> >> + phydev->link = 0;
> >> + phydev->state = PHY_UP;
> >> + phy_start_machine(phydev);
> >> + }
> >
> > Don't you need some of these steps to be performed under phydev->lock
> > being held? See comment below.
> >
> Yes, locking should be done. The old code sets phydev->state
> w/o holding the lock, I'd says this was wrong.

Indeed. The phydev->state should be set with locking the mutex even here.

And it seems that setting phydev->link and calling phy_start_machine() don't
need to hold the lock. Is it correct?

> >>
> >> return 0;
> >> }
> >> diff --git a/include/linux/phy.h b/include/linux/phy.h
> >> index 3ea87f7..c194b45 100644
> >> --- a/include/linux/phy.h
> >> +++ b/include/linux/phy.h
> >> @@ -898,6 +898,16 @@ static inline bool phy_is_pseudo_fixed_link(struct phy_device *phydev)
> >> }
> >>
> >> /**
> >> + * phy_is_started - Convenience function for testing whether a PHY is in
> >> + * a started state
> >> + * @phydev: the phy_device struct
> >> + */
> >> +static inline bool phy_is_started(struct phy_device *phydev)
> >> +{
> >
> > An assert with the phydev->lock mutex being held here would greatly
> > help, because otherwise this is possibly racy.
> >
> Have a look at __phy_resume() to see what is meant with this comment.

I see. I found that there was a lock detection in this function.
The phy_is_started() should have the same detection, shouldn't it?

Thank you,

> >> + return phydev->state >= PHY_UP && phydev->state != PHY_HALTED;
> >> +}
> >> +
> >> +/**
> >> * phy_write_mmd - Convenience function for writing a register
> >> * on an MMD on a given PHY.
> >> * @phydev: The phy_device struct
> >>
> >

---
Best Regards,
Kunihiko Hayashi