Re: [RFC 03/11] net: phy: refactor c45 phy identification sequence

From: Jeremy Linton
Date: Sun May 24 2020 - 22:38:01 EST


Hi,

On 5/23/20 3:01 PM, Russell King - ARM Linux admin wrote:
On Sat, May 23, 2020 at 09:51:31PM +0200, Andrew Lunn wrote:
static int get_phy_c45_ids(struct mii_bus *bus, int addr, u32 *phy_id,
struct phy_c45_device_ids *c45_ids) {
- int phy_reg;
- int i, reg_addr;
+ int ret;
+ int i;
const int num_ids = ARRAY_SIZE(c45_ids->device_ids);
u32 *devs = &c45_ids->devices_in_package;

I feel a "reverse christmas tree" complaint brewing... yes, the original
code didn't follow it. Maybe a tidy up while touching this code?

At minimum, a patch should not make it worse. ret and i should clearly
be after devs.

static int get_phy_id(struct mii_bus *bus, int addr, u32 *phy_id,
bool is_c45, struct phy_c45_device_ids *c45_ids)
{
- int phy_reg;
+ int ret;
if (is_c45)
return get_phy_c45_ids(bus, addr, phy_id, c45_ids);
- /* Grab the bits from PHYIR1, and put them in the upper half */
- phy_reg = mdiobus_read(bus, addr, MII_PHYSID1);
- if (phy_reg < 0) {
+ ret = _get_phy_id(bus, addr, 0, phy_id, false);
+ if (ret < 0) {
/* returning -ENODEV doesn't stop bus scanning */
- return (phy_reg == -EIO || phy_reg == -ENODEV) ? -ENODEV : -EIO;
+ return (ret == -EIO || ret == -ENODEV) ? -ENODEV : -EIO;

Since ret will only ever be -EIO here, this can only ever return
-ENODEV, which is a functional change in the code (probably unintended.)
Nevertheless, it's likely introducing a bug if the intention is for
some other return from mdiobus_read() to be handled differently.

}
- *phy_id = phy_reg << 16;
-
- /* Grab the bits from PHYIR2, and put them in the lower half */
- phy_reg = mdiobus_read(bus, addr, MII_PHYSID2);
- if (phy_reg < 0)
- return -EIO;

... whereas this one always returns -EIO on any error.

So, I think you have the potential in this patch to introduce a subtle
change of behaviour, which may lead to problems - have you closely
analysed why the code was the way it was, and whether your change of
behaviour is actually valid?

I could be remembering this wrongly, but i think this is to do with
orion_mdio_xsmi_read() returning -ENODEV, not 0xffffffffff, if there
is no device on the bus at the given address. -EIO is fatal to the
scan, everything stops with the assumption the bus is broken. -ENODEV
should not be fatal to the scan.

Maybe orion_mdio_xsmi_read() should be fixed then? Also, maybe
adding return code documentation for mdiobus_read() / mdiobus_write()
would help MDIO driver authors have some consistency in what
errors they are expected to return (does anyone know for certain?)


My understanding at this point (based mostly on the xgmac driver here) is that a read of 0xffffffff means "bus responding correctly, phy failed to respond at this register location", while any negative errno means "something is wrong with the bus". The mdio core then chooses between terminating just the current phy search (-ENODEV) and terminating the entire mdio bus registration (basically everything else).
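To make that reading concrete, here is a rough userspace sketch of the scan policy as I described it above. It is not the phylib code: every mock_* name, the register numbers, and the ID values are made up for illustration, and the only assumptions baked in are the ones stated above (all-ones ID means "nothing at this address", -ENODEV skips the address, any other error aborts the bus).

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of these are kernel APIs. */
#define MOCK_MAX_ADDR 32
#define MOCK_PHYSID1  2
#define MOCK_PHYSID2  3

typedef int (*mock_read_fn)(int addr, int regnum);

/* Read both ID halves; pass bus errors through unchanged. */
static int mock_get_phy_id(mock_read_fn read, int addr, uint32_t *phy_id)
{
	int reg;

	reg = read(addr, MOCK_PHYSID1);
	if (reg < 0)
		return reg;
	*phy_id = (uint32_t)reg << 16;

	reg = read(addr, MOCK_PHYSID2);
	if (reg < 0)
		return reg;
	*phy_id |= (uint32_t)reg;

	/* Bus answered, but nothing drove the lines: no phy here. */
	if ((*phy_id & 0x1fffffff) == 0x1fffffff)
		return -ENODEV;

	return 0;
}

/* -ENODEV ends the search at one address; anything else ends the scan. */
static int mock_bus_scan(mock_read_fn read, int *found)
{
	uint32_t id;
	int addr, ret;

	*found = 0;
	for (addr = 0; addr < MOCK_MAX_ADDR; addr++) {
		ret = mock_get_phy_id(read, addr, &id);
		if (ret == -ENODEV)
			continue;
		if (ret < 0)
			return ret;
		(*found)++;
	}
	return 0;
}

/* One phy at address 1 (arbitrary ID), all-ones everywhere else. */
static int mock_read_one_phy(int addr, int regnum)
{
	if (addr == 1)
		return regnum == MOCK_PHYSID1 ? 0x0141 : 0x0dd0;
	return 0xffff;
}

/* Same phy, but empty addresses report -ENODEV instead of all-ones. */
static int mock_read_enodev(int addr, int regnum)
{
	if (addr == 1)
		return regnum == MOCK_PHYSID1 ? 0x0141 : 0x0dd0;
	return -ENODEV;
}

/* Same phy, but the bus itself fails at address 3. */
static int mock_read_broken(int addr, int regnum)
{
	if (addr == 3)
		return -EIO;
	return mock_read_one_phy(addr, regnum);
}
```

With these mocks, the all-ones bus and the -ENODEV bus both complete the scan and find the one phy, while the -EIO bus stops at address 3 and returns the error. That last case is the one where a driver has to cope with being told its bus registration failed.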

I will see about clarifying the docs. I need to see if that will end up being a bit of a rabbit hole before committing to including that in this set.

Which brings up the problem that at least xgmac_mdio doesn't appear to handle being told "your bus registration failed" without OOPSing the probe routine. I think Calvin is aware of this, and I believe he has some additional xgmac/etc patches on top of this set. Although he pinged me offline the other day to say that apparently all my hunk shuffling broke some of the c45 phy detection I had working earlier in the week.