Re: [RFC net-next 6/8] net: phylink: Configure MAC/PCS when link is up without PHY

From: Russell King - ARM Linux admin
Date: Fri Feb 07 2020 - 06:21:56 EST


On Wed, Feb 05, 2020 at 12:27:33PM +0000, Russell King - ARM Linux admin wrote:
> On Tue, Feb 04, 2020 at 07:32:30PM +0000, Russell King - ARM Linux admin wrote:
> > On Tue, Feb 04, 2020 at 06:43:18PM +0100, Andrew Lunn wrote:
> > > > There, there is one MAC, but there are multiple different PCS - one
> > > > for SGMII and 1000base-X, another for 10G, another for 25G, etc.
> > > > These PCS are accessed via a MDIO adapter embedded in each of the
> > > > MAC hardware blocks.
> > >
> > > Hi Russell
> > >
> > > Marvell mv88e6390X switches are like this is a well. There is a PCS
> > > for SGMII and 1000Base-X, and a second one for 10G. And it dynamically
> > > swaps between them depending on the port mode, the so called cmode.
> > >
> > > So a generic solution is required, and please take your time to build
> > > one.
> >
> > Well, DSA is quite a mixed bag...
> >
> > As far as I can work out, the situation with the CPU and DSA ports is
> > quite hopeless - you've claimed that a change in phylink has broken it,
> > I can't find what that may be. The fact is, phylink has never had any
> > link information for DSA links when no fixed-link property has been
> > specified in DT. As I've already said in a previous email about this,
> > I can't see *any* sane way to fix that - but there was no response.
> >
> >
> > On a more positive note...
> >
> > The mac_link_up() changes that I've talked about should work for DSA,
> > if only there was a reasonable way to reconfigure the ports. If you
> > look at the "phy" branch, you will notice that there's a patch there -
> > "net: mv88e6xxx: use resolved link config in mac_link_up()" which adds
> > the support to configure the MAC manually. It's rather messy, and I
> > see no way to deal with the pause settings. There is support in some
> > Marvell DSA switches to force flow control but that's not supported
> > through the current mid-layer at all (port_set_pause doesn't do it.)
> > I'm not sure whether the "mv88e6xxx_phy_is_internal()" check there is
> > the right test for every DSA switch correct either.
> >
> > What is missing is reading the results from the PCS (aka serdes) and
> > forwarding them into phylink - I did have a quick look at how that might
> > be possible, but the DSA code structure (consisting of multiple
> > mid-layers) makes it hard without rewriting quite a lot of code. That's
> > fine if you know all the DSA chips inside out, but I don't - and that's
> > where we need someone who has the knowledge of all DSA switches that we
> > support. Or, we get rid of the multiple mid-layers and switch to a
> > library approach, so that we can modify support for one DSA switch
> > without affecting everything. It may be a simple matter of dropping the
> > existing serdes workaround, but I'm not sure at the moment.
> >
> > I've tried this code out on the ZII rev B, I haven't tried it on the rev
> > C which has the 6390 switches yet.
>
> Well, it seems GPIO hogging with the sx1503q (for the 3310 PHY, which
> is a local change) has broken sometime between v4.20 and v5.5, which
> prevents the sx1503q driver probing:
>
> [ 23.378706] gpio gpiochip7: (sx1503q): setup of own GPIO 10g-rstn failed
> [ 23.394858] requesting hog GPIO 10g-rstn (chip sx1503q, offset 9) failed, -517
> [ 23.402512] gpiochip_add_data_with_key: GPIOs 480..495 (sx1503q) failed to register, -517
>
> Without the hog, the 3310 PHY doesn't come out of reset, so I lose
> port 9 on the first switch.
>
> With that removed, I can boot, and if I bring up sff2, I see the port 9
> on the second switch status report 0xef4b and control 0x303f without
> fiber connected. I'm out of time to do anything further on this today
> (not even decode those), because its taken all morning to get the board
> to this point, and I won't have any time tomorrow either.

Okay, I have a solution for the serdes ports on the mv88e6390 family
of switches (I hope all serdes blocks are the same across those)
tested on the ZII rev C - as expected, that requires no further
changes to phylink beyond what I've already stated in these threads.

It's a bit hacky at the moment because of all the layering that's
going on in DSA and mv88e6xxx - which will become worse if we split
the current phylink_mac_ops into separate MAC and PCS operations -
giving a number of extra problems.

I've pulled the patches into my "zii" branch - no idea if that will
build, I've a fair number of experimental phylink patches that I've
been working on for the split PCS/MAC issue that aren't a part of
that series. I may need to shuffle some patches around...

Much of the base of the "phy" branch base is what was submitted for
net-next - "net: phy: provide phy driver start/stop hooks" marks the
boundary between stuff in net-next (before that commit) and stuff
which isn't. I was planning to submit "net: linkmode: make
linkmode_test_bit() take const pointer" through "net: phylink: clarify
flow control settings in documentation" for this merge window, but it
wasn't tested enough to make it in time.

There's also a fix for DSA buried in there; DSA fails to call
phylink_stop() before tearing down phylink for DSA and CPU ports
"net: dsa: fix phylink_start()/phylink_stop() calls". I'm not sure
that's the best solution yet, but I just hacked something up so that
unloading the mv88e6xxx module could be done reliably.

While working on this, I didn't notice where the behaviour you
described wrt serdes was coming from, so it'll be interesting to see
whether the issue still exists. It may be wise if you enable phylink
debugging to grab what's going on, particularly with the mac_config()
calls before trying out any of the above patches.

For others, the Clause 37 patch and a few others have changed a bit
since I posted it as a result of working on these issues. All this is
very much "up in the air" still.

I was planning to spend more time on this today, but unfortunately
other issues have got in the way, so I've pushed the stuff out, and
see what 0-day finds - I may be able to do a bit more later today.

--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up