Re: [PATCH net-next v3] net: phy: micrel: add phy-mode support for the KSZ9031 PHY

From: Geert Uytterhoeven
Date: Wed May 27 2020 - 15:11:40 EST


Hi Oleksij,

On Wed, Apr 29, 2020 at 11:26 AM Oleksij Rempel <o.rempel@xxxxxxxxxxxxxx> wrote:
> On Wed, Apr 29, 2020 at 10:45:35AM +0200, Geert Uytterhoeven wrote:
> > On Tue, Apr 28, 2020 at 6:16 PM Philippe Schenker
> > <philippe.schenker@xxxxxxxxxxx> wrote:
> > > On Tue, 2020-04-28 at 17:47 +0200, Andrew Lunn wrote:
> > > > On Tue, Apr 28, 2020 at 05:28:30PM +0200, Geert Uytterhoeven wrote:
> > > > > This triggers on Renesas Salvator-X(S):
> > > > >
> > > > > Micrel KSZ9031 Gigabit PHY e6800000.ethernet-ffffffff:00:
> > > > > *-skew-ps values should be used only with phy-mode = "rgmii"
> > > > >
> > > > > which uses:
> > > > >
> > > > > phy-mode = "rgmii-txid";
> > > > >
> > > > > and:
> > > > >
> > > > > rxc-skew-ps = <1500>;
> > > > >
> > > > > If I understand Documentation/devicetree/bindings/net/ethernet-
> > > > > controller.yaml
> > > > > correctly:
> > > >
> > > > Checking for skews which might contradict the PHY-mode is new. I think
> > > > this is the first PHY driver to do it. So i'm not too surprised it has
> > > > triggered a warning, or there is contradictory documentation.
> > > >
> > > > Your use cases is reasonable. Have the normal transmit delay, and a
> > > > bit shorted receive delay. So we should allow it. It just makes the
> > > > validation code more complex :-(
> > >
> > > I reviewed Oleksij's patch that introduced this warning. I just want to
> > > explain our thinking why this is a good thing, but yes maybe we change
> > > that warning a little bit until it lands in mainline.
> > >
> > > The KSZ9031 driver didn't support for proper phy-modes until now as it
> > > don't have dedicated registers to control tx and rx delays. With
> > > Oleksij's patch this delay is now done accordingly in skew registers as
> > > best as possible. If you now also set the rxc-skew-ps registers those
> > > values you previously set with rgmii-txid or rxid get overwritten.

While I don't claim that the new implementation is incorrect, my biggest
gripe is that this change breaks existing setups (cfr. Grygorii's report,
plus see below). People fine-tuned the parameters in their DTS files
according to the old driver behavior, and now have to update their DTBs,
which violates DTB backwards-compatibility rules.
I know it's ugly, but I'm afraid the only backwards-compatible solution
is to add a new DT property to indicate if the new rules apply.

> > > We chose the warning to occur on phy-modes 'rgmii-id', 'rgmii-rxid' and
> > > 'rgmii-txid' as on those, with the 'rxc-skew-ps' value present,
> > > overwriting skew values could occur and you end up with values you do
> > > not wanted. We thought, that most of the boards have just 'rgmii' set in
> > > phy-mode with specific skew-values present.
> > >
> > > @Geert if you actually want the PHY to apply RXC and TXC delays just
> > > insert 'rgmii-id' in your DT and remove those *-skew-ps values. If you
> >
> > That seems to work for me, but of course doesn't take into account PCB
> > routing.

Of course I talked too soon. Both with the existing DTS that triggers
the warning, and after changing the mode to "rgmii-id", and dropping the
*-skew-ps values, Ethernet became flaky on R-Car M3-W ES1.0. While the
system still boots, it boots very slowly.
Using nuttcp, I discovered TX performance dropped from ca. 400 Mbps to
0.1-0.3 Mbps, while RX performance looks unaffected.

So I did some more testing:
  1. Plain "rgmii-txid" and "rgmii" break the network completely, on all
     R-Car Gen3 platforms,
  2. "rgmii-id" and "rgmii-rxid" work, but cause slowness on R-Car M3-W,
  3. "rgmii" with *-skew-ps values that match the old values (default
     420 for everything, but default 900 for txc-skew-ps, and the 1500
     override for rxc-skew-ps), behaves exactly the same as "rgmii-id",
  4. "rgmii-txid" with *-skew-ps values that match the old values does
     work, i.e.
     adding to arch/arm64/boot/dts/renesas/salvator-common.dtsi:
     +	rxd0-skew-ps = <420>;
     +	rxd1-skew-ps = <420>;
     +	rxd2-skew-ps = <420>;
     +	rxd3-skew-ps = <420>;
     +	rxdv-skew-ps = <420>;
     +	txc-skew-ps = <900>;
     +	txd0-skew-ps = <420>;
     +	txd1-skew-ps = <420>;
     +	txd2-skew-ps = <420>;
     +	txd3-skew-ps = <420>;
     +	txen-skew-ps = <420>;
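
To make the numbers in items 3 and 4 easier to compare, here is a rough
sketch of how the *-skew-ps values relate to each other. It assumes the
KSZ9031 encoding described in the DT binding (60 ps per step, 900 ps as the
neutral clock pad value, 420 ps as the neutral data pad value). The helper
names are hypothetical and are not taken from the micrel driver.

```c
/* Assumed KSZ9031 pad skew encoding, per the DT binding: 60 ps per step. */
#define KSZ9031_PS_PER_STEP	60
/* Assumed neutral (zero adjustment) values for clock and data pads. */
#define KSZ9031_CLK_NEUTRAL_PS	900
#define KSZ9031_DATA_NEUTRAL_PS	420

/* Hypothetical helper: convert a *-skew-ps value to a register step count. */
int skew_ps_to_steps(int skew_ps)
{
	return skew_ps / KSZ9031_PS_PER_STEP;
}

/*
 * Hypothetical helper: delay of a clock pad relative to a data pad, in ps,
 * after subtracting each pad's neutral value.
 */
int clk_vs_data_delay_ps(int clk_skew_ps, int data_skew_ps)
{
	return (clk_skew_ps - KSZ9031_CLK_NEUTRAL_PS) -
	       (data_skew_ps - KSZ9031_DATA_NEUTRAL_PS);
}
```

With these assumptions, rxc-skew-ps = <1500> against the 420 ps data pads
is a 600 ps relative RX clock delay, while txc-skew-ps = <900> adds no
relative TX clock delay in the PHY.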

You may wonder what's the difference between 3 and 4? It's not just the
PHY driver that looks at phy-mode!
drivers/net/ethernet/renesas/ravb_main.c:ravb_set_delay_mode() also
does, and configures an additional TX clock delay of 1.8 ns if TXID is
enabled. Doing so fixes R-Car M3-W, but seems to be neither needed nor
harmful on R-Car H3 ES2.0 and R-Car M3-N.
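
As an illustration of why the same PHY skew values can behave differently
under different phy-modes, here is a minimal sketch of a MAC-side decision
of this kind. It is not the ravb code: the enum, the function names, and
the assumption that both "rgmii-id" and "rgmii-txid" count as "TXID
enabled" are mine. Only the 1.8 ns figure comes from the description above.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's phy_interface_t RGMII variants. */
enum rgmii_mode {
	MODE_RGMII,
	MODE_RGMII_ID,
	MODE_RGMII_RXID,
	MODE_RGMII_TXID,
};

/* The additional MAC TX clock delay mentioned above: 1.8 ns. */
#define MAC_TX_CLK_DELAY_PS	1800

/* Assumption: a TX internal delay is requested by "rgmii-id" and "rgmii-txid". */
bool mode_requests_tx_delay(enum rgmii_mode mode)
{
	return mode == MODE_RGMII_ID || mode == MODE_RGMII_TXID;
}

/* TX clock delay the MAC would add on top of whatever the PHY applies. */
int mac_tx_clk_delay_ps(enum rgmii_mode mode)
{
	return mode_requests_tx_delay(mode) ? MAC_TX_CLK_DELAY_PS : 0;
}
```

Under this sketch, identical PHY skew values combined with "rgmii" give no
MAC TX delay, while combined with "rgmii-txid" they give an extra 1800 ps,
so the PHY and the MAC cannot be evaluated in isolation.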

> > Using "rgmii" without any skew values makes DHCP fail on R-Car H3 ES2.0,
> > M3-W (ES1.0), and M3-N (ES1.0). Interestingly, DHCP still works on R-Car
> > H3 ES1.0.

FTR, the reason R-Car H3 ES1.0 is not affected is that the driver limits
its maximum speed to 100 Mbps, due to a hardware erratum.

So, how to proceed?
Thanks!

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@xxxxxxxxxxxxxx

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds