Re: [PATCH] net: phy: at803x: don't depend on GPIOLIB

From: Sebastian Frias
Date: Tue Mar 22 2016 - 10:34:41 EST


Hi Uwe,

I think we are in a deadlock :-)
I'm going to reply inline below, but I will also send a different email
to Daniel with a small recap.
I think he should share the intent of the "reset" mechanism he
introduced, in particular if it is mandatory.


On 03/21/2016 09:12 PM, Uwe Kleine-König wrote:
>>
>> gpiod=NULL (because the key is not there) or gpiod=ERR (because
>> GPIOLIB=n + my patch) will result in the same behaviour, ie: driver
>> binds, but fails to reset.
>
> Assuming the source for the hardware description is dt (the same
> argument applies if it's ACPI or something else).
>
> The driver uses devm_gpiod_get_optional(..."reset"...). That means some
> hardware has a reset line, some don't. If a reset gpio specification is
> there, that means the hardware has such a signal and it seems that means
> it must not be ignored.

The problem is all those "it seems" :-)
"it seems it must not be ignored" is also an hypothesis considering the
driver is not refusing to work even if it detects that the faulty PHY
has not been provided with a reset line.
Also, further down the thread you acknowledge that's a possibility.

>If there is no reset gpio specification that
> means that driver has to assume there is no reset line. (In the real
> world there might be other reasons the reset line isn't in the device
> tree, but it's not in the scope of the driver to guess why it's missing.
> If it's not there the only sensible thing to assume for the driver is:
> There is no reset line.)

But that is not what the current code does, the current code does not
checks if the DT presents the "reset" property directly.
It assumes that GPIOLIB works and checks GPIOLIB return codes, to
indirectly (and incorrectly) guess if a given device has a "reset" property.
It has not been clearly explained anywhere that the "reset" property for
AT8030 is supposed to be mandatory and that the driver should fail if
not provided with one.
The code does not back up such hypothesis either, ie: GPIOLIB=y +
missing "reset" will still allow the driver to work.

>
> So the conclusions are: If there is a reset line in dt, it must be used.
> If you don't know if there is a reset line (because GPIOLIB=n) the
> driver should not bind. Everything else yields to more problems than
> good.

a) "If there is a reset line in dt, it must be used": that is an
hypothesis (I would say "it can be used");
b) if it were true, then another way of knowing if the "reset" key is
present should be used;
indeed, there should be:
c) a way to unambiguously determine if the "reset" key is in the DT
(regardless of GPIOLIB status);
that in order to know if the DT is specifying that the "reset" mechanism
must be used, then:
d) iff the "reset" is absolutely necessary (ie: the faulty PHY is in use
AND the original intention of the hack was for it *mandatory* for faulty
PHYs), some decision can be taken;

However, right now those conditions are not met.

>>> With your patch and GPIOLIB=n you bind the driver even for the devices
>>> that need the hack. This is broken!
>>
>> It is a degraded mode I agree and had proposed to print a warning.
>> However, I fail to see how is GPIOLIB=n different from just not having
>> "reset" present.
>
> GPIOLIB=n means "the driver doesn't know if a reset line is
> present/necessary", not having reset means "there is no reset line".

That's the problem, that the driver is relying on GPIOLIB to know if:
- "reset" key is present
- it should fail to bind or not

and it does that based solely on GPIOLIB return codes, thus indirectly.
As opposed to checking directly the presence of the "reset" key
(regardless of GPIOLIB status) and failing immediately.

>
> And don't do error handling by printk assuming anyone will read it. That
> doesn't work.

I proposed a warning, not error handling.
Also if we assume people won't read the logs, why put any at all?
Some people may not read the logs, some others will, it is their choice
to bang their heads one way or another.
Currently there's no warning.
So I had to debug why the driver was not binding, and it is not obvious
that GPIOLIB must be selected, considering:
- I don't have a faulty PHY
- I don't have a "reset" key on DT
- Even if I wanted, I don't have a GPIO to connect to it

(also, even if this driver does not bind, some generic one would take over)

>
>> The fact that GPIOLIB=y does not means that the "reset" key will be there.
>
> Right, but with GPIOLIB=y you know if it's there, and if yes, you can
> control the line.

Exactly, and what I say is that knowing if "reset" is there or not
should not depend on having GPIOLIB=y

To me the problem comes from the aliasing between "GPIOLIB=y" and
"reset" present.
Indeed, the driver should query if the "reset" key is present
(regardless of GPIOLIB status), and then try to acquire the GPIO line.
If it cannot get the reset GPIO, then it can give up saying "you told me
I need a reset line but then you did not give me one".

>
>> I mean, you assume that "reset" will be present for devices that need
>> the hack, yet there is no such guarantee and the code clearly assumes
>> that it can be missing.
>
> The driver must assume that the device tree is right. If it's not fix
> the device tree, don't implement clumsy work arounds in the driver.

I doubt it is like that.
I mean, it may sound easy to say "just update the DT and add the 'reset'
key", but the DT is describing a physical connections in this case.
Changing that may not be as easy.

IMHO it would be far better if the driver made the check on the PHY ID
(8030) at probe time and refused to bind if:
- the "reset" is absolutely necessary per design (ie: intended by Daniel)
and
- it did not get a reset line (for whatever reason).

That was not done.
Maybe because it is not possible because the ID may not be known at that
time (strange since it is being probed);
Maybe because Daniel meant the whole thing to be optional;
Or maybe for other reasons.

>
>> For all we know, the key is actually missing on systems that do use the
>> faulty PHY (all systems designed prior to the hack being implemented)
>> And that, regardless of GPIOLIB status.
>
> Then fix the device tree.

Changing the device tree does not automagically creates a connection
between a given GPIO (which must be available) and the PHY reset line.
GPIOs and reset lines are physical things.

But maybe I'm missing something and board designers always route GPIOs
to each and every peripheral's reset line.

>
>> Since the reset line can be missing, it can be missing for whatever reason.
>> Whether it is because the "reset" key is not present or because
>> GPIOLIB=n, it does not matter, it will result on the same outcome.
>>
>> Furthermore, if "reset" is really required for certain devices that need
>> the hack, then both cases:
>> - GPIOLIB=n
>> - "reset" not present
>> should be handled the same way (ie: not bind the driver) for such devices.
>
> If you can detect by other means that "reset is necessary", then yes,
> the driver should ideally also fail in this case with no reset specified
> in dt.

Well, the problem is that right now we don't know what was the intention
of the person that implemented the "reset" mechanism.
Both of us are arguing about our own hypothesis.
You have yours "reset is necessary and driver should not bind without".
And I have mine "nothing in the code implies it is necessary".

>
>> By the way, I think not binding the driver is too strong too.
>> Having a GPIO free and being able to route it to the PHY for reset may
>> not even be possible on some systems.
>> Are they supposed to stop working?
>
> Sometimes no driver is better than an unreliable one.

It appears that if this driver does not binds, then a generic one (which
arguably will also lack the reset behaviour) will take over.
How can that be any better?

>>>
>>> No, if reset was mandatory you'd use gpiod_get instead of
>>> gpiod_get_optional and fail iff that call fails, too.
>>
>> Ok, but the current code for "reset" is using devm_gpiod_get_optional()
>> so "reset" is clearly optional.
>> And that call will return NULL if "reset" is not present, even with
>> GPIOLIB=y
>
> s/even//. And returning NULL is not an error. It means: There is no
> reset line specified in dt.

Indeed, and it is totally possible that the PHY ID is a faulty one, yet
the driver will bind anyway.
You argue about a case "reset is absolutely necessary for some devices",
but the code does not enforce that.

>>
>> I think somehow you try to make a difference on the reason why the
>> reset=INVALID (NULL or ERR), but IMHO there's none.
>
> NULL is not invalid. With devm_gpiod_get_optional the driver asks: "Is
> there a reset gpio, and if yes, which one?". The reset line is optional,
> that means "some devices have one, some others don't.". It does NOT mean
> "You can ignore if there is a reset line or not."!

But the current code does ignore it :-)
Even for faulty devices.

Again, if you take a faulty PHY and a DT without the "reset" key, the
driver will bind.
Don't you think that goes against your hypothesis of "reset must be
present and available for some devices"?

By the way, is the "reset" key documented somewhere? How would the DT
look like?
Maybe in such documentation there's some explanation about the key being
compulsory or not.

>
>> If somebody using AT8030 PHY (the one that requires the hack) either
>> does not configure a "reset" key on the DT, either does not enable
>> GPIOLIB, the result will be the same.
>> There is no warning in either case and it will run on a degraded mode.
>
> My expectation is that if the dt includes a reset property, it should be
> used. So, there is a fine option for each situation:
>
> - device that doesn't need the hack => everything is fine;
> - AT8030 with reset to a gpio => specify the reset in dt;
> - AT8030 without a controllable reset line => don't specify a reset
> property;

You do realise that in this last case the device will work in degraded
mode, right?

>
> if you have a broken device with a reset line you can even drop the
> reset property from dt and not make use of the reset gpio if you think
> it's a good idea.
>

So you are saying that "reset" is optional, even for faulty PHYs.

>>
>> Ok, then I did not understand what you meant with "the question is
>> obsolete because the device doesn't probe".
>
> if (GPIOLIB=y):
> if (device has a reset property):
> driver runs fine with being able to reset.
> else:
> if (device needs reset to function properly):
> ideally fail to probe
> else:
> driver runs fine without reset and without the need
> to bother the user about the absense of the gpio
> else:
> driver fails to bind

To me it should be:

if (device needs reset):
if (DT has reset):
if (GPIOLIB=y):
gpiod=callback, driver gets a reset line and can operate
properly (ie: perform the hack)
else
gpiod=-ENOSYS, driver does not get reset line: should it
fail to bind? should it continue?
else
gpiod=NULL, driver does not get a reset line: should it fail to
bind? should it continue?
else
gpiod is not even requested, driver does not need a reset line and
can operate properly.

>
> In no branch of these cases there is a situation where you must issue a
> reset but cannot.

Well, the thing is that what you described does not match the code,
AFAIK current code does:

if (GPIOLIB=y):
if (DT has reset):
gpiod=callback
bind driver
if (faulty PHY):
use gpiod
else
gpiod=NULL
bind driver
if (faulty PHY)
cannot use gpiod, but continue anyway without warning
else
gpiod=-ENOSYS, fail to bind, regardless of PHY being faulty, and
regardless of "reset" presence (*)


(*) the way the code is written, "reset" presence is not checked
directly, but as a side-effect of GPIOLIB

>So there is no need to ask if you should issue a
> message each time this happens. That's what I meant with "the question
> is obsolete".
>
>> Unless I'm missing something, the only way the driver won't bind
>> correctly with current code is if GPIOLIB=n
>
> Right, because with GPIOLIB=n you cannot control a gpio line, but you
> don't know if you must.

Well right now nobody knows if the "reset" is actually a must, even for
faulty PHYs.
The driver binds happily without, even on faulty devices.

>
> Consider you are an electrician (driver) and you should repair a power
> socket (control a device) but forgot your phase tester (GPIOLIB=n).
> So there is no way you can determine if it's save to modify the socket.
> You can choose the optimistic way and say: "If there is no voltage on the
> socket, I can repair it successfully" and just try it. Or you take the
> safe approach and say: "I don't know if it's safe to do, so I better
> don't.".

:-) thanks for the analogy.

>
> The same applies to drivers: If it might be necessary to handle a gpio,
> but you cannot know if that's the case or not: Don't try it.
>

But that's not how the code was written.
It is not mentioned anywhere that that was the initial intention either.

>>
>> To put an example: in our case we don't use the faulty PHY.
>> So, the DT does not has the "reset" key.
>> Why should then the PHY driver be dependent on GPIOLIB?
>
> You want GPIOLIB to know that you don't need to handle the reset gpio.

But that's like saying "you want all possible drivers linked-in, if they
are not in DT they won't load/probe, so it's fine"

We don't want GPIOLIB because it is a bunch of useless code in our case.
We know in advance that the DT will not have the "reset" key.
We know in advance that the PHY is not faulty.
Why should we link-in a bunch of useless code then?

Alternatively, if GPIOLIB is mandatory due to the way drivers are
written or the way DT works, then why make it optional?

Best regards,

Sebastian