Re: [PATCH v2 08/10] of/platform: Resolve interrupt references atprobe time

From: Thierry Reding
Date: Fri Oct 25 2013 - 03:35:44 EST

On Thu, Oct 24, 2013 at 05:37:49PM +0100, Grant Likely wrote:
> On Wed, 16 Oct 2013 00:24:36 +0100, Grant Likely <grant.likely@xxxxxxxxxx> wrote:
> > On Wed, 18 Sep 2013 15:24:50 +0200, Thierry Reding <thierry.reding@xxxxxxxxx> wrote:
> > > Interrupt references are currently resolved very early (when a device is
> > > created). This has the disadvantage that it will fail in cases where the
> > > interrupt parent hasn't been probed and no IRQ domain for it has been
> > > registered yet. To work around that various drivers use explicit
> > > initcall ordering to force interrupt parents to be probed before devices
> > > that need them are created. That's error prone and doesn't always work.
> > > If a platform device uses an interrupt line connected to a different
> > > platform device (such as a GPIO controller), both will be created in the
> > > same batch, and the GPIO controller won't have been probed by its driver
> > > when the depending platform device is created. Interrupt resolution will
> > > fail in that case.
> >
> > What is the reason for all the rework on the irq parsing return values?
> > A return value of '0' is always an error on irq parsing, regardless of
> > architecture even if NO_IRQ is defined as -1. I may have missed it, but
> > I don't see any checking for specific error values in the return paths
> > of the functions.
> >
> > If the specific return value isn't required (and I don't think it is),
> > then you can simplify the whole series by getting rid of the rework
> > patches.
> I've not heard back about the above, but I've just had a conversation
> with Rob about what to do here.

I thought I had sent a reply regarding this about a week ago. Perhaps it
got lost. I'll resend.

> The problem that I have is that it makes a specific return code need
> to traverse several levels of function calls and have a meaning come
> out the other end. It becomes difficult to figure out where that code
> actually comes from when reading the code. That's more of a gut-feel
> reaction rather than pointing out specifics though.

To be honest, I'm not all that happy with that aspect myself, but at the
same time I didn't feel like duplicating a lot of code to get this done
more easily. I imagine that would've caused significant pushback as
well. It's somewhat unfortunate that we have to propagate back through
several level, but that's just the way the code is currently written and
I don't think we can really get the information (EPROBE_DEFER) from any
other place but from the lowest level.

> The other thing that makes me nervous how invasive the series is.

Well, I guess that comes with the territory, doesn't it? Interrupts are
used in a large number of places and they have been used in a very
static manner so far. The end result of this patch series is that for
most devices instantiated from the device tree interrupts end up in the
same category as any other resources such as GPIOs, regulators or
clocks. They become mostly dynamic.

That in itself is a big change, so I don't think it's all that
surprising that the required changes are invasive.

And I think if we really want to solve it properly we need to make even
more invasive changes. For example, Grygorii pointed out that we could
have a setup in the future where the following happens:

1) driver providing interrupts is probed
2) driver using interrupts is probed, interrupt references are
resolved at probe time
3) both drivers are unloaded
4) both drivers are reloaded

In that case with the current set of patches the added core code assumes
that the interrupts have already been resolved and are still valid.

Possibly the easiest way to fix that would be to just zero out the
interrupt resources on remove so that they can be re-resolved on next

But that's somewhat cumbersome and it seems to me like a better fix
might be to go and change struct platform_device to not use a single
array of resources, but rather a list, or perhaps an array per type of
resource. The current platform_device structure is simple and easy, but
it doesn't work well with all the new dynamicity that we want/need/have

Obviously modifying the innards of struct platform_device will likely
turn out to be a mammoth task of its own, but if that's what it takes
I'm prepared to do that as well. Or at least even try.

> However, even with saying all of the above, I'm not saying outright no.
> I want to get this feature in.

That's good to hear. Last time we talked about it we seemed to have an
agreement that this needs to be done, but you not replying had me
worried that perhaps you had changed your mind. It seems you've been
busy trying to address other issues that maybe are even more pressing so
I can hardly complain. =)

I'm good as long as we can keep moving in the right direction.

> It is obviously needed and I'll even merge the patches piecemeal as the
> look ready (I've already merged 2). Regardless, the current series needs
> to be reworked. It conflicts with the other IRQ rework that I've already
> put into my tree. The best thing to do would probably be respin it
> against my current tree and repost.

Sure, that won't be a problem. I might not get to it immediately, but
I'll get back to it.

> I'll take a fresh look then.... In the mean time, anything you can do to
> /sanely/ reduce the impact will probably help. :-)

I might be able to do that. But I'll mention that in another thread in
the right context.


Attachment: pgp00000.pgp
Description: PGP signature