Re: [PATCH v3 1/2] driver core: Rework logic in __driver_deferred_probe_check_state to allow EPROBE_DEFER to be returned for longer

From: Rob Herring
Date: Thu Feb 20 2020 - 10:21:12 EST


On Wed, Feb 19, 2020 at 11:27 PM Saravana Kannan <saravanak@xxxxxxxxxx> wrote:
>
> Apologies in advance for replying to this one email but discussing the
> points raised in all the other replies. I'm not cc'ed in this thread and
> replying to each email individually is a pain.
>
> On Tue, Feb 18, 2020 at 4:07 PM John Stultz wrote:
> > Due to commit e01afc3250255 ("PM / Domains: Stop deferring probe
> > at the end of initcall"), along with commit 25b4e70dcce9
> > ("driver core: allow stopping deferred probe after init") after
> > late_initcall, drivers will stop getting EPROBE_DEFER, and
> > instead see an error causing the driver to fail to load.
>
> Both of those patches were the best solution at that point in time. But
> the kernel has changed a lot since then. Power domain and IOMMU drivers
> can work as modules now. We have of_devlink and sync_state().
>
> So, while a delay might have been the ideal solution back then, I think
> we need to work towards removing arbitrary timeouts instead of making
> the timeout mechanism more elaborate.

If you don't have some way to say all the dependencies that can be
resolved have been resolved already, how do you get away from a
timeout? Nothing has changed in that respect.

If a dtb+kernel works, updating just the dtb with new dependencies
should not break things.

> I think driver_deferred_probe_check_state() logic should boiled down
> to something like:
>
> int driver_deferred_probe_check_state(struct device *dev)
> {
> /* No modules and init done, deferred probes are pointless from
> * now. */
> if (!defined(CONFIG_MODULES) && initcall_done)
> return -ENODEV;
>
> /* If modules and of_devlink then you clean want dependencies to
> * be enforced.
> */
> if (defined(CONFIG_MODULES) && of_devlink)
> return -EPROBE_DEFER;
>
> /* Whatever other complexity (including timeouts) we want to
> * add. Hopefully none - we can discuss in this thread. */
> if (.....)
> return -Exxxx;
>
> /* When in doubt, allow probe deferral. */
> return -EPROBE_DEFER;
> }

Mostly makes sense to me. I think using CONFIG_MODULES is good.

However, there is the case in pinctrl that we have a DT flag
'pinctrl-use-default' and we stop deferring at the end of initcalls if
set. With the above, there's no way to detect that. I'm pretty sure
that's broken now with of_devlink and maybe from Thierry's change too.

> Rob, for the original use case those two patches were added for, do they need
> to support CONFIG_MODULES?

At the time since the subsystems involved were not typically modules
so using CONFIG_MODULES didn't really matter. As you said, that's
changed now.

> > That change causes trouble when trying to use many clk drivers
> > as modules, as the clk modules may not load until much later
> > after init has started. If a dependent driver loads and gets an
> > error instead of EPROBE_DEFER, it won't try to reload later when
> > the dependency is met, and will thus fail to load.
>
> Once we add of_devlink support for power-domains, you won't even hit the
> genpd error path if you have of_devlink enabled. I believe in the case
> you are testing DB 845c, of_devlink is enabled?
>
> If of_devlink is enabled, the devices depending on the unprobed power
> domains would be deferred without even calling the driver's probe()
> function.
>
> Adding power-domain support to of_devlink is a 2 line change. I'll send
> it out soon.
>
> Also, regulator_init_complete() can be replaced by a sync_state() based
> implementation. I have a downstream implementation that works. But it's
> definitely not upstream ready. I plan to rewrite it and send it upstream
> at some point, but it's fairly straightforward if anyone else want to
> implement it. My main point being, we shouldn't have to make the timeout
> logic more complex (or even need one) for regulator clean up either.
>
> On Tue, Feb 18, 2020 at 6:07 PM Rob Herring wrote:
> > The one complication which I mentioned already is with consoles. A
> > timeout (and dependencies in modules) there doesn't work. You have to
> > probe and register the console before init is done.
>
> Rob,
>
> I've seen you say this a couple of times before. But I don't think this
> is true any more. With of_devlink enabled I've booted hardware countless
> times with the console device probing after userspace comes up. The only
> limitation for console drivers is that they need to be built-in if they
> need to support earlycon. If you don't care to support earlycon (very
> useful for bringup debugging), I think the console driver can even be a
> module. I don't think even of_devlink needs to be enabled technically if
> you load the modules in the right order.

Every serial driver has to be built-in to enable console support.
That's not because of earlycon. It's been that way for as long as I've
worked on linux. Now of course, a driver could be built-in and still
probe after userspace starts, but in my testing with the timeout that
didn't work. I don't see how of_devlink changes that.

It could depend on what userspace you have. Certainly, booting with
'console=ttyS0 init=/bin/sh' would not work for example. What I
probably tested with was a busybox based rootfs. What are you testing
with? Android?

Rob