Re: [PATCH v3 1/2] driver core: Rework logic in __driver_deferred_probe_check_state to allow EPROBE_DEFER to be returned for longer

From: Rob Herring
Date: Tue Feb 18 2020 - 18:53:15 EST


On Tue, Feb 18, 2020 at 5:21 PM John Stultz <john.stultz@xxxxxxxxxx> wrote:
>
> On Tue, Feb 18, 2020 at 2:51 PM Rob Herring <robh@xxxxxxxxxx> wrote:
> >
> > On Tue, Feb 18, 2020 at 4:07 PM John Stultz <john.stultz@xxxxxxxxxx> wrote:
> > >
> > > Due to commit e01afc3250255 ("PM / Domains: Stop deferring probe
> > > at the end of initcall"), along with commit 25b4e70dcce9
> > > ("driver core: allow stopping deferred probe after init") after
> > > late_initcall, drivers will stop getting EPROBE_DEFER, and
> > > instead see an error causing the driver to fail to load.
> > >
> > > That change causes trouble when trying to use many clk drivers
> > > as modules, as the clk modules may not load until much later
> > > after init has started. If a dependent driver loads and gets an
> > > error instead of EPROBE_DEFER, it won't try to reload later when
> > > the dependency is met, and will thus fail to load.
> > >
> > > This patch reworks some of the logic in
> > > __driver_deferred_probe_check_state() so that if the
> > > deferred_probe_timeout value is set, we will return EPROBE_DEFER
> > > until that timeout expires, which may be after initcalls_done
> > > is set to true.
> > >
> > > Specifically, on db845c, this change (when combined with booting
> > > using deferred_probe_timeout=30) allows us to set SDM_GPUCC_845,
> > > QCOM_CLK_RPMH and COMMON_CLK_QCOM as modules and get a working
> > > system, where as without it the display will fail to load.
> >
> > I would change the default for deferred_probe_timeout to 30 and then
> > regulator code can rely on that.
>
> That is exactly what I do in the following patch! Let me know if you
> have further suggestions for it.

Indeed.

Between the above comment and this comment in patch 2:
/* preserve 30 second interval if deferred_probe_timeout=-1 */

...I was confused.

> > Curious, why 30 sec is fine now when
> > you originally had 2 min? I'd just pick what you think is best. I
> > doubt Mark had any extensive experiments to come up with 30sec.
>
> I had two minutes initially as, due to other delays I still have to
> chase, booting all the way to UI on the db845c can sometimes take
> almost a minute. So it was just a rough safety estimate. But in v2, I
> tested with the 30 second time used by the regulator code, and that
> seemed to work ok.
>
> As long as 30 seconds is working well, I'm ok with it for now (and it
> can be overrided via boot arg). Though from past experience with
> enterprise distros running on servers with tons of disks (which might
> take many minutes to initialize), I'd suspect its likely some
> environments may need much longer.

Those systems aren't going to have a pinctrl or clock driver sitting
in an encrypted RAID partition either. :)

Rob