Re: [RFC 1/1] Driver Core: don't oops with unregistered driver indriver_find_device()

From: Greg Kroah-Hartman
Date: Thu May 10 2012 - 18:46:01 EST


On Thu, May 10, 2012 at 04:34:40PM -0600, Stephen Warren wrote:
> On 05/10/2012 04:28 PM, Greg Kroah-Hartman wrote:
> > On Thu, May 10, 2012 at 02:00:48PM -0600, Stephen Warren wrote:
> >> On 05/10/2012 12:16 PM, Greg Kroah-Hartman wrote:
> >>> On Thu, May 10, 2012 at 09:59:15AM -0600, Stephen Warren wrote:
> >>>> On 05/10/2012 08:11 AM, Greg Kroah-Hartman wrote:
> >>>>> On Thu, May 10, 2012 at 10:35:02AM +0300, Hiroshi DOYU wrote:
> >>>>>> driver_find_device() can be called with an unregistered driver.
> >>>>>
> >>>>> Who does that? Where in the kernel? Why would you try to do that?
> >>>>>
> >>>>>> Need to check driver_private to see if it's populated or not,
> >>>>>> especially under deferrable probe.
> >>>>>
> >>>>> Hm, I don't know if this will really catch a driver that was registered
> >>>>> and then was unregistered, right? It seems like moving the real problem
> >>>>> somewhere else, why not fix the original issue instead?
> >>>>>
> >>>>>> Signed-off-by: Hiroshi DOYU <hdoyu@xxxxxxxxxx>
> >>>>>> ---
> >>>>>> In [PATCHv5 2/3] ARM: tegra: Add SMMU enabler in AHB:
> >>>>>> http://article.gmane.org/gmane.linux.ports.tegra/4658
> >>>>>>
> >>>>>> "tegra_ahb_driver" may not be populated when it's called.
> >>>>>
> >>>>> It can? I don't see that in that patch.
> >>>>
> >>>> I think this is what's happening:
> >>>>
> >>>> The Tegra SMMU driver is registered using subsys_initcall().
> >>>>
> >>>> The Tegra AHB driver is registered using module_platform_driver, so
> >>>> module_init().
> >>>>
> >>>> The device objects for both are registered at basically the same time
> >>>> during the machine's initialization call, which I think happens before
> >>>> both or some of the above two calls.
> >>>>
> >>>> So, SMMU ends up getting probed before the AHB driver has even been
> >>>> registered.
> >>>
> >>> What do those two drivers have to do with each other?
> >>
> >> The AHB HW module contains a bit to say "enable the SMMU". This bit
> >> cannot be turned on until certain initialization has been performed by
> >> the SMMU driver.
> >>
> >> So, SMMU's probe performs the initialization, then attempts to contact
> >> the AHB driver to ask it to enable the SMMU.
> >>
> >>>> That's why in patch Hiroshi linked, in tegra_ahb_enable_smmu(), the AHB
> >>>> driver hasn't (or may not have) been registered, so driver_find_device()
> >>>> can experience the issue this patch attempts to solve.
> >>>
> >>> It sounds like you have a locking or ordering problem somewhere in this
> >>> platform, and it needs to be resolved there. Odds are, this tiny check
> >>> is the least of your worries, right?
> >>>
> >>> Who is doing the driver_find_device() call in the first place? The
> >>> driver core? Or something else?
> >>
> >> SMMU's probe calls tegra_ahb_enable_smmu().
> >>
> >> tegra_ahb_enable_smmu() needs to find the AHB's struct device so that it
> >> can find get the driver data associated with it, which contains the
> >> ioremap'd registers amongst other things.
> >>
> >> In order to find the correct device, tegra_ahb_enable_smmu() is passed
> >> the device tree node of the appropriate AHB that controls the SMMU.
> >>
> >> The two drivers use deferred probe to ensure that their relative probe
> >> order doesn't matter. If AHB is probed first, then
> >> tegra_ahb_enable_smmu()'s driver_find_device() succeeds, and the AHB
> >> register is programmed to enable the SMMU. If the SMMU is probed first,
> >> then tegra_ahb_enable_smmu() returns -EPROBE_DEFER to hold off the SMMU
> >> from completing its probe. I believe this is exactly the kind of thing
> >> -EPROBE_DEFER was intended for.
> >
> > Ok, thanks for explaining it better, that would have been nice to have
> > in the original patch submission :)
> >
> > This fix, is it needed for 3.4, or just 3.5?
>
> The SMMU driver that needs this will be added in 3.6 (so putting this
> patch in 3.5 would be good to simplify dependencies). It's not needed in
> 3.4.

Ok, good to know.

> > I kind of don't like relying on ->p as a check to see if things are set
> > up properly, I wonder if we could use the kobject state_initalized field
> > instead somehow, or maybe state_in_sysfs? Oh wait, that's in ->p as
> > well. Ok, I guess this is as good as it can get.
>
> In the case you mentioned of the driver getting unregistered, can the
> unregister code clear ->p?

I do not know. Try it in the bus_remove_driver() call after the
kobject_put() line, that might do it, but it might cause problems, I
can't remember at the moment. It would need a lot of testing

thanks,

greg k-h
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/