Re: [PATCH v2 18/19] Revert "fbdev: Prevent probing generic drivers if a FB is already registered"

From: Geert Uytterhoeven
Date: Tue Apr 05 2022 - 20:21:56 EST


Hi Daniel,

On Tue, Apr 5, 2022 at 1:48 PM Daniel Vetter <daniel@xxxxxxxx> wrote:
> On Tue, 5 Apr 2022 at 11:52, Javier Martinez Canillas
> <javierm@xxxxxxxxxx> wrote:
> > On 4/5/22 11:24, Daniel Vetter wrote:
> > > On Tue, 5 Apr 2022 at 11:19, Javier Martinez Canillas
> > >> This is how I think that work, please let me know if you see something
> > >> wrong in my logic:
> > >>
> > >> 1) A PCI device of OF device is registered for the GPU, this attempt to
> > >> match a registered driver but no driver was registered that match yet.
> > >>
> > >> 2) The efifb driver is built-in, will be initialized according to the link
> > >> order of the objects under drivers/video and the fbdev driver is registered.
> > >>
> > >> There is no platform device or PCI/OF device registered that matches.
> > >>
> > >> 3) The DRM driver is built-in, will be initialized according to the link
> > >> order of the objects under drivers/gpu and the DRM driver is registered.
> > >>
> > >> This matches the device registered in (1) and the DRM driver probes.
> > >>
> > >> 4) The DRM driver .probe kicks out any conflicting DRM drivers and pdev
> > >> before registering the DRM device.
> > >>
> > >> There are no conflicting drivers or platform device at this point.
> > >>
> > >> 5) Latter at some point the drivers/firmware/sysfb.c init function is
> > >> executed, and this registers a platform device for the generic fb.
> > >>
> > >> This device matches the efifb driver registered in (2) and the fbdev
> > >> driver probes.
> > >>
> > >> Since that happens *after* the DRM driver already matched, probed
> > >> and registered the DRM device, that is a bug and what the reverted
> > >> patch worked around.
> > >>
> > >> So we need to prevent (5) if (1) and (3) already happened. Having a flag
> > >> set in the fbdev core somewhere when remove_conflicting_framebuffers()
> > >> is called could be a solution indeed.
> > >>
> > >> That is, the fbdev core needs to know that a DRM driver already probed
> > >> and make register_framebuffer() fail if info->flag & FBINFO_MISC_FIRMWARE
> > >>
> > >> I can attempt to write a patch for that.
> > >
> > > Ah yeah that could be an issue. I think the right fix is to replace
> > > the platform dev unregister with a sysfb_unregister() function in
> > > sysfb.c, which is synced with a common lock with the sysfb_init
> > > function and a small boolean. I think I can type that up quickly for
> > > v3.
> >
> > It's more complicated than that since sysfb is just *one* of the several
> > places where platform devices can be registered for video devices.
> >
> > For instance, the vga16fb driver registers its own platform device in
> > its module_init() function so that can also happen after the conflicting
> > framebuffers (and associated devices) were removed by a DRM driver probe.
> >
> > I tried to minimize the issue for that particular driver with commit:
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0499f419b76f
> >
> > But the point stands, it all boils down to the fact that you have two
> > different subsystems registering video drivers and they don't know all
> > about each other to take a proper decision.
> >
> > Right now the drm_aperture_remove_conflicting_framebuffers() call signals
> > in one direction from DRM to fbdev but there isn't a communication in the
> > other direction, from fbdev to DRM.
> >
> > I believe the correct fix would be for the fbdev core to keep a list of
> > the apertures struct that are passed to remove_conflicting_framebuffers(),
> > that way it will know what apertures are not available anymore and prevent
> > to register any fbdev framebuffer that conflicts with one already present.
>
> Hm that still feels like reinventing a driver model, badly.
>
> I think there's two cleaner solutions:
> - move all the firmware driver platform_dev into sysfb.c, and then
> just bind the special cases against that (e.g. offb, vga16fb and all
> these). Then we'd have one sysfb_try_unregister(struct device *dev)
> interface that fbmem.c uses.
> - let fbmem.c call into each of these firmware device providers, which
> means some loops most likely (like we can't call into vga16fb), so
> probably need to move that into fbmem.c and it all gets a bit messy.
>
> > Let me know if you think that makes sense and I can attempt to write a fix.
>
> I still think unregistering the platform_dev properly makes the most

That doesn't sound very driver-model-aware to me. The device is what
the driver binds to; it does not cease to exist.

> sense, and feels like the most proper linux device model solution
> instead of hacks on top - if the firmware fb is unuseable because a
> native driver has taken over, we should nuke that. And also the
> firmware fb driver would then just bind to that platform_dev if it
> exists, and only if it exists. Also I think it should be the
> responsibility of whichever piece of code that registers these
> platform devices to ensure that platform_dev actually still exists.
> That's why I think pushing all that code into sysfb.c is probably the
> cleanest solution.

Can't you unbind the generic driver first, and bind the specific driver
afterwards? Alike writing to sysfs unbind/driver_override/bind,
but from code?

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@xxxxxxxxxxxxxx

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds