Re: [PATCH v4] gpiolib: Bind gpio_device to a driver to enable fw_devlink=on by default

From: Andy Shevchenko
Date: Fri Jan 22 2021 - 04:59:28 EST


On Fri, Jan 22, 2021 at 12:40 AM Saravana Kannan <saravanak@xxxxxxxxxx> wrote:
>
> There are multiple instances of GPIO device tree nodes of the form:
>
> foo {
> compatible = "acme,foo";
> ...
>
> gpio0: gpio0@xxxxxxxx {
> compatible = "acme,bar";
> ...
> gpio-controller;
> };
>
> gpio1: gpio1@xxxxxxxx {
> compatible = "acme,bar";
> ...
> gpio-controller;
> };
>
> ...
> }
>
> bazz {
> my-gpios = <&gpio0 ...>;
> }
>
> Case 1: The driver for "foo" populates struct device for these gpio*
> nodes and then probes them using a driver that binds with "acme,bar".
> This driver for "acme,bar" then registers the gpio* nodes with gpiolib.
> This lines up with how DT nodes with the "compatible" property are
> typically converted to struct devices and then registered with driver
> core to probe them. This also allows the gpio* devices to hook into all
> the driver core capabilities like runtime PM, probe deferral,
> suspend/resume ordering, device links, etc.
>
> Case 2: The driver for "foo" doesn't populate struct devices for these
> gpio* nodes before registering them with gpiolib. Instead it just loops
> through its child nodes and directly registers the gpio* nodes with
> gpiolib.
>
> Drivers that follow case 2 cause problems with fw_devlink=on. This is
> because fw_devlink will prevent bazz from probing until there's a struct
> device that has gpio0 as its fwnode (because bazz lists gpio0 as a GPIO
> supplier). Once the struct device is available, fw_devlink will create a
> device link with gpio0 device as the supplier and bazz device as the
> consumer. After this point, since the gpio0 device will never bind to a
> driver, the device link will prevent bazz device from ever probing.
>
> Finding and refactoring all the instances of drivers that follow case 2
> will cause a lot of code churn and it is not something that can be done
> in one shot. In some instances it might not even be possible to refactor
> them cleanly. Examples of such instances are [1] [2].
>
> This patch works around this problem and avoids all the code churn by
> simply setting the fwnode of the gpio_device and creating a stub driver
> to bind to the gpio_device. This allows all the consumers to continue
> probing when the driver follows case 2.

...

> @@ -596,6 +596,7 @@ int gpiochip_add_data_with_key(struct gpio_chip *gc, void *data,
> gdev->dev.of_node = gc->of_node;
> else
> gc->of_node = gdev->dev.of_node;
> + gdev->dev.fwnode = of_fwnode_handle(gdev->dev.of_node);

This looks like a complete breakage on ACPI enabled systems. Care to
test on ACPI and confirm it works? I could recommend to use Intel
Galileo platform since there is a dwapb GPIO and a lot of nice
(semi-sarcastic, because of many quirks) code over the kernel.

(Yes, you have to have both OF and ACPI in the config being enabled
which is valid combination)

> #endif

...

> +static int gpio_stub_drv_probe(struct device *dev)
> +{
> + /*
> + * The DT node of some GPIO chips have a "compatible" property, but
> + * never have a struct device added and probed by a driver to register
> + * the GPIO chip with gpiolib. In such cases, fw_devlink=on will cause
> + * the consumers of the GPIO chip to get probe deferred forever because

get a probe

> + * they will be waiting for a device associated with the GPIO chip
> + * firmware node to get added and bound to a driver.
> + *
> + * To allow these consumers to probe, we associate the struct
> + * gpio_device of the GPIO chip with the firmware node and then simply
> + * bind it to this stub driver.
> + */
> + return 0;
> +}

...

> + if (driver_register(&gpio_stub_drv) < 0) {
> + pr_err("gpiolib: could not register GPIO stub driver\n");
> + bus_unregister(&gpio_bus_type);
> + return ret;
> + }
> +
> ret = alloc_chrdev_region(&gpio_devt, 0, GPIO_DEV_MAX, GPIOCHIP_NAME);
> if (ret < 0) {
> pr_err("gpiolib: failed to allocate char dev region\n");
> + driver_unregister(&gpio_stub_drv);
> bus_unregister(&gpio_bus_type);
> return ret;
> }

Looks like you missed to fix the __exit call in this module.

--
With Best Regards,
Andy Shevchenko