Re: [PATCH 2/3] gpiolib: use kref in gpio_desc

From: Bartosz Golaszewski
Date: Thu Mar 12 2020 - 14:25:11 EST


czw., 12 mar 2020 o 11:11 Linus Walleij <linus.walleij@xxxxxxxxxx> napisaÅ(a):
>
> On Thu, Mar 5, 2020 at 5:49 PM Bartosz Golaszewski <brgl@xxxxxxxx> wrote:
>
> > I refreshed my memory on device links and reference counting. I think
> > that device links are not the right tool for the problem I'm trying to
> > solve.
>
> OK, just check the below though so we are doing reference
> counting in the right place.
>
> > You're right on the other hand about the need for reference
> > counting of gpiochip devices. Right now if we remove the chip with
> > GPIOs still requested the only thing that happens is a big splat:
> > "REMOVING GPIOCHIP WITH GPIOS STILL REQUESTED".
> >
> > We should probably have a kref on the gpiochip structure which would
> > be set to 1 when registering the chip, increased and decreased on
> > every operation such as requesting and releasing a GPIO respectively
> > and decreased by gpiochip_remove() too. That way if we call
> > gpiochip_remove() while some users are still holding GPIO descriptors
> > then the only thing that happens is: the reference count for this
> > gpiochip is decreased. Once the final consumer calls the appropriate
> > release routine and the reference count goes to 0, we'd call the
> > actual gpiochip release code. This is similar to what the clock
> > framework does IIRC.
>
> I don't think that is consistent with the device model: there is already
> a struct device inside struct gpio_device which is what gets
> created when the gpio_chip is registered.
>
> The struct device inside struct gpio_device contains a
> struct kobject.
>
> The struct kobject contains a struct kref.
>
> This kref is increased and decreased with get_device() and
> put_device(). This is why in the gpiolib you have a bunch of
> this:
> get_device(&gdev->dev);
> put_device(&gdev->dev);
>
> This is used when creating any descriptor handle with
> [devm_]gpiod_request(), linehandles or lineevents.
>
> So it is already reference counted and there is no need to
> introduce another reference counter on gpio_chips.
>

I think there's one significant detail missing here. While it's true
that the life-time of a device object is controlled by its reference
count, its registration with the driver model is not ie.
device_add/del() are called once per device as opposed to
get/put_device().

> The reason why gpiochip_remove() right now
> enforces removal and only prints a warning if you remove
> a gpio_chip with requested GPIOs on it, is historical.
>

Given the above I think that it wouldn't be possible to add reference
counting to gpio devices without a new kref if the task of the release
callback would be to call device_del() once the provider called its
unregister function and all consumers released requested resources.

> When I created the proper device and char device, the gpiolib
> was really just a library (hence the name) not a driver framework.
> Thus there was no real reference counting of anything
> going on, and it was (as I perceived it) pretty common that misc
> platforms just pulled out the GPIO chip underneath the drivers
> using the GPIO lines.
>
> If we would just block that, say refuse to perform the .remove
> action on the module or platform (boardfile) code implementing
> GPIO I was worried that we could cause serious regressions.
>
> But I do not think this is a big problem?
>
> Most drivers these days are using devm_gpiochip_add_data()
> and that will not release the gpiochip until exactly this same
> kref inside struct device inside gpio_device goes down to
> zero.
>

I believe this is not correct. The resources managed by devres are
released when the device is detached from a driver, not when the
device's reference count goes to 0. When the latter happens, the
device's specific (or its device_type's) release callback is called -
for gpiolib this is gpiodevice_release().

The kref inside struct device will not go down to zero until you call
device_del() (if you previously called device_add() that is which
increases the reference count by a couple points). But what I'm
thinking about is making the call to device_del() depend not on the
call to gpiochip_remove() but on the kref on the gpio device going
down to zero. As for the protection against module removal - this
should be handled by module_get/put().

I may be wrong of course but when looking at the code, this is what I
understand. Please let me know what you think.

Best regards
Bartosz Golaszewski