Re: INFO: REPRODUCED: memory leak in gpio device in 6.2-rc6

From: Andy Shevchenko
Date: Tue Feb 21 2023 - 10:42:04 EST


On Tue, Feb 21, 2023 at 02:52:38PM +0100, Mirsad Goran Todorovac wrote:
> On 20. 02. 2023. 14:43, Andy Shevchenko wrote:
> > On Mon, Feb 20, 2023 at 02:10:00PM +0100, Mirsad Todorovac wrote:
> > > On 2/16/23 15:16, Bartosz Golaszewski wrote:

...

> > > As Mr. McKenney once said, a bunch of monkeys with keyboard could
> > > have done it in a considerable number of trials and errors ;-)
> > >
> > > But here I have something that could potentially leak as well. I could not devise a
> > > reproducer due to the leak being lightly triggered only in extreme memory contention.
> > >
> > > See it for yourself:
> > >
> > > drivers/gpio/gpio-sim.c:
> > > 301 static int gpio_sim_setup_sysfs(struct gpio_sim_chip *chip)
> > > 302 {
> > > 303 struct device_attribute *val_dev_attr, *pull_dev_attr;
> > > 304 struct gpio_sim_attribute *val_attr, *pull_attr;
> > > 305 unsigned int num_lines = chip->gc.ngpio;
> > > 306 struct device *dev = chip->gc.parent;
> > > 307 struct attribute_group *attr_group;
> > > 308 struct attribute **attrs;
> > > 309 int i, ret;
> > > 310
> > > 311 chip->attr_groups = devm_kcalloc(dev, sizeof(*chip->attr_groups),
> > > 312 num_lines + 1, GFP_KERNEL);
> > > 313 if (!chip->attr_groups)
> > > 314 return -ENOMEM;
> > > 315
> > > 316 for (i = 0; i < num_lines; i++) {
> > > 317 attr_group = devm_kzalloc(dev, sizeof(*attr_group), GFP_KERNEL);
> > > 318 attrs = devm_kcalloc(dev, GPIO_SIM_NUM_ATTRS, sizeof(*attrs),
> > > 319 GFP_KERNEL);
> > > 320 val_attr = devm_kzalloc(dev, sizeof(*val_attr), GFP_KERNEL);
> > > 321 pull_attr = devm_kzalloc(dev, sizeof(*pull_attr), GFP_KERNEL);
> > > 322 if (!attr_group || !attrs || !val_attr || !pull_attr)
> > > 323 return -ENOMEM;
> > > 324
> > > 325 attr_group->name = devm_kasprintf(dev, GFP_KERNEL,
> > > 326 "sim_gpio%u", i);
> > > 327 if (!attr_group->name)
> > > 328 return -ENOMEM;
> > >
> > > Apparently, if the memory allocation only partially succeeds, in the theoretical case
> > > that the system is close to its kernel memory exhaustion, `return -ENOMEM` would not
> > > free the partially succeeded allocs, would it?
> > >
> > > To explain it better, I tried a version that is not yet full doing "all or nothing"
> > > memory allocation for the gpio-sim driver, because I am not that familiar with the
> > > driver internals.
> >
> > devm_*() mean that the resource allocation is made in a managed manner, so when
> > it's done, it will be freed automatically.
>
> Didn't see that one coming ... :-/ "buzzing though the bush ..."
>
> > The question is: is the lifetime of the attr_groups should be lesser or the
> > same as chip->gc.parent? Maybe it's incorrect to call devm_*() in the first place?
>
> Bona fide said, I hope that automatic deallocation does things in the right order.
> I've realised that devm_kzalloc() calls devm_kmalloc() that registers allocations on
> a per driver list. But I am not sure how chip->gc was allocated?
>
> Here is said it is allocated in drivers/gpio/gpio-sim.c:386 in gpio_sim_add_bank(),
> as a part of
>
> struct gpio_sim_chip *chip;
> struct gpio_chip *gc;
>
> gc = &chip->gc;
>
> and gc->parent is set to
>
> gc->parent = dev;
>
> in line 420, which appears called before gpio_sim_setup_sysfs() and the lines above.
>
> If I understood well, automatic deallocation on unloading the driver goes
> in the reverse order, so lifetime of chip appears to be longer than attr_groups,
> but I am really not that good at this ...

So, the device is instantiated by platform_device_register_full().

It should gone with the platform_device_unregister().

In case of CONFIG_DEBUG_KOBJECT_RELEASE=y the ->release() can be called
asynchronously.

So, there are following questions:
- is the put_device() is actually called?
- is the above mentioned option is set to Y?
- if it's in Y, does kmemleak take it into account?
- if no, do you get anything new in `dmesg` when enable it?

> > Or maybe the chip->gc.parent should be changed to something else (actual GPIO
> > device, but then it's unclear how to provide the attributes in non-racy way
> Really, dunno. I have to repeat that my learning curve cannot adapt so quickly.
>
> I merely gave the report of KMEMLEAK, otherwise I am not a Linux kernel
> device expert nor would be appropriate to try the craft not earned ;-)

--
With Best Regards,
Andy Shevchenko