Re: [PATCH v2] nvmem: core: Fix race in nvmem_register()

From: Russell King (Oracle)
Date: Tue Jan 03 2023 - 10:23:26 EST


On Wed, Jan 04, 2023 at 12:14:10AM +0900, Hector Martin wrote:
> On 04/01/2023 00.06, Russell King (Oracle) wrote:
> > Hi Hector,
> >
> > On Tue, Jan 03, 2023 at 10:48:52PM +0900, Hector Martin wrote:
> >>>> @@ -822,11 +822,8 @@ struct nvmem_device *nvmem_register(const struct nvmem_config *config)
> >>>> break;
> >>>> }
> >>>>
> >>>> - if (rval) {
> >>>> - ida_free(&nvmem_ida, nvmem->id);
> >>>> - kfree(nvmem);
> >>>> - return ERR_PTR(rval);
> >>>> - }
> >>>> + if (rval)
> >>>> + goto err_gpiod_put;
> >>>
> >>> Why was gpiod changes added to this patch, that should be a separate
> >>> patch/discussion, as this is not relevant to the issue that you are
> >>> reporting.
> >>
> >> Because freeing the device also does a gpiod_put in the destructor, so
> >> doing this is correct in every other instance below and maintains
> >> existing behavior, and it just so happens that this instance converges
> >> into the same codepath so it is correct to merge it, and it just so
> >> happens that the gpiod put was missing in this path to begin with so
> >> this becomes a drive-by bugfix.
> >>
> >> If you don't like it I can remove it (i.e. reintroduce the bug for no
> >> good reason) and you can submit this fix yourself, because I have no
> >> incentive to waste time submitting a separate patch to fix a GPIO leak
> >> in an error path corner case in a subsystem I don't own and I have much
> >> bigger things to spend my (increasingly lower and lower) willingness to
> >> fight for upstream submissions than this.
> >>
> >> Seriously, what is wrong with y'all kernel people. No other open source
> >> project wastes contributors' time with stupid nitpicks like this. I
> >> found a bug, I fixed it, I then fixed the issues you pointed out, and I
> >> don't have the time nor energy to fight over this kind of nonsense next.
> >> Do you want bugs fixed or not?
> >
> > This is not nonsense. We have always had a policy of one fix/change
> > per patch, and in this case it makes complete and utter sense. Of
> > course, the interpretation of "one change" is a matter of opinion.
>
> The change here is the race condition fix. That change involves adding
> an error cleanup path that involves a gpio_put(). Therefore it seems
> logical to actually use it in that one extra case that should've used it
> anyway, a few lines above.

The two are entirely unrelated. as I've already explained. The call
to device_register() happens _after_ the check for rval from the
dev_set_name() that you are changing. Moving device_register() doesn't
make the lack of gpiod_put() any better or worse than it was before.

That said, I'm now thinking that my patch is actually wrong, but for
a different reason unrelated to the gpiod issue. :(

--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!