Re: [PATCH 1/2] zram: fix crashes due to use of cpu hotplug multistate

From: Greg KH
Date: Thu Apr 08 2021 - 02:16:19 EST


On Thu, Apr 08, 2021 at 03:37:53AM +0200, Thomas Gleixner wrote:
> Greg,
>
> On Fri, Apr 02 2021 at 09:54, Greg KH wrote:
> > On Thu, Apr 01, 2021 at 11:59:25PM +0000, Luis Chamberlain wrote:
> >> As for the sysfs deadlock possible with drivers, this fixes it in a generic way:
> >>
> >> commit fac43d8025727a74f80a183cc5eb74ed902a5d14
> >> Author: Luis Chamberlain <mcgrof@xxxxxxxxxx>
> >> Date: Sat Mar 27 14:58:15 2021 +0000
> >>
> >> sysfs: add optional module_owner to attribute
> >>
> >> This is needed as otherwise the owner of the attribute
> >> or group read/store might have a shared lock used on driver removal,
> >> and deadlock if we race with driver removal.
> >>
> >> Signed-off-by: Luis Chamberlain <mcgrof@xxxxxxxxxx>
> >
> > No, please no. Module removal is a "best effort", if the system dies
> > when it happens, that's on you. I am not willing to expend extra energy
> > and maintenance of core things like sysfs for stuff like this that does
> > not matter in any system other than a developer's box.
> >
> > Lock data, not code please. Trying to tie data structure's lifespans
> > to the lifespan of code is a tangled mess, and one that I do not want to
> > add to in any form.
> >
> > sorry,
>
> Sorry, but you are fundamentally off track here. This has absolutely
> nothing to do with module removal.
>
> The point is that module removal is the reverse operation of module
> insertion. So far so good.
>
> But module insertion can fail. So if you have nested functionalities
> which hang off or are enabled by module insertion then any fail in that
> sequence has to be able to roll back and clean up properly no matter
> what.
>
> Which in turn makes module removal a reverse operation of module
> insertion.
>
> If you think otherwise, then please provide a proper plan how nested
> operations like sysfs - not to talk about more complex things like multi
> instance discovery which can happen inside a module insertion sequence
> can be properly rolled back.
>
> Just declaring that rmmod is evil does not cut it. rmmod is the least of
> the problems. If that fails, then a lot of rollback, failure handling
> mechanisms are missing in the setup path already.
>
> Anything which cannot cleanly rollback no matter whether the fail or
> rollback request happens at insertion time or later is broken by design.
>
> So either you declare module removal as dysfunctional or you stop making
> up semantically ill defined and therefore useless claims about it.
>
> Your argument in:
>
> https://lore.kernel.org/linux-block/YGbNpLKXfWpy0ZZa@xxxxxxxxx/
>
> "Lock data, not code please. Trying to tie data structure's lifespans
> to the lifespan of code is a tangled mess, and one that I do not want to
> add to in any form"
>
> is just useless blurb because the fundamental purpose of discovery code
> is to create the data structures which are tied to the code which is
> associated to it.
>
> Please stop this 'module removal' is not supported nonsense unless you
> can prove a complete independence of module init/discovery code to
> subsequent discovered entities depending on it.
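
The rollback argument quoted above can be illustrated with a minimal
userspace sketch. It is not taken from zram, sysfs, or any patch in this
thread; every name in it (demo_init, demo_exit, the STEP_* values,
demo_set_fail) is hypothetical. It assumes three nested setup steps and
shows that the error path of the init routine undoes completed steps in
reverse order, which is the same sequence a later removal has to run.

```c
#include <stdbool.h>

/* Hypothetical setup steps; the names are illustrative only. */
enum step { STEP_ALLOC, STEP_HOTPLUG, STEP_SYSFS, STEP_COUNT };

static bool step_done[STEP_COUNT];
static int fail_at = -1;	/* step index forced to fail, -1 for none */

/* Choose which step should fail on the next demo_init() call. */
void demo_set_fail(int step)
{
	fail_at = step;
}

/* Bring one step up; returns 0 on success, -1 on the forced failure. */
static int step_up(enum step s)
{
	if ((int)s == fail_at)
		return -1;
	step_done[s] = true;
	return 0;
}

/* Tear one step down. */
static void step_down(enum step s)
{
	step_done[s] = false;
}

/* Number of steps currently set up. */
int steps_active(void)
{
	int i, n = 0;

	for (i = 0; i < STEP_COUNT; i++)
		n += step_done[i] ? 1 : 0;
	return n;
}

/* Insertion: each failure unwinds only what was already set up. */
int demo_init(void)
{
	int err;

	err = step_up(STEP_ALLOC);
	if (err)
		goto out;
	err = step_up(STEP_HOTPLUG);
	if (err)
		goto undo_alloc;
	err = step_up(STEP_SYSFS);
	if (err)
		goto undo_hotplug;
	return 0;

undo_hotplug:
	step_down(STEP_HOTPLUG);
undo_alloc:
	step_down(STEP_ALLOC);
out:
	return err;
}

/* Removal: the full unwind, in the reverse order of demo_init(). */
void demo_exit(void)
{
	step_down(STEP_SYSFS);
	step_down(STEP_HOTPLUG);
	step_down(STEP_ALLOC);
}
```

Calling demo_set_fail(STEP_SYSFS) before demo_init() makes the last step
fail and leaves nothing set up, while a successful demo_init() followed
by demo_exit() walks the same teardown from the top.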

Ok, but to be fair, trying to add the crazy hacks that were being
proposed to sysfs for something that is obviously not a code path that
can be taken by a normal user or operation is just not going to fly.

Removing a module from a system has been a "let's try it and see!"
type of operation for a very long time. While we try our best, doing
horrible hacks for this rare type of thing is generally not considered
a good idea, which is why I said that.

thanks,

greg k-h