Re: [RFC PATCH] livepatch: allow removal of a disabled patch

From: Miroslav Benes
Date: Wed May 04 2016 - 10:35:38 EST


On Wed, 4 May 2016, Josh Poimboeuf wrote:

> On Wed, May 04, 2016 at 01:58:47PM +0200, Miroslav Benes wrote:
> > On Tue, 3 May 2016, Josh Poimboeuf wrote:
> >
> > > On Tue, May 03, 2016 at 09:39:48PM -0500, Josh Poimboeuf wrote:
> > > > On Wed, May 04, 2016 at 12:31:12AM +0200, Jiri Kosina wrote:
> > > > > On Tue, 3 May 2016, Josh Poimboeuf wrote:
> > > > >
> > > > > > > 1. Do we really need a completion? If I am not missing something
> > > > > > > kobject_del() always waits for sysfs callers to leave thanks to kernfs
> > > > > > > active protection.
> > > > > >
> > > > > > What do you mean by "kernfs active protection"? I see that
> > > > > > kernfs_remove() gets the kernfs_mutex lock, but I can't find anywhere
> > > > > > that a write to a sysfs file uses that lock.
> > > > > >
> > > > > > I'm probably missing something...
> > > > >
> > > > > I don't want to speak on Miroslav's behalf, but I'm pretty sure that what
> > > > > he has in mind is the per-kernfs_node active refcounting kernfs does (see
> > > > > kernfs_node->active, and especially its usage in __kernfs_remove()).
> > > > >
> > > > > More specifically, execution of store() and show() sysfs callbacks is
> > > > > guaranteed (by kernfs) to happen with that particular attribute's active
> > > > > reference held for reading (and that makes it impossible for that
> > > > > attribute to vanish prematurely).
> > > >
> > > > Thanks, that makes sense.
> > > >
> > > > So what exactly is the problem the completion is trying to solve? Is it
> > > > to ensure that the kobject has been cleaned up before it returns to the
> > > > caller, in case the user wants to call klp_register() again after
> > > > unregistering?
> > > >
> > > > If so, that's quite an unusual use case which I think we should just
> > > > consider unsupported. In fact, if you try to do it, kobject_init()
> > > > complains loudly because kobj->state_initialized is still 1 because
> > > > kobjects aren't meant to be reused like that.
> > >
> > > ... and now I realize the point is actually to prevent the caller from
> > > freeing klp_patch before kobject_cleanup() runs.
> >
> > Exactly. Sorry I was so brief.
> >
> > > So yeah, it looks like we need the completion in case
> > > CONFIG_DEBUG_KOBJECT_RELEASE is enabled.
> > >
> > > Or alternatively we could convert patch->kobj to be dynamically
> > > allocated instead of embedded in klp_patch.
> >
> > But that wouldn't help, would it? Dynamic kobjects register the generic
> > release function dynamic_kobj_release() and that's it. We're in the same
> > situation. I have a feeling that dynamic kobjects are only for trivial
> > cases.
>
> But the patch release doesn't need to do anything, right?

That is correct. I wanted to point out that dynamic_kobj_release() does not
really solve our "completion" issue. If there is a problem in our code and
we need a completion, dynamic kobjects would not help. If we don't need a
completion at all, we could move to dynamic kobjects.

There is still container_of() though.
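
To illustrate the container_of() point, here is a minimal userspace sketch.
It is not kernel code: the fake_* structures and helper names are
hypothetical stand-ins, and the macro is a simplified copy of the kernel's
pointer arithmetic. It only shows why the lookup depends on the kobject
being embedded.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel macro; same pointer arithmetic. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, stripped-down models; not the real kernel structures. */
struct fake_kobject {
	int refcount;
};

struct fake_patch_embedded {
	int enabled;
	struct fake_kobject kobj;	/* embedded: lives inside the patch */
};

/*
 * With a separately allocated kobject the patch only holds a pointer.
 * The kobject is then at no fixed offset from the patch, so
 * container_of() cannot map a kobject back to its patch; some other
 * back-reference would be needed.
 */
struct fake_patch_dynamic {
	int enabled;
	struct fake_kobject *kobj;
};

/* Works because the kobject sits at a fixed offset inside the patch. */
static struct fake_patch_embedded *to_patch(struct fake_kobject *kobj)
{
	return container_of(kobj, struct fake_patch_embedded, kobj);
}

/* Returns 1 if the embedded lookup round-trips to the original patch. */
static int embedded_lookup_roundtrips(void)
{
	struct fake_patch_embedded patch = { .enabled = 1 };

	return to_patch(&patch.kobj) == &patch;
}
```

With the embedded layout, a sysfs callback that is handed only the kobject
can get back to its patch. With the pointer layout that mapping is lost.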

> > Moreover we use container_of() several times in the code and that does not
> > work with dynamically allocated kobjects.
> >
> > Anyway, I am really confused now. When I read the changelog of c817a67ecba7
> > ("kobject: delayed kobject release: help find buggy drivers") it all makes
> > perfect sense. But isn't our situation somewhat special, because we have
> > refcounts completely under control? So we know that once we call
> > kobject_put() we can let a patch go... I must be missing something.
> >
> > It does not make sense to introduce completion just to satisfy a feature
> > which was introduced to debug general cases.
>
> I think our situation is "special" because klp_patch and its embedded
> kobject have separate lifetimes. We have a kobject, which we own, which
> is embedded in a klp_patch struct, which we don't own.
>
> If I understand correctly, normally when you release a kobject, the
> containing struct gets freed. But we can't do that here because the
> caller allocated the klp_patch.

Normally only struct kobject is freed, no?

From Documentation/kobject.txt:

"One important point cannot be overstated: every kobject must have a
release() method, and the kobject must persist (in a consistent state)
until that method is called. If these constraints are not met, the code is
flawed."

So we only need to make sure that klp_patch does not disappear before the
release() method is called. In our case that cannot happen even without a
completion, because we call kobject_put() in klp_unregister_patch() when
the patch module is going away, and kobject_put() calls our (potentially
empty) release method synchronously. If I read that code correctly.

The only case where this does not hold is CONFIG_DEBUG_KOBJECT_RELEASE=y.
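
The difference between the two cases can be modelled in userspace. The
sketch below is only an approximation under stated assumptions: all fake_*
names are hypothetical, and a single "pending" pointer stands in for the
delayed work the debug option schedules. It shows that with a synchronous
put the release method has run by the time the put returns, while with the
delayed variant it has not, so the containing structure must stay around
longer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a kobject; not the real kernel structure. */
struct fake_kobject {
	int refcount;
	int released;				/* set by release() */
	void (*release)(struct fake_kobject *kobj);
};

static struct fake_kobject *pending;	/* models the delayed work item */
static int delay_release;		/* models the debug option being =y */

static void fake_release(struct fake_kobject *kobj)
{
	kobj->released = 1;
}

/* Drop a reference; on the last one, clean up now or defer it. */
static void fake_kobject_put(struct fake_kobject *kobj)
{
	if (--kobj->refcount > 0)
		return;
	if (delay_release)
		pending = kobj;		/* cleanup runs some time later */
	else
		kobj->release(kobj);	/* cleanup runs before put returns */
}

/* Models the delayed work finally firing. */
static void fake_run_delayed_work(void)
{
	if (pending) {
		pending->release(pending);
		pending = NULL;
	}
}

/* Returns whether release() had already run when the put returned. */
static int released_when_put_returns(int delayed)
{
	struct fake_kobject kobj = { .refcount = 1, .release = fake_release };
	int done;

	delay_release = delayed;
	fake_kobject_put(&kobj);
	done = kobj.released;

	/* Flush the deferred cleanup before kobj goes out of scope. */
	fake_run_delayed_work();
	assert(kobj.released);
	return done;
}
```

In the delayed case, freeing the containing structure right after the put
would leave the deferred cleanup touching freed memory. That is the window
a completion would close.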

> So I really get the feeling that a dynamic kobject would be more
> appropriate here.

See above.

> That said, the sysfs and kobject stuff always throws me for a loop. So
> take what I'm saying with several grains of salt.

Tell me about it.

Miroslav