Re: [PATCH 3/3] livepatch: force transition process to finish

From: Josh Poimboeuf
Date: Fri May 26 2017 - 13:38:14 EST


On Thu, May 25, 2017 at 06:03:07PM +0200, Petr Mladek wrote:
> On Thu 2017-05-25 14:59:55, Miroslav Benes wrote:
> >
> > > > > In fact, I would suggest to take klp_mutex in force_store()
> > > > > and do all actions synchronously, including the check
> > > > > of klp_transition_patch.
> > > >
> > > > I still think it is better not do it. klp_unmark_tasks() does nothing else
> > > > than tasks already do. They call klp_update_patch_state() by themselves
> > > > and they do not grab klp_mutex lock for doing that. klp_unmark_tasks()
> > > > only forces this action.
> > >
> > > You have a point. But I am not convinced ;-) klp_update_patch_state()
> > > was called very carefully only when it was safe. The forcing
> > > intentionally breaks the consistency model. User should really know
> > > what they are doing when they use this feature.
> > >
> > > I think that we should actually taint the kernel. Developers should
> > > know when users were pulling their legs.
> >
> > We could do that. I can change pr_warn() to WARN_ON_ONCE(), which would of
> > course taint the kernel.
>
> Sounds good to me.

I'm thinking that WARN_ON_ONCE() seems too severe. If the patch didn't
need a consistency model in the first place then it wouldn't be worth
warning about.

We have to trust that the user knows what they're doing. And that's
true for the entire live patching process, including patch analysis and
patch creation. And anyway we already have a taint flag for that:
TAINT_LIVEPATCH.

> > > > On the other hand, I do not see a problem in doing that. We already have a
> > > > relationship between klp_mutex and tasklist_lock defined elsewhere, so it
> > > > is safe.
> > >
> > > Yup.
> > >
> > > > It would only serialize things needlessly.
> > >
> > > I do not agree. The speed is not important here. Also look
> > > into klp_reverse_transition(). We explicitly clear all
> > > TIF_PATCH_PENDING flags and call synchronize_rcu() just
> > > to make the situation easier and reduce space for potential
> > > mistakes.
> >
> > Yes, because we had to do that. We ran into problems otherwise. We do not
> > have to do it here. It does not help anything in my opinion.
>
> AFAIK, we did not have to do it, see
> https://lkml.kernel.org/r/20161222143452.GK25166@xxxxxxxxxxxxxxx
> and the comment starting with "It would still leave a small".
>
> Just for record, the idea of disabling the TIF flags came from Josh
> in another mail. I have just repeated it.
>
> I think that the problem already is complex enough and the
> serialization would reduce the space of potential races.
> But it is possible that I see it just too complex here.

IMO we can skip the mutex. The consistency model will be broken anyway,
so all bets are off.

--
Josh