Re: [PATCH 1/3] sched: add sched_task_call()

From: Josh Poimboeuf
Date: Wed Feb 18 2015 - 12:13:32 EST


On Wed, Feb 18, 2015 at 04:21:00PM +0100, Peter Zijlstra wrote:
> On Tue, Feb 17, 2015 at 03:25:32PM -0600, Josh Poimboeuf wrote:
> > > And I'm assuming you're hard relying on CONFIG_FRAMEPOINTER here,
> > > because otherwise x86 stacks are a mess too.
> >
> > Yeah, it'll rely on CONFIG_FRAME_POINTER. IIUC, the arches we care
> > about now (x86, power, s390, arm64) all have frame pointer support, and
> > the stack formats are all sane, AFAICT.
> >
> > If we ever port livepatch to a more obscure arch for which walking the
> > stack is more difficult, we'll have several options:
> >
> > a) spend the time to ensure the unwinding code is correct and resilient
> > to errors;
> >
> > b) leave the consistency model compiled code out if !FRAME_POINTER and
> > allow users to patch without one (similar to the livepatch code
> > that's already in the livepatch tree today); or
>
> Which has a much more limited set of patches it can do, right?

If we're talking security patches, roughly 9 out of 10 of them don't
change function prototypes, data, or data semantics. If the patch
author is careful, he or she doesn't need a consistency model in those
cases.

So it's not _overly_ limited. If somebody's not happy about those
limitations, they can put in the work for option a :-)

> > > And then hope you can get a better trace next time around? Or will you
> > > fall-back to an alternative method of patching?
> >
> > Yeah, on second thought, we wouldn't have to cancel the patch. We could
> > defer to check the task's stack again at a later time. If it's stuck
> > there, the user can try sending it a signal to unstick it, or cancel the
> > patching process. Those mechanisms are already in place with my
> > consistency model patch set.
> >
> > I'd also do a WARN_ON_ONCE and a dump of the full stack data, since I'm
> > guessing it would either indicate an error in the unwinding code or
> > point us to an unexpected stack condition.
>
> So uhm, what happens if your target task is running? When will you
> retry? The problem I see is that if you do a sample approach you might
> never hit an opportune moment.

We attack it from multiple angles.

First we check the stack of all sleeping tasks. That patches the
majority of tasks immediately. If necessary, we also do that
periodically in a workqueue to catch any stragglers.

The next line of attack is patching tasks when exiting the kernel to
user space (system calls, interrupts, signals), to catch all CPU-bound
and some I/O-bound tasks. That's done in patch 9 [1] of the consistency
model patch set.

As a last resort, if there are still any tasks which are sleeping on a
to-be-patched function, the user can send them SIGSTOP and SIGCONT to
force them to be patched.

If none of those work, the user has the option of either canceling the
patch or just waiting to see if the patching process eventually
completes.

> > > I'm further thinking we don't actually need 2 (or more) different means
> > > of live patching in the kernel. So you all had better sit down (again)
> > > and come up with something you all agree on.
> >
> > Yeah, I also _really_ want to avoid multiple consistency models.
> >
> > In fact, that's a big motivation behind my consistency model patch set.
> > It's heavily inspired by a suggestion from Vojtech [1]. It combines
> > kpatch (backtrace checking) with kGraft (per-thread consistency). It
> > has several advantages over either of the individual approaches. See
> > http://lwn.net/Articles/632582/ where I describe its pros over both
> > kpatch and kGraft.
> >
> > Jiri, would you and Vojtech agree that the proposed consistency model is
> > all we need? Or do you still want to do the multiple consistency model
> > thing?
>
> Skimmed that thread; you all mostly seem to agree that one would be good
> but not quite agree on which one.

Well, I think we at least agreed that LEAVE_PATCHED_SET + SWITCH_THREAD
is the most interesting combination. That's what my patches implement.
Let's wait to see what Jiri and Vojtech say.

> And I note, not all seem to require this stack walking stuff.

True. But without stack walking, you have bigger problems:

1. You have to signal _all_ sleeping tasks to force them to be patched.
That's much more disruptive to the system.

2. There's no way to convert kthreads without either:

a) hacking up all the various kthread loops to call
klp_update_task_universe(); or

b) converting all kthreads' code to be freezeable. Then we could
freeze all tasks to patch them. Not only would it be a lot of
work to make all kthreads freezable, but it would encode within
the kernel the dangerous and non-obvious assumption that the
freeze point is a "safe" place for patching.


[1] https://lkml.org/lkml/2015/2/9/478


--
Josh