Re: [RFC PATCH 0/9] livepatch: consistency model

From: Josh Poimboeuf
Date: Mon Feb 09 2015 - 22:06:04 EST


On Tue, Feb 10, 2015 at 12:15:21AM +0100, Jiri Kosina wrote:
> On Mon, 9 Feb 2015, Josh Poimboeuf wrote:
> > This patch set implements a livepatch consistency model, targeted for 3.21.
> > Now that we have a solid livepatch code base, this is the biggest remaining
> > missing piece.
>
> Hi Josh,
>
> first, thanks a lot for putting this together. From a cursory look it
> certainly seems to be a very solid base for future steps.
>
> I am afraid I won't get to a proper review before the merge window
> concludes, though. But after that it gets moved to the top of my TODO
> list.

No problem. Sorry for the inconvenient timing...

> > This code stems from the design proposal made by Vojtech [1] in November. It
> > makes live patching safer in general. Specifically, it allows you to apply
> > patches which change function prototypes. It also lays the groundwork for
> > future code changes which will enable data and data semantic changes.
> >
> > It's basically a hybrid of kpatch and kGraft, combining kpatch's backtrace
> > checking with kGraft's per-task consistency. When patching, tasks are
> > carefully transitioned from the old universe to the new universe. A task can
> > only be switched to the new universe if it's not using a function that is to be
> > patched or unpatched. After all tasks have moved to the new universe, the
> > patching process is complete.
> >
> > How it transitions various tasks to the new universe:
> >
> > - The stacks of all sleeping tasks are checked. Each task that is not sleeping
> > on a to-be-patched function is switched.
> >
> > - Other user tasks are handled by do_notify_resume() (see patch 9/9). If a
> > task is I/O bound, it switches universes when returning from a system call.
> > If it's CPU bound, it switches when returning from an interrupt.
>
> Just one rather minor comment on this -- we can actually switch CPU-bound
> processes "immediately" when we notice they are running in userspace
> (assuming that we also migrate them when they enter the kernel ... which
> doesn't seem to be implemented by this patchset, but could easily be added
> at low cost).

We could, but I guess the trick is figuring out how to tell whether the
task is in user space. Anyway, I don't really see why it would be
necessary.

> Relying on IRQs is problematic, because you can have a CPU completely
> isolated from both the scheduler and IRQs (that's something realtime folks
> do routinely), so you might not see an IRQ on that particular CPU for ages.

It doesn't _rely_ on IRQs; they're just another tool in the kit to help
tasks converge quickly. The front line of attack is backtrace checking
of sleeping tasks. System call and IRQ returns are the next wave, with
signals as the last resort for anything that still hasn't converged.
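
For reference, the first wave is conceptually something like this (a
simplified sketch, not the actual patch code -- klp_stack_in_patched_func()
and the klp_universe/KLP_UNIVERSE_NEW names are illustrative, not
necessarily what the patches use):

#include <linux/sched.h>

/*
 * Illustrative sketch: walk all tasks; any task whose stack contains
 * no to-be-patched (or to-be-unpatched) function can be flipped to
 * the new universe right away.
 */
static void klp_try_switch_sleeping_tasks(void)
{
        struct task_struct *g, *task;

        read_lock(&tasklist_lock);
        for_each_process_thread(g, task) {
                /*
                 * klp_stack_in_patched_func() is a made-up helper;
                 * assume it returns true ("don't switch") both for
                 * tasks sitting in a patched function and for tasks
                 * currently running, whose stacks can't be checked
                 * safely.
                 */
                if (!klp_stack_in_patched_func(task))
                        task->klp_universe = KLP_UNIVERSE_NEW;
        }
        read_unlock(&tasklist_lock);
}

Whatever doesn't get switched there is picked up later at syscall exit
or IRQ return via do_notify_resume() (patch 9/9), with a signal as the
final nudge through that same path.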

> How to detect whether a given CPU is running in userspace (without
> interfering with it too much by, say, sending a costly IPI) is rather
> tricky though. On kernels with CONFIG_CONTEXT_TRACKING we could make use
> of that feature, but my gut feeling is that most people keep it
> disabled.

Yeah, that seems to be tied to nohz. I think we'd have to keep it
enabled 100% of the time on all CPUs, even when not patching, which
sounds like a lot of unnecessary overhead (unless the user already has
it enabled everywhere).
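
If it were kept on everywhere, though, the check itself would presumably
be cheap -- roughly something like this (sketch only; the per-cpu struct
and enum names vary between kernel versions, so treat them as assumptions
rather than the real interface):

#include <linux/context_tracking_state.h>

/* Does the context tracking state say this CPU last entered user mode? */
static bool cpu_in_userspace(int cpu)
{
        return per_cpu(context_tracking, cpu).state == IN_USER;
}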

> Another alternative is what we are doing in kgraft with
> kgr_needs_lazy_migration(), but admittedly that's very far from being
> pretty.

Hm, is it really safe to read a stack while the task could be writing to
it?

--
Josh