Re: [RFC PATCH 0/9] livepatch: consistency model

From: Jiri Kosina
Date: Tue Feb 10 2015 - 02:21:43 EST


On Mon, 9 Feb 2015, Josh Poimboeuf wrote:

> > Detecting whether a given CPU is running in userspace (without
> > interfering with it too much, like, say, sending a costly IPI) is
> > rather tricky though. On kernels with CONFIG_CONTEXT_TRACKING we
> > could make use of that feature, but my gut feeling is that most people
> > keep that disabled.

> Yeah, that seems to be related to nohz. I think we'd have to have it
> enabled 100% of the time on all CPUs, even when not patching. Sounds
> like a lot of unnecessary overhead (unless the user already has it
> enabled on all CPUs).

Agreed, we could make use of it when it's already enabled in the kernel
config, but it would be impractical for us to make it a hard requirement.

> > Another alternative is what we are doing in kgraft with
> > kgr_needs_lazy_migration(), but admittedly that's very far from being
> > pretty.
>
> Hm, is it really safe to read a stack while the task could be writing to
> it?

It might indeed look unsafe at first sight :) but let's look at the
possible race scenarios:

(1) task is running in userspace when you start looking at its kernel
stack, and while you are examining it, it enters the kernel. That's
not a problem, because no matter what verdict kgr_needs_lazy_migration()
yields, the migration to the new universe happens during kernel entry
anyway.

(2) task is actively running in kernelspace. There is no way for
print_context_stack() to come up with such a small nr_entries.
The stack contents might be bogus due to the race, but the walk always
starts at a valid bp which can't be that low.

(3) task is running in kernelspace, but is about to exit to userspace, and
looking at the kernel stack races with this. That's again not a
problem, because no matter what verdict kgr_needs_lazy_migration()
yields, the migration to the new universe happens during kernel exit
anyway.
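
To make the argument above a bit more concrete, here is a minimal userspace
model of it. This is only a sketch and not the kgraft code: struct
task_model, patcher_visit(), cross_kernel_boundary() and the threshold
USERSPACE_MAX_ENTRIES are all made up for illustration, and the real
threshold would depend on the architecture's stack walker. The point it
shows is that the verdict only decides whether the patcher migrates the
task right away; the kernel entry/exit path migrates it regardless.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* ASSUMPTION: a task sitting in userspace shows at most this many entries. */
#define USERSPACE_MAX_ENTRIES 2

/* Hypothetical model of a task; not the kernel's struct task_struct. */
struct task_model {
	size_t nr_entries;	/* what a (possibly racy) stack walk returned */
	bool in_new_universe;	/* has the task been migrated yet? */
};

/* Stand-in for the heuristic: a deep stack means "in the kernel, wait". */
static bool needs_lazy_migration(const struct task_model *t)
{
	assert(t != NULL);
	return t->nr_entries > USERSPACE_MAX_ENTRIES;
}

/* Patcher side: migrate right away only if the task looks like userspace. */
static void patcher_visit(struct task_model *t)
{
	if (!needs_lazy_migration(t))
		t->in_new_universe = true;
}

/* Task side: crossing the kernel boundary migrates unconditionally. */
static void cross_kernel_boundary(struct task_model *t)
{
	assert(t != NULL);
	t->in_new_universe = true;
}
```

In scenarios (1) and (3) the task calls cross_kernel_boundary() around the
time patcher_visit() runs, so it ends up in the new universe whichever
verdict the racy walk produced. In scenario (2) the walk reports a deep
stack, so the patcher leaves the task alone until it reaches the boundary.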

So I agree that this is ugly as hell, and depends on the
architecture-specific implementation of print_context_stack(); but
architectures are free to give up this optimization if it can't be used.

But yes, we should be able to come up with something better if we want to
use this optimization upstream.

Thanks,

--
Jiri Kosina
SUSE Labs