Re: [git pull for -mm] CPU isolation extensions (updated2)

From: Max Krasnyansky
Date: Wed Feb 13 2008 - 00:48:55 EST




Steven Rostedt wrote:
> On Tue, 12 Feb 2008, Peter Zijlstra wrote:
>>> Rusty - Stop machine.
>>> After doing a bunch of testing last three days I actually downgraded stop machine
>>> changes from [highly experimental] to simply [experimental]. Pleas see this thread
>>> for more info: http://marc.info/?l=linux-kernel&m=120243837206248&w=2
>>> Short story is that I ran several insmod/rmmod workloads on live multi-core boxes
>>> with stop machine _completely_ disabled and did no see any issues. Rusty did not get
>>> a chance to reply yet, I hopping that we'll be able to make "stop machine" completely
>>> optional for some configurations.
>
> This part really scares me. The comment that you say you have run several
> insmod/rmmod workloads without kstop_machine doesn't mean that it is still
> safe. A lot of races that things like this protect may only happen under
> load once a month. But the fact that it happens at all is reason to have
> the protection.
>
> Before taking out any protection, please analyze it in detail and report
> your findings why something is not needed. Not just some general hand
> waving and "it doesn't crash on my box".
Sure. I did not say lets disable it. I was hopping we could and I wanted to see what Rusty
Russell has to say about this.

> Besides that, kstop_machine may be used by other features that can have an
> impact.
Yes it is. I missed a few. Nick and Dave already pointed out CPU hotplug.
I looked around and found more users. So disabling stop machine completely is definitely out.

> Again, if you have a system that cant handle things like kstop_machine,
> than don't do things that require a kstop_machine run. All modules should
> be loaded, and no new modules should be added when the system is
> performing critical work. I see no reason for disabling kstop_machine.
I'm considering that option. So far it does not seem practical. At least the way we use those
machines at this point. If we can prove that at least not halting isolation CPUs is safe that'd
be better.

Max
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/