Re: x86: clean up smpboot.c's use of udelay+schedule

From: Paul E. McKenney
Date: Wed Feb 01 2012 - 19:35:19 EST


On Tue, Jan 31, 2012 at 02:01:56PM +0100, Peter Zijlstra wrote:
> On Tue, 2012-01-31 at 13:53 +0100, Ingo Molnar wrote:
> > Wanna give a short TODO list to anyone wanting to work on that?
>
> I paged out most details again, but it goes something like:
>
> - read and understand the current generic code
>
> - and all architecture code, at which point you'll probably boggle
> at all the similarities that are all subtly different (there's
> about 3 actually different ways in the arch code).
>
> - pick one, preferably one that keeps additional state and doesn't
> fully rely on the online bits and pull it into generic code and
> provide a small vector of arch specific functions.
>
> - convert all archs over.
>
>
> Also related:
>
> - figure out why cpu_down needs kstopmachine, I'm not sure it does..
> we should be able to tear down a cpu using synchronize_sched() and a
> single stop_one_cpu(). (someday when there's time I might actually
> try to implement this).

Currently, a number of the CPU_DYING notifiers assume that they are
running in stop-machine context, including those of RCU.

However, this is not an inherent property of RCU -- DYNIX/ptx's
CPU-offline process did not stop the whole machine, after all, and RCU
(we called it rclock, but whatever) was happy with this arrangement.
In fact, if the outgoing CPU could be made to stop in that context
instead of returning to the scheduler and the idle loop, it would make
my life a bit easier.

My question is why aren't the notifiers executed in the opposite
order going down and coming up, with the coming-up order matching the
boot order? Also, why can't the CPU's exit from this world be driven
out of the idle loop? That way, the CPU wouldn't mark itself offline
(thus in theory to be ignored by RCU), and then immediately dive into
the scheduler and who knows what all else, using RCU all the time. ;-)
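
To make the ordering question concrete, here is a minimal user-space
sketch, not kernel code. It assumes a fixed table of three steps; the
names run_step(), bring_up() and take_down() are hypothetical. Steps run
in table (boot) order on the way up and in the opposite order on the way
down, and the log records the order actually taken.

```c
#include <stddef.h>

#define NR_STEPS 3   /* hypothetical number of per-CPU setup steps */
#define LOG_MAX  16

/* Records the index of each step in the order it was run. */
static int order_log[LOG_MAX];
static size_t order_len;

static void run_step(int idx)
{
    if (order_len < LOG_MAX)
        order_log[order_len++] = idx;
}

/* Coming up: run the steps in table order, matching boot order. */
void bring_up(void)
{
    for (int i = 0; i < NR_STEPS; i++)
        run_step(i);
}

/* Going down: run the same steps in the opposite order. */
void take_down(void)
{
    for (int i = NR_STEPS - 1; i >= 0; i--)
        run_step(i);
}
```

With this arrangement, whatever was set up last is torn down first, so
no step runs after something it depends on has already gone away.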

(RCU handles this by keeping a separate set of books for online CPUs.
It considers a CPU online at CPU_UP_PREPARE time, and doesn't consider
it offline until CPU_DEAD time. To handle the grace periods between,
force_quiescent_state() allows the grace period to run a few jiffies
before checking cpu_online_map, which allows a given CPU to safely use
RCU for at least one jiffy before marking itself online and for at least
one jiffy after marking itself offline.)
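
The "separate set of books" can be modeled roughly as below. This is a
user-space sketch under stated assumptions, not RCU's implementation:
struct cpu_books, books_update() and rcu_must_wait_for() are
hypothetical names, the events are simplified stand-ins for the hotplug
notifications, and the point at which the hotplug-visible bit is cleared
is assumed to be the dying step. The jiffy-based delay in
force_quiescent_state() is not modeled.

```c
#include <stdbool.h>

/* Simplified stand-ins for the hotplug notifications. */
enum books_event { BK_UP_PREPARE, BK_ONLINE, BK_DYING, BK_DEAD };

struct cpu_books {
    bool hotplug_online; /* stand-in for the CPU's bit in cpu_online_map */
    bool rcu_online;     /* RCU's own, more conservative view */
};

void books_update(struct cpu_books *b, enum books_event ev)
{
    switch (ev) {
    case BK_UP_PREPARE:
        b->rcu_online = true;      /* RCU counts the CPU early ... */
        break;
    case BK_ONLINE:
        b->hotplug_online = true;
        break;
    case BK_DYING:
        b->hotplug_online = false; /* assumption: bit cleared here */
        break;
    case BK_DEAD:
        b->rcu_online = false;     /* ... and forgets it late */
        break;
    }
}

/* A grace period must wait for every CPU still in RCU's books. */
bool rcu_must_wait_for(const struct cpu_books *b)
{
    return b->rcu_online;
}
```

In this model RCU's window always covers the hotplug-visible one, so a
CPU that is still coming up or already going down is never ignored while
it might be using RCU.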

Yet another question is about races between CPU-hotplug events and
registering/unregistering cpu notifiers. I don't believe that the
current code does what you would like in all cases. The only way
I can imagine it really working would be to use generation numbers,
so that once a CPU-hotplug event started, it would invoke only those
notifiers marked with the generation that was in effect when the
event started, or with some earlier generation.
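
One way the generation-number idea might look, as a user-space sketch
only. All names here (hp_register(), hp_event_begin(), hp_event_call(),
note_a(), note_b()) are hypothetical, there is no locking, and
unregistration is left out. Each notifier is stamped with the generation
current at registration; starting an event snapshots that generation and
advances it, so anything registered mid-event is skipped until the next
event.

```c
#include <stddef.h>

#define MAX_NOTIFIERS 16

typedef void (*hp_notifier_fn)(int cpu);

struct hp_notifier {
    hp_notifier_fn fn;
    unsigned long gen;   /* generation in effect at registration */
};

static struct hp_notifier hp_chain[MAX_NOTIFIERS];
static size_t hp_count;
static unsigned long hp_gen;

/* Register a notifier, stamping it with the current generation. */
int hp_register(hp_notifier_fn fn)
{
    if (hp_count == MAX_NOTIFIERS)
        return -1;
    hp_chain[hp_count].fn = fn;
    hp_chain[hp_count].gen = hp_gen;
    hp_count++;
    return 0;
}

/* Start an event: return its generation, then advance the counter so
 * that later registrations belong to a newer generation. */
unsigned long hp_event_begin(void)
{
    return hp_gen++;
}

/* Invoke only notifiers from the event's generation or earlier.
 * Returns the number of notifiers invoked. */
int hp_event_call(unsigned long event_gen, int cpu)
{
    int called = 0;

    for (size_t i = 0; i < hp_count; i++) {
        if (hp_chain[i].gen <= event_gen) {
            hp_chain[i].fn(cpu);
            called++;
        }
    }
    return called;
}

/* Two example notifiers that just count their invocations. */
static int hits_a, hits_b;
void note_a(int cpu) { (void)cpu; hits_a++; }
void note_b(int cpu) { (void)cpu; hits_b++; }
```

A notifier registered after hp_event_begin() is not seen by that event,
but is picked up by the next one.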

Hey, you asked!!!

Thanx, Paul
