Re: [patch 4/5] sched: RCU sched domains

From: Nick Piggin
Date: Wed Apr 06 2005 - 03:04:01 EST


Ingo Molnar wrote:
* Nick Piggin <nickpiggin@xxxxxxxxxxxx> wrote:


4/5


One of the problems with the multilevel balance-on-fork/exec is that it needs to jump through hoops to satisfy sched-domain's locking semantics (that is, you may traverse your own domain when not preemptable, and you may traverse others' domains when holding their runqueue lock).

balance-on-exec had to potentially migrate between more than one CPU before finding a final CPU to migrate to, and balance-on-fork needed to potentially take multiple runqueue locks.

So bite the bullet and make sched-domains go completely RCU. This actually simplifies the code quite a bit.

Signed-off-by: Nick Piggin <nickpiggin@xxxxxxxxxxxx>


i like it conceptually, so:

Acked-by: Ingo Molnar <mingo@xxxxxxx>


Oh good, thanks.

from now on, all domain-tree readonly uses have to be rcu_read_lock()-ed (or otherwise have to be in a non-preemptible section). But there's a bug in show_schedstat() which does a for_each_domain() from within a preemptible section. (It was a bug with the current hotplug logic too i think.)


Ah, thanks. That looks like a bug in the code with the locking
we have now too...

At a minimum i think we need the fix+comment below.


Well if we say "this is actually RCU", then yes. And we should
probably change the preempt_{dis|en}ables in other places to
rcu_read_lock.

OTOH, if we say we just want all running threads to process through
a preemption stage, then this would just be a preempt_disable/enable
pair.
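
To make the two options concrete, here is a rough userspace sketch of what a read-side walk would look like under each. Everything in it is a stand-in: the lock/unlock functions are stubs that only count nesting, struct sched_domain is cut down to a parent pointer, and for_each_domain_from() and the count_levels_*() helpers are hypothetical names for illustration, not the kernel's code.

```c
#include <stddef.h>

/* Hypothetical userspace stubs for the kernel primitives. They only
 * track nesting depth so the sketch can run outside the kernel. */
static int preempt_count;
static int rcu_nesting;

static void preempt_disable(void) { preempt_count++; }
static void preempt_enable(void)  { preempt_count--; }
static void rcu_read_lock(void)   { rcu_nesting++; }
static void rcu_read_unlock(void) { rcu_nesting--; }

/* Cut-down domain: only the link to the parent level matters here. */
struct sched_domain {
	struct sched_domain *parent;
	int level;
};

/* Walk from a base domain up to the root (illustrative macro, modelled
 * loosely on the idea behind for_each_domain()). */
#define for_each_domain_from(base, sd) \
	for ((sd) = (base); (sd) != NULL; (sd) = (sd)->parent)

/* Option 1: "this is actually RCU". Readers mark the walk with
 * rcu_read_lock()/rcu_read_unlock(). */
static int count_levels_rcu(struct sched_domain *base)
{
	struct sched_domain *sd;
	int n = 0;

	rcu_read_lock();
	for_each_domain_from(base, sd)
		n++;
	rcu_read_unlock();
	return n;
}

/* Option 2: readers only disable preemption. The updater (not shown)
 * would then wait with synchronize_sched() before freeing old domains. */
static int count_levels_nopreempt(struct sched_domain *base)
{
	struct sched_domain *sd;
	int n = 0;

	preempt_disable();
	for_each_domain_from(base, sd)
		n++;
	preempt_enable();
	return n;
}
```

The reader side is structurally identical in both cases; the difference is only in which primitive brackets the walk and which grace-period wait the updater has to pair with it.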

In practice that makes no difference yet, but it looks like you and
Paul are working to distinguish these two cases in the RCU code, to
accommodate your low latency RCU stuff?

I'd prefer the latter (ie. just disable preempt, and use
synchronize_sched), but I'm not too sure of what is going on with
your low latency RCU work...?

Ingo

Signed-off-by: Ingo Molnar <mingo@xxxxxxx>


Thanks for catching that. I may just push it through first as a fix
to the current 2.6 schedstats code (using preempt_disable), and
afterwards we can change it to rcu_read_lock if that is required.

--
SUSE Labs, Novell Inc.
