Re: RT sched: cpupri_vec lock contention with def_root_domain and no load balance

From: Dimitri Sivanich
Date: Mon Nov 03 2008 - 20:29:45 EST


On Mon, Nov 03, 2008 at 11:33:23PM +0100, Peter Zijlstra wrote:
> On Mon, 2008-11-03 at 15:07 -0600, Dimitri Sivanich wrote:
> > When load balancing gets switched off for a set of cpus via the
> > sched_load_balance flag in cpusets, those cpus wind up with the
> > globally defined def_root_domain attached. The def_root_domain is
> > attached when partition_sched_domains calls detach_destroy_domains().
> > A new root_domain is never allocated or attached as a sched domain
> > will never be attached by __build_sched_domains() for the non-load
> > balanced processors.
> >
> > The problem with this scenario is that on systems with a large number
> > of processors with load balancing switched off, we start to see the
> > cpupri->pri_to_cpu->lock in the def_root_domain becoming contended.
> > This starts to become much more apparent above 8 waking RT threads
> > (with each RT thread running on its own cpu, blocking and waking up
> > continuously).
> >
> > I'm wondering if this is, in fact, the way things were meant to work,
> > or should we have a root domain allocated for each cpu that is not to
> > be part of a sched domain? Note that the def_root_domain spans all of
> > the non-load-balanced cpus in this case. Having it attached to cpus
> > that should not be load balancing doesn't quite make sense to me.
>
> It shouldn't be like that, each load-balance domain (in your case a
> single cpu) should get its own root domain. Gregory?
>
> > Here's where we've often seen this lock contention occur:
>
> what's this horrible output from?

This output is a stack backtrace from KDB. KDB is entered when too much
time elapses before a thread wakes up. The traces pointed to this lock.
To further test that theory, we hacked up a change to create a
root_domain for each cpu, and the max thread wakeup times improved.