Re: [RFC PATCH v3 0/3] sched: simplify the select_task_rq_fair()

From: Mike Galbraith
Date: Fri Feb 22 2013 - 07:35:48 EST


On Fri, 2013-02-22 at 13:11 +0100, Ingo Molnar wrote:
> * Mike Galbraith <efault@xxxxxx> wrote:
>
> > On Fri, 2013-02-22 at 10:54 +0100, Ingo Molnar wrote:
> > > * Mike Galbraith <efault@xxxxxx> wrote:
> > >
> > > > On Fri, 2013-02-22 at 09:36 +0100, Peter Zijlstra wrote:
> > > > > On Fri, 2013-02-22 at 10:37 +0800, Michael Wang wrote:
> > > > > > But that benefit is really hard to estimate, especially when the
> > > > > > workload is heavy. The cost of wake_affine() is very high, since it
> > > > > > calculates se one by one; is that worth it for a benefit we cannot
> > > > > > promise?
> > > > >
> > > > > Look at something like pipe-test.. wake_affine() used to
> > > > > ensure both client/server ran on the same cpu, but then I
> > > > > think we added select_idle_sibling() and wrecked it again :/
> > > >
> > > > Yeah, that's the absolute worst case for
> > > > select_idle_sibling(), 100% synchronous, absolutely nothing to
> > > > be gained by cross cpu scheduling. Fortunately, most tasks do
> > > > more than that, but nonetheless, select_idle_sibling()
> > > > definitely is a two faced little b*tch. I'd like to see the
> > > > evil b*tch die, but something needs to replace its pretty
> > > > face. One thing that you can do is simply don't call it when
> > > > the context switch rate is incredible.. its job is to recover
> > > > overlap, if you're scheduling near your max, there's no win
> > > > worth the cost.
> > >
> > > Couldn't we make the cutoff dependent on sched_migration_cost?
> > > If the wakeup comes in faster than that then don't spread.
> >
> > No, that's too high, you lose too much of the pretty face.
> > [...]
>
> Then a logical proportion of it - such as half of it?

Hm. Better would maybe be a quick boot time benchmark, and use some
multiple of your cross core pipe ping-pong time? That we know is a
complete waste of cycles, because almost all cycles are scheduler cycles
with no other work to be done, making firing up another scheduler rather
pointless. If we're approaching that rate, we're approaching bad idea.
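
To make the idea concrete, here is a rough userspace sketch in Python, not
kernel code. It times a 1-byte pipe ping-pong between a parent and a forked
child, optionally pinned to two CPUs, and derives a cutoff as a multiple of
the measured round trip. The names pipe_pingpong_ns, spread_cutoff_ns and
should_spread, and the default multiple of 2, are made up for illustration.
Two distinct CPU ids may still be SMT siblings rather than separate cores, so
picking a true cross-core pair is left to the caller.

```python
import os
import time


def pipe_pingpong_ns(rounds=2000, cpus=None):
    """Mean round-trip time in ns of a 1-byte pipe ping-pong.

    cpus is an optional (parent_cpu, child_cpu) pair used for pinning.
    """
    p2c_r, p2c_w = os.pipe()   # parent -> child
    c2p_r, c2p_w = os.pipe()   # child -> parent
    pid = os.fork()
    if pid == 0:
        # Child: echo every byte back until the parent closes its write end.
        try:
            os.close(p2c_w)
            os.close(c2p_r)
            if cpus is not None:
                os.sched_setaffinity(0, {cpus[1]})
            while True:
                data = os.read(p2c_r, 1)
                if not data:
                    break
                os.write(c2p_w, data)
        finally:
            os._exit(0)        # never return into the caller's code

    os.close(p2c_r)
    os.close(c2p_w)
    saved = None
    if cpus is not None:
        saved = os.sched_getaffinity(0)
        os.sched_setaffinity(0, {cpus[0]})
    try:
        start = time.perf_counter_ns()
        for _ in range(rounds):
            os.write(p2c_w, b"x")
            os.read(c2p_r, 1)
        elapsed = time.perf_counter_ns() - start
    finally:
        os.close(p2c_w)        # EOF makes the child exit its loop
        os.close(c2p_r)
        os.waitpid(pid, 0)
        if saved is not None:
            os.sched_setaffinity(0, saved)
    return elapsed / rounds


def spread_cutoff_ns(pingpong_ns, multiple=2.0):
    """Cutoff interval: some multiple of the ping-pong time (assumed value)."""
    return pingpong_ns * multiple


def should_spread(wakeup_interval_ns, cutoff_ns):
    """Spread only if wakeups arrive more slowly than the cutoff interval."""
    return wakeup_interval_ns > cutoff_ns


if __name__ == "__main__":
    allowed = sorted(os.sched_getaffinity(0))
    pair = (allowed[0], allowed[-1]) if len(allowed) >= 2 else None
    rtt = pipe_pingpong_ns(cpus=pair)
    print("ping-pong round trip: %.0f ns, cutoff: %.0f ns"
          % (rtt, spread_cutoff_ns(rtt)))
```

The Python interpreter overhead inflates the absolute numbers, so this only
shows the shape of the measurement and the comparison, not values a kernel
heuristic could use directly.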

-Mike
