Re: [RFC PATCH 0/2] sched: simplify the select_task_rq_fair()

From: Michael Wang
Date: Thu Jan 24 2013 - 02:15:34 EST


On 01/24/2013 02:51 PM, Mike Galbraith wrote:
> On Thu, 2013-01-24 at 14:01 +0800, Michael Wang wrote:
>
>> I've enabled WAKE flag on my box like you did, but still can't see
>> regression, and I've just tested on a power server with 64 cpu, also
>> failed to reproduce the issue (not compared with virgin yet, but can't
>> see collapse).
>
> I'm not surprised. I'm seeing enough inconsistent crap to come to the
> conclusion that stock scheduler knobs flat can't be used on a largish
> box, they're just too preempt-happy, leading to weird crap.
>
> My 2 missing nodes came back, and the very same kernel that highly
> repeatably collapsed with 2 nodes does not with 4 nodes, and 2 nodes
> does not collapse with only preemption knob tweaking, and that's
> bullshit. Virgin shows instability in the mid-range, make a tiny tweak
> that should have little if any effect there, and that instability
> vanishes entirely. Test runs are not consistent enough boot to boot etc
> etc. Either stock knobs suck on NUMA boxen, or this box is possessed.

Mike, I wonder whether changing back to the old way makes the collapse
go away not because there is a logical error in the new balance path,
but just because it changes the cost of select_task_rq(). Whether that
cost is higher or lower, it may accidentally achieve the same effect as
your knob tweak, and that would be why old looks better than new.

Regards,
Michael Wang

>
> -Mike
>
