[patch] Re: hackbench regression with kernel 2.6.32-rc1

From: Mike Galbraith
Date: Thu Oct 29 2009 - 05:15:09 EST


On Thu, 2009-10-29 at 14:26 +0800, Zhang, Yanmin wrote:
> On Thu, 2009-10-29 at 06:46 +0100, Mike Galbraith wrote:

> > SD_PREFER_LOCAL is still on in rc1 though (double checks;), so you'll go
> > through the power saving code until you reach a domain containing both
> > waker's cpu and wakee's previous cpu even if that code already found
> > that a higher domain wasn't overloaded. Looks to me like that block
> > wants a want_sd && qualifier.
> >
> > Even if you turn SD_PREFER_LOCAL off, you can still hit the overhead if
> > SD_POWERSAVINGS_BALANCE is set, so I'd make sure both are off and see if
> > that's the source (likely, since the rest is already off).
> Yes. SD_POWERSAVINGS_BALANCE is disabled by default. I applied Peter's patch which
> turns SD_PREFER_LOCAL off for MC and cpu domain and it doesn't help.
> perf counter shows select_task_rq_fair still consumes about 5% cpu time. Eventually,
> I found for_each_cpu in for_each_domain consumes the 5% cpu time, because Peter's
> patch doesn't turn off SD_PREFER_LOCAL for node domain.
> I turned it off for node domain against the latest tips tree and tbench regression
> disappears on a Nehalem machine and becomes about 2% on another one.
>
> Can we turn it off for node domain by default?

If it's hurting fast path overhead to the tune of an order of magnitude,
I guess there's no choice but to either fix it or turn it off. Since
SD_BALANCE_WAKE is off globally, I don't see any point in keeping
SD_PREFER_LOCAL at any level.

(That said, what we need is for this to not be so expensive that we
can't afford it in the fast path).

sched: Disable SD_PREFER_LOCAL at node level.

Yanmin Zhang reported that SD_PREFER_LOCAL induces an order of magnitude
increase in select_task_rq_fair() overhead while running heavy wakeup
benchmarks (tbench and vmark). Since SD_BALANCE_WAKE is off at node level,
turn SD_PREFER_LOCAL off as well pending further investigation.

Signed-off-by: Mike Galbraith <efault@xxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxx>
Cc: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>
Reported-by: Zhang, Yanmin <yanmin_zhang@xxxxxxxxxxxxxxx>
LKML-Reference: <new-submission>

diff --git a/arch/x86/include/asm/topology.h b/arch/x86/include/asm/topology.h
index d823c24..40e37b1 100644
--- a/arch/x86/include/asm/topology.h
+++ b/arch/x86/include/asm/topology.h
@@ -143,7 +143,7 @@ extern unsigned long node_remap_size[];
| 1*SD_BALANCE_FORK \
| 0*SD_BALANCE_WAKE \
| 1*SD_WAKE_AFFINE \
- | 1*SD_PREFER_LOCAL \
+ | 0*SD_PREFER_LOCAL \
| 0*SD_SHARE_CPUPOWER \
| 0*SD_POWERSAVINGS_BALANCE \
| 0*SD_SHARE_PKG_RESOURCES \
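
For readers unfamiliar with the 1*FLAG / 0*FLAG idiom in the hunk above, here
is a minimal standalone sketch of how it composes a flags mask. The SKETCH_*
names and their bit values are hypothetical, made up for illustration; they
are not the kernel's actual SD_* definitions, and the helper function is not
kernel code.

```c
#include <assert.h>

/* Hypothetical bit values for illustration only; the kernel's real
 * SD_* constants are defined in its own headers and may differ. */
#define SKETCH_SD_BALANCE_FORK  0x01u
#define SKETCH_SD_BALANCE_WAKE  0x02u
#define SKETCH_SD_WAKE_AFFINE   0x04u
#define SKETCH_SD_PREFER_LOCAL  0x08u

/* Multiplying by 1 keeps a bit and multiplying by 0 drops it, so every
 * flag stays listed and toggling one is a one-character change. */
#define SKETCH_NODE_FLAGS_BEFORE ( 1*SKETCH_SD_BALANCE_FORK \
                                 | 0*SKETCH_SD_BALANCE_WAKE \
                                 | 1*SKETCH_SD_WAKE_AFFINE  \
                                 | 1*SKETCH_SD_PREFER_LOCAL )

/* Same list with SD_PREFER_LOCAL switched off, mirroring the patch. */
#define SKETCH_NODE_FLAGS_AFTER  ( 1*SKETCH_SD_BALANCE_FORK \
                                 | 0*SKETCH_SD_BALANCE_WAKE \
                                 | 1*SKETCH_SD_WAKE_AFFINE  \
                                 | 0*SKETCH_SD_PREFER_LOCAL )

/* Hypothetical helper: returns 1 if the prefer-local bit is set. */
static inline int sketch_prefers_local(unsigned int flags)
{
	return (flags & SKETCH_SD_PREFER_LOCAL) != 0;
}

/* Compile-time checks that the masks are what the idiom implies. */
static_assert(SKETCH_NODE_FLAGS_BEFORE == 0x0Du, "fork|affine|prefer_local");
static_assert(SKETCH_NODE_FLAGS_AFTER  == 0x05u, "fork|affine");
```

With the bit cleared, a test such as sketch_prefers_local(flags) is false for
the node-level mask, which is the effect the one-line change is after.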


