Re: 20% performance drop on PostgreSQL 9.2 from kernel 3.5.3 to 3.6-rc5 on AMD chipsets - bisected

From: Mike Galbraith
Date: Wed Sep 19 2012 - 11:23:50 EST


On Wed, 2012-09-19 at 16:54 +0200, Ingo Molnar wrote:
> * Mike Galbraith <efault@xxxxxx> wrote:
>
> > On Sun, 2012-09-16 at 06:35 +0200, Mike Galbraith wrote:
> >
> > > Oh, while I'm thinking about it, there's another scenario
> > > that could cause the select_idle_sibling() change to affect
> > > pgbench on largeish packages, but it boils down to
> > > preemption odds as well. IIRC pgbench _was_ at least 1:N,
> > > ie one process driving the whole load. Waker of many
> > > (singularly bad idea as a way to generate load) being
> > > preempted by its wakees stalls the whole load, so expensive
> > > spreading of wakees to the four winds ala WAKE_BALANCE
> > > becomes attractive, that pain being markedly less intense
> > > than having multiple cores go idle while creator of work
> > > waits for one.
> >
> > Enabling SMT on little E5620 box says that's the deal.
> > pgbench as run is 1:N, and all you have to do is disable
> > select_idle_sibling() entirely to see that for _this_ (~odd)
> > load, max spread and lower wakeup latency for the mother of
> > all work itself is a good thing.
> >
> > pgbench -i pgbench && pgbench -c $N -T 10 pgbench
> >
> > N=     1     2     4     8    16    32    64
> >     1336  2482  3752  3485  3327  2928  2290  virgin 3.6.0-rc6
> >     1408  2457  3363  3070  2938  2368  1757  +revert reverted
> >     1310  2492  2487  2729  2186   975   874  +revert + select_idle_sibling() disabled
> >     1407  2505  3422  3137  3093  2828  2250  +revert + schedctl -B /etc/init.d/postgresql restart
> >     1321  2403  2515  2759  2420  2301  1894  +revert + schedctl -B /etc/init.d/postgresql restart + select_idle_sibling() disabled
> >
> > Hohum, damned if ya do, damned if ya don't. Damn.
>
> As a test, could you mark that 'big PostgreSQL central work
> queue process' with some high priority (renice -20?), to make
> sure it's never preempted by wakees? Does that recover
> performance as well?
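
For reading the table quoted above, it can help to express each row as a
percentage change against the virgin 3.6.0-rc6 row. A minimal sketch in
Python: the numbers are copied from the table, the names CLIENTS, RESULTS
and percent_vs_baseline are made up for illustration, and the unit is
whatever pgbench printed (presumably tps).

```python
# Client counts (pgbench -c $N) and throughput figures copied from the table.
CLIENTS = [1, 2, 4, 8, 16, 32, 64]
RESULTS = {
    "virgin 3.6.0-rc6": [1336, 2482, 3752, 3485, 3327, 2928, 2290],
    "+revert reverted": [1408, 2457, 3363, 3070, 2938, 2368, 1757],
    "+revert + select_idle_sibling() disabled":
        [1310, 2492, 2487, 2729, 2186, 975, 874],
    "+revert + schedctl -B":
        [1407, 2505, 3422, 3137, 3093, 2828, 2250],
    "+revert + schedctl -B + select_idle_sibling() disabled":
        [1321, 2403, 2515, 2759, 2420, 2301, 1894],
}


def percent_vs_baseline(rows, baseline):
    """Return each row as percent change relative to rows[baseline]."""
    base = rows[baseline]
    return {
        name: [round(100.0 * (v - b) / b, 1) for v, b in zip(vals, base)]
        for name, vals in rows.items()
    }


if __name__ == "__main__":
    deltas = percent_vs_baseline(RESULTS, "virgin 3.6.0-rc6")
    print("N=" + "".join("%8d" % n for n in CLIENTS))
    for name, row in deltas.items():
        print("  " + "".join("%+8.1f" % d for d in row) + "  " + name)
```

With these numbers the revert alone lands about 23% below virgin at N=64,
and disabling select_idle_sibling() on top of it is about 67% below at
N=32.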

schedctl -B started postgres as SCHED_BATCH, so pgbench won't be
preempted, since it's the only SCHED_NORMAL task left in the lot. All the
others are postmaster processes, and SCHED_BATCH.
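
For anyone wanting to reproduce that setup without schedctl, here is a
rough sketch of what putting a process into SCHED_BATCH amounts to, using
Python's os.sched_setscheduler() wrapper on Linux. The helper names
policy_name and become_batch are hypothetical, and this is only an
assumption about what a launcher does before exec'ing the init script,
not a description of schedctl itself.

```python
import os

# Scheduling policy constants exposed by Python's os module on Linux.
_POLICY_NAMES = {
    os.SCHED_OTHER: "SCHED_OTHER",  # called SCHED_NORMAL inside the kernel
    os.SCHED_BATCH: "SCHED_BATCH",
    os.SCHED_IDLE: "SCHED_IDLE",
    os.SCHED_FIFO: "SCHED_FIFO",
    os.SCHED_RR: "SCHED_RR",
}


def policy_name(policy):
    """Map a numeric scheduling policy to a readable name."""
    return _POLICY_NAMES.get(policy, "unknown(%d)" % policy)


def become_batch(pid=0):
    """Switch pid (0 = calling process) to SCHED_BATCH; return the new policy name.

    Non-realtime policies require a static priority of 0.
    """
    os.sched_setscheduler(pid, os.SCHED_BATCH, os.sched_param(0))
    return policy_name(os.sched_getscheduler(pid))


if __name__ == "__main__":
    print("before:", policy_name(os.sched_getscheduler(0)))
    try:
        print("after: ", become_batch())
    except OSError as exc:
        # Some sandboxes or containers refuse policy changes.
        print("could not switch policy:", exc)
```

Children forked after the switch inherit the policy unless reset-on-fork
is set, which is how restarting the init script under it leaves every
postmaster process in SCHED_BATCH.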

-Mike
