Re: UDP-U stream performance regression on 32-rc1 kernel

From: Zhang, Yanmin
Date: Thu Nov 05 2009 - 02:43:45 EST

On Thu, 2009-11-05 at 06:20 +0100, Mike Galbraith wrote:
> On Thu, 2009-11-05 at 10:20 +0800, Zhang, Yanmin wrote:
> > On Wed, 2009-11-04 at 13:07 +0100, Mike Galbraith wrote:
> > > Can you try the below, and send me
> > I tested it on Nehalem machine against the latest tips kernel. netperf loopback
> > result is good and regression disappears.
> Excellent. Ingo has picked up a version in tip (1b9508f) which has zero
> negative effect on my x264 testcase, and is a win for mysql+oltp through
> the whole test spectrum. As that may (dunno, Ingo?) now be considered a
> regression fix, ie candidate for, testing that it does no harm
> to your big machines would be a good thing. (pretty please?:)
I tested the latest tip kernel, which includes commit 1b9508f.
Compared with 2.6.31, netperf loopback UDP-U-4k has about a 2% regression.

sysbench(oltp)+mysql result is pretty good, about 2% improvement than

> > tbench result has no improvement.
> Can you remind me where we stand on tbench?
I run tbench by starting CPU_NUM*2 tbench clients without cpu binding.
Compared with 2.6.31, tbench has about a 6% regression with 2.6.32-rc1 on Nehalem.
It is mostly caused by SD_PREFER_LOCAL. Peter has already disabled the flag for
the MC and cpu domains, and your patch disables it for the node domain.
With the current tip kernel, tbench has about a 3% regression on one Nehalem, and
less than 1% on another Nehalem.

With a pure 2.6.32-rc6 kernel, the tbench result has about a 3~6% regression on
Nehalem compared with 2.6.32-rc5's. So some patches in tip haven't been merged into

> > > your UDP-U-1k args so I can try it?
> > #taskset -c 0 ./netserver
> > #taskset -c 15 ./netperf -t UDP_STREAM -l 60 -H -i 50 3 -I 99 5 -- -P 12384,12888 -s 32768 -S 32768 -m 4096
> >
> > Pls. check /proc/cpuinfo to make sure cpu 0 and cpu 15 are not in the
> > same physical cpu.
> Thanks. My little box doesn't have a 15 (darn) so 0,3 will have to do.
Sorry. I copied it from the output of "ps -ef", so a couple of ',' were lost. The right netperf command
line is:
#taskset -c 15 ./netperf -t UDP_STREAM -l 60 -H -i 50,3 -I 99,5 -- -P 12384,12888 -s 32768 -S 32768 -m 4096
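
The earlier advice to make sure cpu 0 and cpu 15 are not in the same physical
cpu can be checked with something like the following. This is only a minimal
sketch, assuming the usual x86 /proc/cpuinfo layout with "processor" and
"physical id" lines; phys_id is a hypothetical helper name, not part of netperf.

```shell
# Hypothetical helper: print the "physical id" (package) of one logical CPU.
# $1 = logical CPU number, $2 = optional path to a cpuinfo file
#      (defaults to /proc/cpuinfo; assumes the x86 field names).
phys_id() {
    awk -v cpu="$1" -F': *' '
        /^processor/   { cur = $2 }                          # remember current CPU
        /^physical id/ { if (cur == cpu) { print $2; exit } } # report its package
    ' "${2:-/proc/cpuinfo}"
}
```

If `phys_id 0` and `phys_id 15` print different numbers, the netserver and
netperf processes are pinned to different packages. An empty result means the
CPU number does not exist on that box.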

> > I also run sysbench(oltp)+mysql testing with thread number 14,16,18,20,32,64,128. The average
> > number is good. If I compare every single result against 2.6.32-rc5's, I find thread number
> > 14,16,18,20,32's result are better than 2.6.32-rc5's, but 64,128's result are worse. 128's is
> > the worst.
> Hm. That's disconcerting. However, that patch isn't going anywhere but
> to the bitwolf anyway (diagnostic). If 1b9508f regresses, that will be
> a problem. With diag, my box also regressed at the tail. Balancing a
> bit seems to help mysql once it starts tripping all over itself, it
> improves the decay curve markedly. 1b9508f does brief bursts of newidle
> balancing when idle time climbs, which translated to a ~6% improvement
> at 256 clients on my little quad.
> -Mike
