Re: [RFC -v5 PATCH 0/4] directed yield for Pause Loop Exiting

From: Rik van Riel
Date: Fri Jan 14 2011 - 16:29:32 EST


On 01/14/2011 03:02 AM, Rik van Riel wrote:

Benchmark "results":

Two 4-CPU KVM guests are pinned to the same 4 physical CPUs.

Unfortunately, it turned out I was running my benchmark on
only two CPU cores, using two HT threads of each core.

I have re-run the benchmark with the guests bound to 4
different CPU cores, one HT on each core.

One guest runs the AMQP performance test, the other guest runs
0, 2 or 4 infinite loops, giving no overcommit, 1.5x overcommit
and 2x overcommit.

The AMQP perftest is run 30 times, with 8 and 16 threads.

8thr      no overcommit   1.5x overcommit   2x overcommit

no PLE    224934          139311            94216.6
PLE       226413          142830            87836.4

16thr     no overcommit   1.5x overcommit   2x overcommit

no PLE    224266          134819            92123.1
PLE       224985          137280            100832
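
For anyone who wants to see the PLE vs. no PLE difference as a
percentage, here is a rough sketch that computes it from the numbers
in the table above. The dictionary layout and the function name
ple_change_percent are just illustrative choices, not part of the
benchmark, and the numbers are averages with no variance information,
so small differences should not be over-interpreted.

```python
# Throughput figures copied verbatim from the table above.
# Column order: no overcommit, 1.5x overcommit, 2x overcommit.
LABELS = ["no overcommit", "1.5x overcommit", "2x overcommit"]

RESULTS = {
    "8thr": {
        "no PLE": [224934, 139311, 94216.6],
        "PLE": [226413, 142830, 87836.4],
    },
    "16thr": {
        "no PLE": [224266, 134819, 92123.1],
        "PLE": [224985, 137280, 100832],
    },
}


def ple_change_percent(no_ple, ple):
    """Relative change of the PLE result against the no-PLE baseline, in percent."""
    return (ple - no_ple) / no_ple * 100.0


# Per thread count, one percentage per overcommit level.
changes = {
    threads: [
        ple_change_percent(base, with_ple)
        for base, with_ple in zip(rows["no PLE"], rows["PLE"])
    ]
    for threads, rows in RESULTS.items()
}

for threads, deltas in changes.items():
    for label, delta in zip(LABELS, deltas):
        print(f"{threads:6s} {label:16s} {delta:+.2f}%")
```

The sign of the result at 2x overcommit differs between the 8 and
16 thread runs, which is consistent with the test being dominated by
something other than PLE itself.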

The other conclusions hold: this test appears to be doing more
to expose issues with the scheduler than to test the PLE code.

I have some ideas on how to improve yield(), so it can
do the right thing even in the presence of cgroups.

Note: there seems to be something wrong with CPU balancing,
possibly related to cgroups. The AMQP guest only got about
80% CPU time (of 400% total) when running with 2x overcommit,
as opposed to the expected 200%. Without PLE, the guest
seems to get closer to 100% CPU time, which is still far
below the expected share.
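
The expected 200% figure can be reproduced with a small sketch. It
assumes that every vCPU in the AMQP guest and every infinite loop in
the other guest is always runnable, and that an ideal scheduler
splits the 4 physical CPUs equally between runnable vCPUs with no
cgroup weighting. The function name fair_share_percent is
hypothetical, and the observed value is the approximate 80% reported
above.

```python
def fair_share_percent(guest_runnable, other_runnable, physical_cpus):
    """CPU time (in percent of one CPU) an ideal fair scheduler would give
    the guest, assuming all listed vCPUs are runnable all the time."""
    total_runnable = guest_runnable + other_runnable
    if total_runnable <= physical_cpus:
        # No overcommit: every runnable vCPU gets a full physical CPU.
        return guest_runnable * 100.0
    capacity = physical_cpus * 100.0
    return capacity * guest_runnable / total_runnable


# 2x overcommit: 4 AMQP vCPUs plus 4 infinite loops on 4 physical CPUs.
expected = fair_share_percent(guest_runnable=4, other_runnable=4, physical_cpus=4)

# Approximate value reported above for the PLE case (an observation, not computed).
observed_with_ple = 80.0

print(f"expected share: {expected:.0f}%")
print(f"observed / expected: {observed_with_ple / expected:.2f}")
```

The AMQP guest is not necessarily CPU bound on all 4 vCPUs, so the
fair share is an upper bound on what it would use, not a prediction.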

Unfortunately, it looks like this test ended up more as a
demonstration of other scheduler issues, than as a performance
test of the PLE code.

--
All rights reversed