Re: [PATCH RFC 1/2] kvm: Handle undercommitted guest case in PLE handler

From: Raghavendra K T
Date: Mon Oct 15 2012 - 08:14:30 EST


On 10/11/2012 01:06 AM, Andrew Theurer wrote:
On Wed, 2012-10-10 at 23:24 +0530, Raghavendra K T wrote:
On 10/10/2012 08:29 AM, Andrew Theurer wrote:
On Wed, 2012-10-10 at 00:21 +0530, Raghavendra K T wrote:
* Avi Kivity <avi@xxxxxxxxxx> [2012-10-04 17:00:28]:

On 10/04/2012 03:07 PM, Peter Zijlstra wrote:
On Thu, 2012-10-04 at 14:41 +0200, Avi Kivity wrote:

[...]
A big concern I have (if this is 1x overcommit) for ebizzy is that it
has just terrible scalability to begin with. I do not think we should
try to optimize such a bad workload.


I think my way of running dbench has some flaw, so I went to ebizzy.
Could you let me know how you generally run dbench?

I mount a tmpfs and then specify that mount for dbench to run on. This
eliminates all IO. I use a 300 second run time and number of threads is
equal to number of vcpus. All of the VMs of course need to have a
synchronized start.
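
The setup described above could be scripted roughly as follows. This is only a sketch: the mount point /mnt/dbench-tmpfs and the helper name dbench_command are hypothetical, it assumes dbench's -D (directory) and -t (time limit) options, and it assumes the tmpfs is already mounted (for example with "mount -t tmpfs tmpfs /mnt/dbench-tmpfs"). It does not handle the synchronized start across VMs.

```python
import os
import shlex


def dbench_command(tmpfs_dir, seconds=300, clients=None):
    """Build a dbench argv for a run on an already-mounted tmpfs.

    tmpfs_dir is a hypothetical mount point; clients defaults to the
    number of CPUs visible inside the guest (one thread per vcpu).
    """
    if clients is None:
        clients = os.cpu_count() or 1
    # -D: directory to run in, -t: run time in seconds,
    # final positional argument: number of clients.
    return ["dbench", "-D", tmpfs_dir, "-t", str(seconds), str(clients)]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be reviewed first.
    print(shlex.join(dbench_command("/mnt/dbench-tmpfs")))
```

Printing the command rather than executing it keeps the sketch safe to run anywhere; the resulting line can be passed to whatever mechanism starts all the VMs at the same time.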

I would also make sure you are using a recent kernel for dbench, where
the dcache scalability is much improved. Without any lock-holder
preemption, the time in spin_lock should be very low:


21.54%  78016  dbench  [kernel.kallsyms]  [k] copy_user_generic_unrolled
 3.51%  12723  dbench  libc-2.12.so       [.] __strchr_sse42
 2.81%  10176  dbench  dbench             [.] child_run
 2.54%   9203  dbench  [kernel.kallsyms]  [k] _raw_spin_lock
 2.33%   8423  dbench  dbench             [.] next_token
 2.02%   7335  dbench  [kernel.kallsyms]  [k] __d_lookup_rcu
 1.89%   6850  dbench  libc-2.12.so       [.] __strstr_sse42
 1.53%   5537  dbench  libc-2.12.so       [.] __memset_sse2
 1.47%   5337  dbench  [kernel.kallsyms]  [k] link_path_walk
 1.40%   5084  dbench  [kernel.kallsyms]  [k] kmem_cache_alloc
 1.38%   5009  dbench  libc-2.12.so       [.] memmove
 1.24%   4496  dbench  libc-2.12.so       [.] vfprintf
 1.15%   4169  dbench  [kernel.kallsyms]  [k] __audit_syscall_exit


Hi Andrew,
I ran dbench on tmpfs. I do not see any improvement in dbench with a
16k ple_window.

So it seems that no workload apart from ebizzy benefits from it, and I
agree that it may not be good to optimize for ebizzy.
I shall drop the change to a 16k default window and continue with the
original patch series. I still need to experiment with the latest kernel.

(PS: Thanks for pointing me to perf in the latest kernel. It works fine.)

Results:
dbench run for 120 sec with 30 sec warmup, 8 iterations, using tmpfs
base = 3.6.0-rc5 with the ple handler optimization patch.

x => base + ple_window = 4k
+ => base + ple_window = 16k
* => base + ple_gap = 0

dbench 1x overcommit case
=========================
    N      Min      Max    Median        Avg     Stddev
x   8   5322.5  5519.05   5482.71  5461.0962  63.522276
+   8  5255.45  5530.55   5496.94  5455.2137  93.070363
*   8  5350.85  5477.81  5408.065  5418.4338  44.762697


dbench 2x overcommit case
=========================
    N      Min      Max    Median        Avg     Stddev
x   8  3054.32  3194.47   3137.33   3132.625  54.491615
+   8   3040.8  3148.87  3088.615  3088.1887  32.862336
*   8  3031.51  3171.99    3083.6  3097.4612  50.526977
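
To put the averages above in perspective, the change relative to the ple_window = 4k runs can be worked out as below. This is just an illustrative sketch: the Avg and Stddev values are copied from the tables in this mail, while the dictionary layout and the percent_change helper are my own naming.

```python
# Avg values copied from the result tables in this mail.
AVG = {
    "1x": {"ple_window=4k": 5461.0962,
           "ple_window=16k": 5455.2137,
           "ple_gap=0": 5418.4338},
    "2x": {"ple_window=4k": 3132.625,
           "ple_window=16k": 3088.1887,
           "ple_gap=0": 3097.4612},
}

# Stddev of the ple_window=4k runs, used as a rough noise reference.
BASE_STDDEV = {"1x": 63.522276, "2x": 54.491615}


def percent_change(baseline, value):
    """Relative change of value against baseline, in percent."""
    return (value - baseline) / baseline * 100.0


if __name__ == "__main__":
    for case, avgs in AVG.items():
        base = avgs["ple_window=4k"]
        noise = BASE_STDDEV[case] / base * 100.0
        for name, avg in avgs.items():
            if name == "ple_window=4k":
                continue
            print(f"{case} {name}: {percent_change(base, avg):+.2f}% "
                  f"(4k stddev is {noise:.2f}% of its avg)")
```

With these numbers the 16k window comes out at roughly -0.1% for 1x and -1.4% for 2x, and ple_gap = 0 at roughly -0.8% and -1.1%. Each difference is smaller than the standard deviation of the 4k runs, which fits the conclusion that dbench shows no improvement.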
