weird interaction between kvm and NO_HZ_FULL?

From: Chris Friesen
Date: Fri Mar 20 2015 - 12:18:32 EST


Hi,

I'm running 3.10 (yeah, I know) and I'm playing with CONFIG_NO_HZ_FULL. I'm getting a strange result where some CPUs are able to turn off local timer interrupts and others aren't.

Is there a known interaction between kvm-based VMs and CONFIG_NO_HZ_FULL?

Background:

I've got an x86-64 system with 16 cores. The kernel has boot args "isolcpus=1-15 rcu_nocbs=1-15 nohz_full=1-15".
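
For anyone wanting to double-check which CPUs those boot args cover, here is a rough sketch (not part of my actual setup) that expands the CPU lists from a kernel command line. The function names parse_cpulist and cpu_params are made up for illustration, and it assumes each parameter holds a plain cpulist such as "1-15" or "0,2-4" with no extra flags.

```python
def parse_cpulist(spec):
    """Expand a plain kernel cpulist such as "1-15" or "0,2-4" into a sorted list."""
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-", 1)
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def cpu_params(cmdline, names=("isolcpus", "rcu_nocbs", "nohz_full")):
    """Return {param: [cpu, ...]} for each named parameter found in cmdline."""
    found = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if sep and key in names:
            found[key] = parse_cpulist(value)
    return found
```

On a live system this could be fed the contents of /proc/cmdline, e.g. cpu_params(open("/proc/cmdline").read()).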

I have all system tasks running on CPU 0, then a couple of busy-looping CPU hogs (DPDK apps) affined to CPUs 1 and 2 respectively. Then I have a 3-vCPU kvm-based VM running on CPUs 3/4/5. (Each vCPU is affined to a single host CPU.)

Within the VM, vCPU0 is running system tasks and is mostly idle, while vCPUs 1/2 are running busy-looping CPU hogs.


Current issue:

Looking at the local timer interrupts over 10 seconds, CPUs 1/2 incremented by about 25, CPU 3 (vCPU0 in the guest, mostly idle) incremented by 57000, CPUs 4/5 (which are busy-looping in the guest) incremented by 10000, and the other CPUs increased by 2. This is fairly reproducible.
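
One way to collect per-CPU deltas like these is to diff the "LOC:" (local timer interrupts) line of /proc/interrupts across an interval. The sketch below is only an illustration of that idea, not necessarily how the numbers above were gathered; the function names are hypothetical, and it assumes the usual x86 layout where the "LOC:" label is followed by one count per CPU.

```python
import time


def loc_counts(text):
    """Return per-CPU local timer interrupt counts from /proc/interrupts text."""
    for line in text.splitlines():
        fields = line.split()
        if fields and fields[0] == "LOC:":
            counts = []
            for field in fields[1:]:
                if not field.isdigit():
                    break  # reached the trailing description text
                counts.append(int(field))
            return counts
    raise ValueError("no LOC: line found")


def loc_delta(before, after):
    """Per-CPU increase in local timer interrupts between two snapshots."""
    return [b - a for a, b in zip(loc_counts(before), loc_counts(after))]


def sample_loc_delta(seconds=10, path="/proc/interrupts"):
    """Snapshot the file, wait, snapshot again, and return per-CPU deltas."""
    with open(path) as f:
        before = f.read()
    time.sleep(seconds)
    with open(path) as f:
        after = f.read()
    return loc_delta(before, after)
```

Calling sample_loc_delta(10) on the host would return a list indexed by CPU number, which makes it easy to compare the isolated CPUs against each other.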

Looking at the sched ftrace logs over 10 seconds:

On CPU 1 I see it running vswitch, rcuc/1-211, and ksoftirqd/1-212.
On CPU 5 I see it running kvm, rcuc/5-235, and ksoftirqd/5-236
On CPU 3 I see it running kvm-29634, kvm-29637, and (mostly) the idle task

In all cases there doesn't seem to be significant contention. For each of CPUs 1/5 there are under 60 lines of trace output over 10 seconds.

Connecting via strace to the kvm thread on CPU 5, it seemed to be doing almost entirely userspace processing, with no syscalls over multiple seconds.

Just for fun I ran "cat /dev/zero > /dev/null" on CPU 9, and the interrupt rate remained low even though I could see it chewing all the CPU time.

I'm at a loss to explain why the timer ticks aren't being suppressed as expected on CPUs 3/4/5. Does anyone have any ideas? Is kvm doing something "odd" to mess it up?

Thanks,
Chris