Re: [PATCH 0/5 V5] Avoid soft lockup message when KVM is stopped by host

From: Eric B Munson
Date: Thu Dec 08 2011 - 10:19:36 EST


On Wed, 07 Dec 2011, Avi Kivity wrote:

> On 12/05/2011 10:18 PM, Eric B Munson wrote:
> > Changes from V4:
> > Rename KVM_GUEST_PAUSED to KVMCLOCK_GUEST_PAUSED
> > Add description of KVMCLOCK_GUEST_PAUSED ioctl to api.txt
> >
> > Changes from V3:
> > Include CC's on patch 3
> > Drop clear flag ioctl and have the watchdog clear the flag when it is reset
> >
> > Changes from V2:
> > New kvm functions are defined in kvm_para.h; the only change to pvclock is
> > the initial flag definition
> >
> > Changes from V1:
> > (Thanks Marcelo)
> > Host code has all been moved to arch/x86/kvm/x86.c
> > KVM_PAUSE_GUEST was renamed to KVM_GUEST_PAUSED
> >
> > When a guest kernel is stopped by the host hypervisor it can look like a soft
> > lockup to the guest kernel. This false warning can mask later soft lockup
> > warnings which may be real. This patch series adds a method for a host
> > hypervisor to communicate to a guest kernel that it is being stopped. The
> > final patch in the series has the watchdog check this flag when it goes to
> > issue a soft lockup warning and skip the warning if the guest knows it was
> > stopped.
> >
> > It was attempted to solve this in Qemu, but the side effects of saving and
> > restoring the clock and tsc for each vcpu put the wall clock of the guest behind
> > by the amount of time of the pause. This forces a guest to have ntp running
> > in order to keep the wall clock accurate.
>
> Having this controlled from userspace means it doesn't work for SIGSTOP
> or for long scheduling delays. What about doing this automatically
> based on preempt notifiers?
>
>
> --
> error compiling committee.c: too many arguments to function
>
My concern with preempt notifiers is masking real soft lockup warnings. If the
flag is set every time the vm is preempted, it becomes more likely that we will
mask real warnings. The ioctl was chosen because it sets the flag only when
the guest is being paused deliberately.
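
To make the masking concern concrete, here is a rough userspace model of the
check, not the code from the series. The names guest_paused_flag,
host_mark_guest_paused() and watchdog_should_warn() are made up for this
sketch, and it simplifies by consuming the flag only when a stall is seen:

```c
#include <stdbool.h>

/* Hypothetical stand-in for the pvclock flag that the host sets. */
static bool guest_paused_flag;

/* Host side: models the effect of the "guest paused" ioctl. */
void host_mark_guest_paused(void)
{
    guest_paused_flag = true;
}

/* Guest side: test-and-clear, so one pause excuses at most one stall. */
static bool check_and_clear_guest_paused(void)
{
    bool was_paused = guest_paused_flag;

    guest_paused_flag = false;
    return was_paused;
}

/* Returns true when the watchdog should print a soft lockup warning. */
bool watchdog_should_warn(bool stall_detected)
{
    if (!stall_detected)
        return false;
    if (check_and_clear_guest_paused())
        return false;   /* stall is explained by a deliberate pause */
    return true;        /* nothing excuses the stall: warn */
}
```

In this model, anything that calls host_mark_guest_paused() on every
preemption would excuse almost every stall, including real ones.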

AFAIK, SIGSTOP is not a supported way to stop a qemu vm so a soft lockup
warning would be working as designed there. If that isn't the case, or if it
ever changes, we could always install a signal handler for SIGCONT that sets
the flag before resuming the vm.
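
A minimal sketch of what such a handler could look like in a userspace
process, assuming plain POSIX sigaction(). The helper names
install_sigcont_handler() and consume_resume_flag() are hypothetical and
nothing here is taken from qemu. The handler only records the event; the main
loop would then issue the guest-paused ioctl on each vcpu before letting the
guest run again:

```c
#define _POSIX_C_SOURCE 200809L

#include <signal.h>
#include <string.h>

/* Set by the handler, read and cleared by the main loop. */
static volatile sig_atomic_t resumed_after_stop;

static void on_sigcont(int signo)
{
    (void)signo;
    resumed_after_stop = 1;     /* only async-signal-safe work in here */
}

/* Returns 0 on success, -1 on failure (errno set by sigaction). */
int install_sigcont_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigcont;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;   /* restart interrupted system calls */
    return sigaction(SIGCONT, &sa, NULL);
}

/*
 * Returns 1 if the process was continued since the last call, else 0.
 * The caller would notify each vcpu here before resuming the guest.
 */
int consume_resume_flag(void)
{
    if (!resumed_after_stop)
        return 0;
    resumed_after_stop = 0;
    return 1;
}
```

Doing the ioctl from the main loop rather than from the handler itself keeps
the handler restricted to async-signal-safe operations.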

Scheduling delays are also beyond the scope of this problem and I see the soft
lockup warning as appropriate in that case.

Eric
