Re: [PATCH] hv_balloon: Add the support of hibernation

From: David Hildenbrand
Date: Fri Sep 13 2019 - 03:46:11 EST


On 12.09.19 21:18, Dexuan Cui wrote:
>> From: David Hildenbrand <david@xxxxxxxxxx>
>> Sent: Thursday, September 12, 2019 3:09 AM
>> On 12.09.19 01:36, Dexuan Cui wrote:
>>> When hibernation is enabled, we must ignore the balloon up/down and
>>> hot-add requests from the host, if any.
>>
>> Why do you even care about supporting hibernation? Can't you just pause
>> the VM in the hypervisor and continue to live a happy life? :)
>>
>> (to be more precise, most QEMU/KVM distributions I am aware of don't
>> support suspend/hibernation of guests for said reason, so I wonder why
>> Hyper-V needs it)
>
> In some scenarios, hibernation can be better than pause/unpause,
> save/restore and live migration:
>
> 1. Compared to pause/unpause, the VM can power off completely with
> hibernation, and all the states are saved inside the VM image, then the
> image can be copied to another host to start the VM again, as long as
> the new host uses exactly the same configuration for the VM.

Okay, under QEMU that also works just fine via pause/unpause (e.g.,
a simple migration).

>
> 2. Compared to pause/unpause, hibernation may be more reliable, since it's
> performed by the VM kernel rather than the host, so the VM kernel may
> better tackle some clock-source/event-sensitive issues.

Not sure I agree, but maybe that's a Hyper-V-specific issue.

>
> 3. Hibernation can be especially useful when we pass through a PCIe device,
> e.g. a NIC, a NVMe controller or a GPU, to the VM, as usually save/restore
> and live migration can not work with this kind of configuration, because
> usually the host doesn't know how to save/restore the state of the PCIe
> device.

Interesting. Under QEMU/KVM (especially for migration), the solutions
I have seen discussed rather wanted to temporarily unplug the PCI
devices or replace them with some kind of "standby" device.


Anyhow, instead of making the balloon basically useless whenever the
virtual ACPI S4 state is enabled, would it also be an option for you to

a) Remember if a harmful request was processed (memory add, balloon up,
balloon down) - or if the device is *currently* in an un-hibernatable
state. E.g., if somebody inflated the balloon, you can't hibernate. But
if the balloon was deflated again, you can hibernate again.

b) Block hibernation in balloon_suspend() in case the device is in such
an un-hibernatable state.


Then you don't need hv_is_hibernation_supported(). The VM is able to
hibernate as long as Dynamic Memory and Memory Resizing were not used.
This is something that can be documented perfectly well.
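
To illustrate a) and b), here is a rough userspace sketch of the state
tracking, not driver code. All names (struct balloon_model and the
model_*() helpers) are hypothetical and do not exist in hv_balloon.
Treating a hot-add as permanently blocking hibernation, and returning
-EBUSY from the suspend path, are assumptions of this sketch.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the balloon state; NOT the real hv_balloon structs. */
struct balloon_model {
	unsigned long ballooned_pages;	/* pages currently given to the host */
	bool hot_added;			/* memory was hot-added at least once */
};

/* Balloon up (inflate): the guest hands pages to the host. */
void model_balloon_up(struct balloon_model *b, unsigned long pages)
{
	b->ballooned_pages += pages;
}

/* Balloon down (deflate): the guest gets pages back, clamped at zero. */
void model_balloon_down(struct balloon_model *b, unsigned long pages)
{
	if (pages > b->ballooned_pages)
		pages = b->ballooned_pages;
	b->ballooned_pages -= pages;
}

/* Hot-add: assumed here to be sticky, i.e. it can't be undone. */
void model_hot_add(struct balloon_model *b)
{
	b->hot_added = true;
}

/* a) The device is hibernatable only if nothing harmful is outstanding. */
bool model_can_hibernate(const struct balloon_model *b)
{
	return b->ballooned_pages == 0 && !b->hot_added;
}

/* b) Stand-in for balloon_suspend(): refuse to suspend in a bad state. */
int model_balloon_suspend(const struct balloon_model *b)
{
	if (!model_can_hibernate(b))
		return -EBUSY;
	return 0;
}
```

With this model, suspend fails while the balloon is inflated and
succeeds again once it is fully deflated, provided no hot-add happened
in between.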

--

Thanks,

David / dhildenb