Re: [PATCH] virtio-balloon spec: provide a version of the "silentdeflate" feature that works

From: Michael S. Tsirkin
Date: Fri Sep 07 2012 - 06:52:14 EST


On Fri, Sep 07, 2012 at 11:27:42AM +0200, Paolo Bonzini wrote:
> On 07/09/2012 08:39, Rusty Russell wrote:
> >> > So it looks like a bug: we should teach driver to tell host first on leak?
> >> > Yan, Vadim, can you comment please?
> >> >
> >> > Also if true, looks like this bit will be useful to detect a fixed driver on
> >> > the hypervisor side - to avoid unmapping such pages? Rusty what do you
> >> > think?
> > So, the feature is unimplemented in qemu, and broken in drivers. I'm starting
> > to share Paolo's dislike of it.
> >
> > Don't understand why we'd care about fixed drivers though, if we remove
> > the feature bit....
>
> Hmm, Michael has a point here. Basically, the Windows driver is using
> silent deflate, but not telling the host (yet) about it. So, we must
> assume that a driver that does not negotiate
> VIRTIO_BALLOON_F_MUST_TELL_HOST _will_ use silent deflate.
>
> Here's a way to proceed.
>
> We add VIRTIO_BALLOON_F_SILENT_DEFLATE, which is negotiated normally.
> If not available, at worst the guest driver may refuse to start, or
> revert to using the deflateq.
>
> We rename VIRTIO_BALLOON_F_MUST_TELL_HOST to WILL_TELL_HOST, since
> that's how it's being used. Now for the device there are three cases:
>
> - does not support silent deflate at all: it should always propose
> VIRTIO_BALLOON_F_WILL_TELL_HOST; if the (bad) driver does not
> negotiate it, the device must assume that the guest will use silent
> deflate, and fail to start the guest if the device does not support
> silent deflate.
>
> - optionally supports silent deflate: it should always propose
> VIRTIO_BALLOON_F_WILL_TELL_HOST; if the (bad) driver does not
> negotiate it, the device must assume that the guest will use silent
> deflate
>
> - always supports silent deflate: does not need to do anything,
> current behavior works fine. But the driver might as well propose
> VIRTIO_BALLOON_F_WILL_TELL_HOST, so that migration works fine. (This
> is a hardware change, so it must be versioned, yadda yadda).
>
> I can prepare a spec patch for this.
>
>
> BTW, since we have in the archives an example of using silent deflate,
> here is an example of non-silent deflate. It may help understanding the
> above with an actual example of a device. Suppose a guest is using PCI
> passthrough, so it has all memory pinned.
>
> - If the guest will _not_ use silent deflate, we can unlock memory on
> inflate and lock it back on deflate. (The question is what to do if
> locking fails; left for when someone actually implements this thing).
>
> - If the guest will use silent deflate, we cannot do that.
>
> So this is the second case above. The device must propose
> VIRTIO_BALLOON_F_WILL_TELL_HOST. Then:
>
> - if the guest negotiates VIRTIO_BALLOON_F_SILENT_DEFLATE,
> we cannot do the munlock/mlock
>
> - if the guest negotiates VIRTIO_BALLOON_F_WILL_TELL_HOST,
> we can do the munlock/mlock
>
> - if the guest does not negotiate either, the driver is buggy
> and we cannot do the munlock/mlock
>
> Paolo


Let us start with what is broken currently. Looking at
it very closely, I think the answer is nothing.
Even migration in qemu is not broken as you claimed initially.

Next, consider the interface proposed here. You de facto declare
all existing drivers buggy. This is the wrong thing to do.
You also use two feature bits for a single simple thing,
which is inelegant.

Last, let us consider how the existing feature can be used
in the hypervisor. If the driver did not ack
MUST_TELL_HOST, it is *not* buggy, but it means we cannot
do munlock. This applies to current Windows drivers.
If the driver *did* ack MUST_TELL_HOST, we can munlock
on inflate and mlock back on leak.
This seems useful, and driver support is already there,
so removing the MUST_TELL_HOST bit seems like a bad idea.
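
To make the rule above concrete, here is a rough sketch of the
host-side check in C. It is an illustration only, not qemu code:
host_may_munlock() is a hypothetical helper name, and the feature bit
number is defined locally so the snippet stands alone (0 is the value
VIRTIO_BALLOON_F_MUST_TELL_HOST has in the Linux virtio_balloon header).

```c
#include <stdbool.h>
#include <stdint.h>

/* Feature bit number, defined locally so the sketch is self-contained. */
#define VIRTIO_BALLOON_F_MUST_TELL_HOST 0

/*
 * Hypothetical helper: decide whether the host may munlock pages on
 * inflate (and mlock them back when the guest leaks the balloon).
 *
 * Driver acked MUST_TELL_HOST     -> it tells the host before reusing
 *                                    a page, so munlock is safe.
 * Driver did not ack the bit      -> it may deflate silently; it is not
 *                                    buggy, but pages must stay locked.
 */
static bool host_may_munlock(uint64_t acked_features)
{
    return (acked_features >> VIRTIO_BALLOON_F_MUST_TELL_HOST) & 1;
}
```

With this, a driver that acked only other feature bits is treated the
same as one that acked nothing: the host simply keeps the memory locked.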

--
MST