Re: [KVM PATCH] KVM: introduce "xinterface" API for external interactionwith guests

From: Gregory Haskins
Date: Thu Jul 16 2009 - 14:23:14 EST


Arnd Bergmann wrote:
> On Thursday 16 July 2009, Gregory Haskins wrote:
>
>> Background: The original vbus code was tightly integrated with kvm.ko. Avi
>> suggested that we abstract the interfaces such that it could live outside
>> of kvm.
>>
>
> The code is still highly kvm-specific, you would not be able to use
> it with another hypervisor like lguest or vmware player, right?
>

In its current form, it is kvm specific primarily because it was not a
explicit design constraint of mine to support others. However, there is
no reason why we could not generalize the interface if that is a
desirable trait. Ultimately I would like to have my project support
other things like lguest, so this is not a bad idea. Otherwise, someone
will invariably be proposing an "lguest_xinterface" next ;)

>
>> Example usage: QEMU instantiates a guest, and an external module "foo"
>> that desires the ability to interface with the guest (say via
>> open("/dev/foo")). QEMU may then issue a KVM_GET_VMID operation to acquire
>> the u64-based vmid, and pass it to ioctl(foofd, FOO_SET_VMID, &vmid).
>> Upon receipt, the foo module can issue kvm_xinterface_find(vmid) to acquire
>> the proper context. Internally, the struct kvm* and associated
>> struct module* will remain pinned at least until the foo module calls
>> kvm_xinterface_put().
>>
>
> Your approach allows passing the vmid from a process that does
> not own the kvm context. This looks like an intentional feature,
> but I can't see what this gains us.

This work is towards the implementation of lockless-shared-memory
subsystems, which includes ring constructs such as virtio-ring,
VJ-netchannels, and vbus-ioq. I find that these designs perform
optimally when you allow two distinct contexts (producer + consumer) to
process the ring concurrently, which implies a disparate context from
the guest in question. Note that the infrastructure we are discussing
does not impose a requirement for the contexts to be unique: it will
work equally well from the same or a different process.

For an example of this "producer/consumer" dynamic over shared memory in
action, please refer to my previous posting re: "vbus"

http://lkml.org/lkml/2009/4/21/408

I am working on v4 now, and this patch is part of the required support.

>
>
>
>> As a final measure, we link the xinterface code statically
>> into the kernel so that callers are guaranteed a stable interface to
>> kvm_xinterface_find() without implicitly pinning kvm.ko or racing against
>> it.
>>
>
> I also don't understand this. Are you worried about driver modules
> breaking when an externally-compiled kvm.ko is loaded? The same could
> be achieved by defining your data structures kvm_xinterface_ops and
> kvm_xinterface in a kernel header that is not shipped by kvm-kmod but
> always taken from the kernel headers.
> It does not matter if the entry points are build into the kernel or
> exported from a kvm.ko as long as you define a fixed ABI.
>
> What is the problem with pinning kvm.ko from another module using
> its features?
>

Well, there is always the chance that I am doing something dumb or
missing the point ;)

But my rationale was as follows: The problem is that kvm is a little
weird in the module ref department: If I were to do it the standard way
and link xinterface.o into kvm.o (and have any xinterface_find() users
do a tristate+"depends on KVM"), this would work as I believe you are
suggesting. That is to say: whenever I loaded "foo.ko", insmod would
automatically up the reference of kvm.ko. The issue is that is not
quite what I really want ;)

I want to hold the reference to the entire .text chain, which includes
kvm.ko + [kvm-intel.ko | kvm-amd.ko]. If you look carefully, the
ops->owner that is assigned is actually the arch.ko. In addition, I
wanted the kvm.ko lifetime to be associated with the lifetime of its
contexts actually in use, not the lifetime of its installed
dependencies. Therefore, I did it this way.

But to your point, I suppose the dependency lifetime thing is not a huge
deal. I could therefore modify the patch to simply link xinterface.o
into kvm.ko and still achieve the primary objective by retaining ops->owner.

Note that if we are going to generalize the interface to support other
guests as you may have been suggesting above, it should probably stay
statically linked (and perhaps live in ./lib or something)


> Can't you simply provide a function call to lookup the kvm context
> pointer from the file descriptor to achieve the same functionality?
>
You mean so have: struct kvm_xinterface *kvm_xinterface_find(int fd)

(instead of creating our own vmid namespace) ?

Or are you suggesting using fget() instead of kvm_xinterface_find()?

> To take that thought further, maybe the dependency can be turned
> around: If every user (pci-uio, virtio-net, ...) exposes a file
> descriptor based interface to user space, you can have a kvm
> ioctl to register the object behind that file descriptor with
> an existing kvm context to associate it with a guest.

FWIW: We do that already for the signaling path (see irqfd and ioeventfd
in kvm.git). Each side exposes interfaces that accept eventfds, and the
fds are passed around that way.

However, for the functions we are talking about now, I don't think it
really works well to go the other way. I could be misunderstanding what
you mean, though. What I mean is that it's KVM that is providing a
service to the other modules (in this case, translating memory
pointers), so what would an inverse interface look like for that? And
even if you came up with one, it seems to me that its just "6 of one,
half-dozen of the other" kind of thing.

> That would
> nicely solve the life time questions by pinning the external
> object for the life time of the kvm context

I suppose that is nice, but note that in practice both objects (the
kvm-guest and the "foo" module) are managed by the same entity (i.e.
QEMU) and therefore share the same approximate lifetime.

Kind Regards,
-Greg

Attachment: signature.asc
Description: OpenPGP digital signature