Re: [RFC] Unify KVM kernel-space and user-space code into a single project

From: Avi Kivity
Date: Wed Mar 24 2010 - 08:09:00 EST


On 03/24/2010 01:59 PM, Joerg Roedel wrote:
On Wed, Mar 24, 2010 at 06:57:47AM +0200, Avi Kivity wrote:
On 03/23/2010 08:21 PM, Joerg Roedel wrote:
This enumeration is a very small and non-intrusive feature. Making it
aware of namespaces is easy too.

It's easier (and safer and all the other boring bits) not to do it at
all in the kernel.
For the KVM stack it doesn't matter where it is implemented. It is as
easy in qemu or libvirt as in the kernel. I also don't see big risks. On
the perf side and for its users it is a lot easier to have this in the
kernel.
I for example always use plain qemu when running kvm guests and never
used libvirt. The only central entity I have here is the kvm kernel
modules. I don't want to start using it only to be able to use perf kvm.

You can always provide the kernel and module paths as command line parameters. It just won't be transparently usable, but if you're using qemu from the command line, presumably you can live with that.

Who would be the consumer of such notifications? A 'perf kvm list' can
live without I guess. If we need them later we can still add them.
System-wide monitoring needs to work equally well for guests started
before or after the monitor.
Could be easily done using notifier chains already in the kernel.
Probably implemented with much less than 100 lines of additional code.

And a userspace interface for that.
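
To make the requirement above concrete (a monitor must see the same guests whether it starts before or after them), here is a minimal userspace sketch of a notifier-style chain that replays existing guests on registration. It is an illustration only, not kvm.ko code and not the kernel's notifier API from include/linux/notifier.h; every name in it (vm_notifier, vm_create, counting_monitor and so on) is hypothetical.

```c
#include <stddef.h>

/* Hypothetical event types; not taken from kvm.ko. */
enum vm_event { VM_CREATED, VM_DESTROYED };

struct vm_notifier {
    void (*call)(struct vm_notifier *nb, enum vm_event ev, int vm_id);
    struct vm_notifier *next;
};

#define MAX_VMS 16

static struct vm_notifier *chain;   /* registered listeners */
static int live_vms[MAX_VMS];       /* ids of guests currently running */
static size_t nr_live;

/* Register a listener and replay VM_CREATED for guests that already run,
 * so a monitor started late sees the same set as one started early. */
static void vm_notifier_register(struct vm_notifier *nb)
{
    nb->next = chain;
    chain = nb;
    for (size_t i = 0; i < nr_live; i++)
        nb->call(nb, VM_CREATED, live_vms[i]);
}

/* Deliver one event to every registered listener. */
static void vm_notify(enum vm_event ev, int vm_id)
{
    for (struct vm_notifier *nb = chain; nb; nb = nb->next)
        nb->call(nb, ev, vm_id);
}

/* Returns 0 on success, -1 if the table is full. */
static int vm_create(int vm_id)
{
    if (nr_live == MAX_VMS)
        return -1;
    live_vms[nr_live++] = vm_id;
    vm_notify(VM_CREATED, vm_id);
    return 0;
}

static void vm_destroy(int vm_id)
{
    for (size_t i = 0; i < nr_live; i++) {
        if (live_vms[i] == vm_id) {
            live_vms[i] = live_vms[--nr_live];  /* fill the hole with the last entry */
            vm_notify(VM_DESTROYED, vm_id);
            return;
        }
    }
}

/* Example listener: counts the guests it currently knows about. */
struct counting_monitor {
    struct vm_notifier nb;  /* must stay first so the cast below is valid */
    int known;
};

static void count_event(struct vm_notifier *nb, enum vm_event ev, int vm_id)
{
    struct counting_monitor *m = (struct counting_monitor *)nb;

    (void)vm_id;
    m->known += (ev == VM_CREATED) ? 1 : -1;
}
```

The chain itself is the small part. The sketch leaves out exactly what the rest of this thread is about: how such events would reach userspace, and who is allowed to receive them.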

Even disregarding that, if you introduce an API, people will start
using it and complaining if it's incomplete.
There is nothing wrong with that. We only need to define what this API
should be used for to prevent uncontrolled growth. It could be an
instrumentation-only API for example.

If we make an API, I'd like it to be generally useful.

It's a total headache. For example, we'd need security module hooks to determine access permissions. So far we managed to avoid that since kvm doesn't allow you to access any information beyond what you provided it directly.


My statement was not limited to enumeration, I should have been more
clear about that. The guest filesystem access-channel is another
affected part. The 'perf kvm top' command will access the guest
filesystem regularly and going over qemu would be more overhead here.

Why? Also, the real cost would be accessing the filesystem, not copying
data over qemu.
When measuring cache misses, any additional (and in this case
unnecessary) copy overhead leads to less accurate results.

Copying the objects is a one time cost. If you run perf for more than a second or two, it would fetch and cache all of the data. It's really the same problem with non-guest profiling, only magnified a bit.
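
The "one time cost" argument can be sketched as a small fetch-and-cache lookup. This is a hedged userspace illustration, not perf or qemu code: fetch_from_guest() is a hypothetical stand-in for whatever channel copies a guest file to the host, and the fixed-size table is an assumption made to keep the example short.

```c
#include <string.h>

#define CACHE_SLOTS  8
#define PATH_MAX_LEN 128

struct cached_object {
    char path[PATH_MAX_LEN];
    int  valid;
};

static struct cached_object cache[CACHE_SLOTS];
static unsigned long slow_fetches;  /* how often the slow path ran */

/* Hypothetical stand-in for copying a guest file to the host. */
static void fetch_from_guest(const char *path)
{
    (void)path;
    slow_fetches++;
}

/* Returns 1 on a cache hit, 0 if the object had to be fetched first,
 * and -1 if the path is too long or the cache is full. */
static int lookup_object(const char *path)
{
    int free_slot = -1;

    if (strlen(path) >= PATH_MAX_LEN)
        return -1;

    for (int i = 0; i < CACHE_SLOTS; i++) {
        if (cache[i].valid && strcmp(cache[i].path, path) == 0)
            return 1;                   /* already copied: no guest access */
        if (!cache[i].valid && free_slot < 0)
            free_slot = i;              /* remember the first empty slot */
    }
    if (free_slot < 0)
        return -1;

    fetch_from_guest(path);             /* paid once per object */
    strcpy(cache[free_slot].path, path);
    cache[free_slot].valid = 1;
    return 0;
}
```

However many samples resolve against the same object, the slow path runs once per distinct path, so its cost is amortized over the length of the profiling run.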

Providing this in the KVM module directly also has the benefit that it
would work out-of-the-box with different userspaces too. Or do we want
to limit 'perf kvm' to the libvirt-qemu-kvm software stack?
Other userspaces can also provide this functionality, like they have to
provide disk, network, and display emulation. The kernel is not a huge
library.
This has nothing to do with a library. It is about entity and resource
management which is what os kernels are about. The virtual machine is
the entity (similar to a process) and we want to add additional access
channels and names to it.

kvm.ko has only a small subset of the information that is used to define a guest.

--
error compiling committee.c: too many arguments to function