Re: [RFC] Unify KVM kernel-space and user-space code into a single project

From: Avi Kivity
Date: Wed Mar 24 2010 - 09:05:35 EST


On 03/24/2010 02:50 PM, Joerg Roedel wrote:

> > You can always provide the kernel and module paths as command line
> > parameters. It just won't be transparently usable, but if you're using
> > qemu from the command line, presumably you can live with that.
>
> I don't want the tool for myself only. A typical perf user expects that
> it works transparently.

A typical kvm user uses libvirt, so we can integrate it with that.

> > > Could be easily done using notifier chains already in the kernel.
> > > Probably implemented with much less than 100 lines of additional code.
> >
> > And a userspace interface for that.
>
> Not necessarily. The perf event is configured to measure system-wide kvm
> by userspace. The kernel side of perf takes care that it stays
> system-wide even with added vm instances. So in this case the consumer
> for the notifier would be the perf kernel part. No userspace interface
> required.
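
For concreteness, the in-kernel half of that would look something like the
sketch below. It's untested, and the kvm_vm_notifier_list /
KVM_VM_CREATED names are invented for illustration; only the
notifier-chain API itself (include/linux/notifier.h) is the existing one:

#include <linux/notifier.h>
#include <linux/printk.h>
#include <linux/kvm_host.h>

/* hypothetical chain, would live in kvm.ko */
static BLOCKING_NOTIFIER_HEAD(kvm_vm_notifier_list);

/* hypothetical events */
enum { KVM_VM_CREATED, KVM_VM_DESTROYED };

/* kvm.ko would call this from kvm_create_vm()/kvm_destroy_vm() */
void kvm_notify_vm_event(unsigned long event, struct kvm *kvm)
{
	blocking_notifier_call_chain(&kvm_vm_notifier_list, event, kvm);
}

/* perf's kernel side would be one consumer of the chain */
static int perf_kvm_vm_event(struct notifier_block *nb,
			     unsigned long event, void *data)
{
	struct kvm *kvm = data;

	if (event == KVM_VM_CREATED)
		pr_debug("perf: new kvm guest %p\n", kvm);
	/* ...attach or tear down system-wide guest profiling state... */
	return NOTIFY_OK;
}

static struct notifier_block perf_kvm_nb = {
	.notifier_call = perf_kvm_vm_event,
};

/* at perf init:
 *	blocking_notifier_chain_register(&kvm_vm_notifier_list, &perf_kvm_nb);
 */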

Someone needs to know about the new guest to fetch its symbols. Or do you want that part in the kernel too?

> > If we make an API, I'd like it to be generally useful.
>
> That's hard to do at this point since we don't know what people will use
> it for. We should keep it simple in the beginning and add new features
> as they are requested and make sense in this context.

IMO this use case is too rare to warrant its own API, especially as there are alternatives.

> > It's a total headache. For example, we'd need security module hooks to
> > determine access permissions. So far we managed to avoid that since kvm
> > doesn't allow you to access any information beyond what you provided it
> > directly.
>
> Depends on how it is designed. A filesystem approach was already
> mentioned. We could create /sys/kvm/ for example to expose information
> about virtual machines to userspace. This would not require any new
> security hooks.
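
To make that concrete, a bare-bones version could be as small as the
sketch below. Untested, and the layout and the owner_pid attribute are
made up; only the kobject/sysfs calls are the existing kernel API:

#include <linux/kobject.h>
#include <linux/sysfs.h>
#include <linux/kernel.h>
#include <linux/kvm_host.h>

static struct kobject *kvm_root_kobj;	/* /sys/kvm */

static ssize_t owner_pid_show(struct kobject *kobj,
			      struct kobj_attribute *attr, char *buf)
{
	/* a real version would look this up from the struct kvm that
	 * owns @kobj; hard-coded to keep the sketch short */
	return sprintf(buf, "%d\n", 0);
}
static struct kobj_attribute owner_pid_attr = __ATTR_RO(owner_pid);

/* called from kvm_create_vm(): creates /sys/kvm/<vm-id>/owner_pid */
int kvm_sysfs_add_vm(struct kvm *kvm, int vm_id)
{
	struct kobject *vm_kobj;
	char name[16];

	if (!kvm_root_kobj) {
		kvm_root_kobj = kobject_create_and_add("kvm", NULL);
		if (!kvm_root_kobj)
			return -ENOMEM;
	}

	snprintf(name, sizeof(name), "%d", vm_id);
	vm_kobj = kobject_create_and_add(name, kvm_root_kobj);
	if (!vm_kobj)
		return -ENOMEM;

	return sysfs_create_file(vm_kobj, &owner_pid_attr.attr);
}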

Who would set the security context on those files? Plus, we need cgroup support so you can't see one container's guests from an unrelated container.

> > Copying the objects is a one-time cost. If you run perf for more than a
> > second or two, it would fetch and cache all of the data. It's really
> > the same problem with non-guest profiling, only magnified a bit.
>
> I don't think we can cache filesystem data of a running guest on the
> host. It is too hard to keep such a cache coherent.

I don't see any choice. The guest can change its symbols at any time (say by kexec), without any notification.

> > Other userspaces can also provide this functionality, like they have to
> > provide disk, network, and display emulation. The kernel is not a huge
> > library.
>
> If two userspaces run in parallel, what is the single instance from which
> perf can get a list of guests?

I don't know. Surely that's solvable though.

> > kvm.ko has only a small subset of the information that is used to define
> > a guest.
>
> The subset is not small. It contains all guest vcpus, the complete
> interrupt routing hardware emulation, and it even manages the guest's
> memory.

It doesn't contain most of the mmio and pio address space. Integration with qemu would allow perf to tell us that the guest is hitting the interrupt status register of a virtio-blk device in pci slot 5 (the information is already available through the kvm_mmio trace event, but only qemu can decode it).
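
To illustrate: the gpa in a kvm_mmio event only becomes meaningful
together with the memory map that userspace built when it assigned the
BARs. A toy userspace decoder (addresses and device names made up) would
be something like:

#include <stdint.h>
#include <stdio.h>

struct mmio_region {
	uint64_t base, size;
	const char *what;	/* e.g. "virtio-blk ISR, PCI slot 5" */
};

/* qemu knows this because it assigned the BARs; kvm.ko never sees it */
static const struct mmio_region regions[] = {
	{ 0xfebf1000, 0x1000, "virtio-blk ISR, PCI slot 5" },
	{ 0xfebf2000, 0x1000, "e1000 registers, PCI slot 3" },
};

static const char *decode_gpa(uint64_t gpa)
{
	unsigned i;

	for (i = 0; i < sizeof(regions) / sizeof(regions[0]); i++)
		if (gpa >= regions[i].base &&
		    gpa < regions[i].base + regions[i].size)
			return regions[i].what;
	return "unknown";
}

int main(void)
{
	/* gpa as it comes out of the kvm_mmio trace event */
	printf("%s\n", decode_gpa(0xfebf1004));
	return 0;
}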

--
error compiling committee.c: too many arguments to function
