Re: [RFC] Unify KVM kernel-space and user-space code into a single project

From: Avi Kivity
Date: Wed Mar 24 2010 - 09:58:22 EST


On 03/24/2010 03:46 PM, Joerg Roedel wrote:
> On Wed, Mar 24, 2010 at 03:05:02PM +0200, Avi Kivity wrote:
>> On 03/24/2010 02:50 PM, Joerg Roedel wrote:
>>> I don't want the tool for myself only. A typical perf user expects that
>>> it works transparently.
>> A typical kvm user uses libvirt, so we can integrate it with that.
> Someone who uses libvirt and virt-manager by default is probably not
> interested in this feature at the same level a kvm developer is. And
> developers tend not to use libvirt for low-level kvm development. A
> number of developers have stated in this thread already that they would
> appreciate a solution for guest enumeration that would not involve
> libvirt.

So would I. But when I weigh the benefit of truly transparent system-wide perf integration for users who don't use libvirt but do use perf, versus the cost of transforming kvm from a single-process API to a system-wide API with all the complications that I've listed, it comes out in favour of not adding the API.

Those few users can probably script something to cover their needs.
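
A minimal sketch of what such a script could look like, assuming each guest runs as a qemu process started with a "-name" option and that the script's user can read the relevant /proc entries. The function name list_qemu_guests and the "basename starts with qemu" heuristic are hypothetical choices made for illustration, not an existing tool:

```python
import os


def list_qemu_guests(proc_root="/proc"):
    """Return sorted (pid, name) pairs for processes that look like qemu guests.

    Assumption: a guest is any process whose executable basename starts
    with "qemu"; its name is taken from the "-name" argument, if present.
    """
    guests = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a per-process directory
        try:
            with open(os.path.join(proc_root, entry, "cmdline"), "rb") as f:
                raw = f.read()
        except OSError:
            continue  # process exited, or not readable by this user
        # /proc/<pid>/cmdline holds the arguments separated by NUL bytes.
        argv = [a.decode(errors="replace") for a in raw.split(b"\0") if a]
        if not argv or not os.path.basename(argv[0]).startswith("qemu"):
            continue
        name = None
        if "-name" in argv[:-1]:
            # "-name" may carry extra comma-separated options; keep the name.
            name = argv[argv.index("-name") + 1].split(",")[0]
        guests.append((int(entry), name))
    return sorted(guests)
```

Calling list_qemu_guests() with no arguments scans the live /proc. A script like this only sees what the calling user is allowed to read, so it adds no new system-wide interface.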

>> Someone needs to know about the new guest to fetch its symbols. Or do
>> you want that part in the kernel too?
> The samples will be tagged with the guest-name (and some additional
> information perf needs). Perf userspace can access the symbols then
> through /sys/kvm/guest0/fs/...

I take that as a yes? So we need a virtio-serial client in the kernel (which might be exploitable by a malicious guest if buggy) and a fs-over-virtio-serial client in the kernel (also exploitable).

>>> Depends on how it is designed. A filesystem approach was already
>>> mentioned. We could create /sys/kvm/ for example to expose information
>>> about virtual machines to userspace. This would not require any new
>>> security hooks.
>> Who would set the security context on those files?
> An approach like: "The files are owned and only readable by the same
> user that started the vm." might be a good start. So a user can measure
> its own guests and root can measure all of them.

That's not how sVirt works. sVirt isolates a user's VMs from each other, so if a guest breaks into qemu it can't break into other guests owned by the same user.

The users who need this API (!libvirt and perf) probably don't care about sVirt, but a new API must not break it.

>> Plus, we need cgroup support so you can't see one container's guests
>> from an unrelated container.
> cgroup support is an issue but we can solve that too. It's in general
> still less complex than going through the whole libvirt-qemu-kvm stack.

It's a tradeoff. IMO, going through qemu is the better way, and also provides more information.

>> Integration with qemu would allow perf to tell us that the guest is
>> hitting the interrupt status register of a virtio-blk device in pci
>> slot 5 (the information is already available through the kvm_mmio
>> trace event, but only qemu can decode it).
> Yeah that would be interesting information. But it is more related to
> tracing than to pmu measurements.
> The information which you mentioned above is probably better
> captured by an extension of trace-events to userspace.

It's all related. You start with perf, see a problem with mmio, call up a histogram of mmio or interrupts or whatever, then zoom in on the misbehaving device.
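
The "histogram of mmio" step can be sketched roughly as below. This assumes the kvm_mmio trace events have been captured as text lines of the form "kvm_mmio: mmio write len 4 gpa 0xfee000b0 val 0x0"; the exact line layout and the helper name mmio_histogram are assumptions made for illustration. The result is keyed by raw guest-physical address: mapping an address back to a particular device register is the part that still needs qemu.

```python
import re
from collections import Counter

# Assumed layout of a kvm_mmio trace line, e.g.
#   "... kvm_mmio: mmio write len 4 gpa 0xfee000b0 val 0x0"
MMIO_RE = re.compile(r"kvm_mmio: mmio ([\w-]+) len (\d+) gpa (0x[0-9a-fA-F]+)")


def mmio_histogram(lines):
    """Count kvm_mmio events per (guest-physical address, operation)."""
    hist = Counter()
    for line in lines:
        m = MMIO_RE.search(line)
        if m is None:
            continue  # some other trace event, or an unparsable line
        op, _length, gpa = m.groups()
        hist[(int(gpa, 16), op)] += 1
    return hist


def print_histogram(hist, top=10):
    """Print the most frequently accessed addresses first."""
    for (gpa, op), count in hist.most_common(top):
        print(f"{gpa:#014x} {op:<16} {count}")
```

Feeding the trace output through mmio_histogram() and then print_histogram() lists the hottest addresses, which is where one would start zooming in.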

--
error compiling committee.c: too many arguments to function
