Re: [PATCH 0/7] KVM: Kernel-based Virtual Machine

From: Avi Kivity
Date: Sun Oct 22 2006 - 12:20:47 EST

Next message: Norbert Preining: "Re: tg3 kernel bug in 2.6.18-mm3 and 2.6.19-rc2-mm2"
Previous message: yhlu: "Re: [PATCH] x86-64: typo in __assign_irq_vector when updating pos for vector and offset"
In reply to: Arnd Bergmann: "Re: [PATCH 0/7] KVM: Kernel-based Virtual Machine"
Next in thread: Arnd Bergmann: "Re: [PATCH 0/7] KVM: Kernel-based Virtual Machine"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

Arnd Bergmann wrote:

On Sunday 22 October 2006 10:37, Avi Kivity wrote:

I like this. Since we plan to support multiple vcpus per vm, the fs
structure might look like:

/kvm/my_vm
|
+----memory # mkdir to create memory slot.

Note that the way spufs does it, every directory is a reference-counted
object. Currently that includes single contexts and groups of
contexts that are supposed to be scheduled simultaneously.

The trick is that we use the special 'spu_create' syscall to
add a new object, while naming it, and return an open file
descriptor to it. When that file descriptor gets closed, the
object gets garbage-collected automatically.

Yes. Well, a single fd and ioctl()s do that as well.

We ended up adding a lot more files than we initially planned,
but the interface is really handy, especially if you want to
create some procps-like tools for it.

I don't really see the need. The Cell DSPs are a shared resource, while virtual machines are just another execution mode of an existing resource, the main CPU, which already has a sharing mechanism (the scheduler and priorities).

| | # how to set size and offset?
| |
| +---0 # guest physical memory slot
| |
| +-- dirty_bitmap # read to get and atomically reset
| # the changed pages log

Have you thought about simply defining your guest to be a section
of the process's virtual address space? That way you could use
an anonymous mapping in the host as your guest address space, or
even use a file backed mapping in order to make the state persistent
over multiple runs. Or you could map the guest kernel into the
guest real address space with a private mapping and share the
text segment over multiple guests to save L2 and RAM.

I've thought of it, but it can't work on i386 because the guest physical address space is larger than the process virtual address space there. So we mmap("/dev/kvm") with file offsets corresponding to guest physical addresses.

I still like that idea, since it allows using hugetlbfs and swapping. Perhaps we'll just accept that guests on i386 are limited.
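
To make the trade-off above concrete, here is a minimal user-space sketch of the "guest memory is a section of the process's virtual address space" idea. It is only an illustration in Python, not the kvm interface: the GuestMemory class, its read/write helpers, and the gpa parameter name are hypothetical. It assumes a single anonymous host mapping in which a guest physical address is simply an offset, which is also why the guest cannot be larger than the host process's virtual address space.

```python
import mmap

PAGE_SIZE = mmap.PAGESIZE


class GuestMemory:
    """Hypothetical model: guest physical memory backed by one anonymous host mapping."""

    def __init__(self, size):
        # Round the requested size up to a whole number of host pages.
        self.size = (size + PAGE_SIZE - 1) // PAGE_SIZE * PAGE_SIZE
        # fileno -1 asks for an anonymous (zero-filled) mapping.
        # The whole guest must fit inside this process's address space.
        self.mem = mmap.mmap(-1, self.size)

    def _check(self, gpa, length):
        # A guest physical address is just an offset into the mapping.
        if gpa < 0 or length < 0 or gpa + length > self.size:
            raise ValueError("guest physical address out of range")

    def write(self, gpa, data):
        self._check(gpa, len(data))
        self.mem[gpa:gpa + len(data)] = data

    def read(self, gpa, length):
        self._check(gpa, length)
        return bytes(self.mem[gpa:gpa + length])
```

A file-backed variant would pass a real file descriptor instead of -1, which is what would make the guest state persist across runs as suggested above.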

|
|
+----cpu # mkdir/rmdir to create/remove vcpu
|

I'd recommend not allowing mkdir or similar operations, although
it's not that far off. One option would be to let the user specify
the number of CPUs at kvm_create() time, another option might
be to allow kvm_create with a special flag or yet another syscall
to create the vcpu objects.

Okay.

+----0
| |
| +--- irq # write to inject an irq
| |
| +--- regs # read/write to get/set registers
| |
| +--- debugger # write to set breakpoints/singlestep mode
|
+----1
[...]

It's certainly a lot more code though, and requires new syscalls. Since
this is a little esoteric, does it warrant new syscalls?

We've gone through a number of iterations on the spufs design regarding this,
and in the end decided that the garbage-collecting property of spu_create
was superior to any other option, and adding the spu_run syscall was then
the logical step. BTW, one inspiration for spu_run came from sys_vm86, which
as you are probably aware, is already doing a lot of what you do, just
not for protected mode guests.

Yes, we're doing a sort of vmx86_64().

Thanks for the ideas. I'm certainly leaning towards a filesystem-based approach, and I'll also reconsider the mapping (mmap() vs. virtual address space subsection).

--
error compiling committee.c: too many arguments to function

