Re: [RFC PATCH 00/17] virtual-bus

From: Gregory Haskins
Date: Wed Apr 01 2009 - 10:19:41 EST

Next message: Steven Rostedt: "[PATCH 2/2] ring-buffer: do not remove reader page from list on ring buffer free"
Previous message: Meelis Roos: "strange MTRR on 2.6.29-git"
In reply to: Andi Kleen: "Re: [RFC PATCH 00/17] virtual-bus"
Next in thread: Gregory Haskins: "Re: [RFC PATCH 00/17] virtual-bus"

Andi Kleen wrote:
> On Wed, Apr 01, 2009 at 08:03:49AM -0400, Gregory Haskins wrote:
>
>> Andi Kleen wrote:
>>
>>> Gregory Haskins <ghaskins@xxxxxxxxxx> writes:
>>>
>>> What might be useful is if you could expand a bit more on what the
>>> high-level use cases for this are.
>>>
>>> Questions that come to mind and that would be good to answer:
>>>
>>> This seems to be aimed at having multiple VMs talk
>>> to each other, but not talk to the rest of the world, correct?
>>> Is that a common use case?
>>>
>>>
>> Actually we didn't design specifically for either type of environment.
>>
>
> But surely you must have some specific use case in mind? Something
> that it does better than the various methods that are available
> today. Or rather there must be some problem you're trying
> to solve. I'm just not sure what that problem exactly is.
>
Performance. We are trying to create a high-performance IO infrastructure.

Ideally we would like to see things like virtual machines have
bare-metal performance (or as close as possible) using just pure
software on commodity hardware. The data I provided shows that
something like KVM with virtio-net does a good job on throughput even on
10GE, but its latency is several orders of magnitude higher than
bare-metal. We are addressing this issue and others like it that are a
result of the current design of out-of-kernel emulation.
>
>> What we *are* trying to address is making an easy way to declare virtual
>> resources directly in the kernel so that they can be accessed more
>> efficiently. Contrast that to the way it's done today, where the models
>> live in, say, qemu userspace.
>>
>> So instead of having
>> guest->host->qemu::virtio-net->tap->[iptables|bridge], you simply have
>> guest->host->[iptables|bridge]. How you make your private network (if
>>
>
> So is the goal more performance or simplicity or what?
>

(Answered above)

>
>>> What would be the use cases for non networking devices?
>>>
>>> How would the interfaces to the user look like?
>>>
>>>
>> I am not sure if you are asking about the guest's perspective or the
>> host-administrator's perspective.
>>
>
> I was wondering about the host-administrator's perspective.
>
Ah, ok. Sorry about that. It was probably good to document that other
thing anyway, so no harm.

So, about the host-administrator interface: the whole thing is driven by
configfs, and the basics are already covered in the documentation in
patch 2, so I won't repeat it here. Here is a reference to the file for
everyone's convenience:

http://git.kernel.org/?p=linux/kernel/git/ghaskins/vbus/linux-2.6.git;a=blob;f=Documentation/vbus.txt;h=e8a05dafaca2899d37bd4314fb0c7529c167ee0f;hb=f43949f7c340bf667e68af6e6a29552e62f59033

So a sufficiently privileged user can instantiate a new bus (e.g.
container) and devices on that bus via configfs operations. The types
of devices available to instantiate are dictated by whatever vbus-device
modules you have loaded into your particular kernel. The loaded modules
available are enumerated under /sys/vbus/deviceclass.
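
To make the enumeration step concrete, here is a minimal sketch of how an
admin script might list the loaded device classes. Only the
/sys/vbus/deviceclass path comes from the description above; the
VBUS_SYSFS variable and the vbus_list_classes helper are hypothetical names
invented for this illustration, and the sketch assumes each class shows up
as one directory entry.

```shell
#!/bin/sh
# Sketch only. VBUS_SYSFS is a hypothetical override so the helper can be
# pointed at a scratch directory; the default is the path named above.
VBUS_SYSFS="${VBUS_SYSFS:-/sys/vbus}"

# Print one device class name per line (hypothetical helper, not part of vbus).
vbus_list_classes() {
    for entry in "$VBUS_SYSFS"/deviceclass/*; do
        # An unmatched glob stays literal, so skip entries that do not exist.
        [ -e "$entry" ] || continue
        basename "$entry"
    done
}
```

With the venet-tap module loaded, calling vbus_list_classes would be
expected to print a line for that class, assuming the layout above.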

Now presumably the administrator knows what a particular module is and
how to configure it before instantiating it. Once they instantiate it,
it will present an interface in sysfs with a set of attributes. For
example, an instantiated venet-tap looks like this:

ghaskins@test:~> tree /sys/vbus/devices
/sys/vbus/devices
`-- foo
    |-- class -> ../../deviceclass/venet-tap
    |-- client_mac
    |-- enabled
    |-- host_mac
    |-- ifname
    `-- interfaces
        `-- 0 -> ../../../instances/bar/devices/0

Some of these attributes, like "class" and "interfaces", are default
attributes that are filled in by the infrastructure. Other attributes,
like "client_mac" and "enabled", are properties defined by the venet-tap
module itself. The administrator can then set these attributes as
desired to configure each device instance individually.
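
As a rough sketch of what setting those attributes could look like from a
script: the attribute names ("client_mac", "enabled") and the
/sys/vbus/devices/foo path are taken from the tree output above, while the
VBUS_SYSFS variable, the vbus_set_attr and vbus_get_attr helpers, and the
example values are assumptions made for illustration. In particular, which
attributes are writable and what value format each one accepts is decided
by the venet-tap module, not by this sketch.

```shell
#!/bin/sh
# Sketch only. VBUS_SYSFS is a hypothetical override for testing against a
# scratch directory; the default is the sysfs path shown above.
VBUS_SYSFS="${VBUS_SYSFS:-/sys/vbus}"

# vbus_set_attr DEVICE ATTR VALUE  (hypothetical helper, not part of vbus)
# Writes VALUE to an existing attribute file and fails if it is missing,
# so a typo in the attribute name does not silently create a stray file.
vbus_set_attr() {
    attr_path="$VBUS_SYSFS/devices/$1/$2"
    if [ ! -e "$attr_path" ]; then
        echo "no such attribute: $attr_path" >&2
        return 1
    fi
    printf '%s\n' "$3" > "$attr_path"
}

# vbus_get_attr DEVICE ATTR  (hypothetical helper)
# Prints the current contents of the attribute file.
vbus_get_attr() {
    cat "$VBUS_SYSFS/devices/$1/$2"
}
```

Usage might then look like "vbus_set_attr foo client_mac 00:11:22:33:44:55"
followed by "vbus_set_attr foo enabled 1", where both the MAC address and
the "1" are placeholder values.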

So now imagine we have some kind of disk-io vbus device that is designed
to act kind of like a file-loopback device. It might define an
attribute allowing you to specify the path to the file/block-dev that
you want it to export.

(Warning: completely fictitious "tree" output to follow ;)

ghaskins@test:~> tree /sys/vbus/devices
/sys/vbus/devices
`-- foo
    |-- class -> ../../deviceclass/vdisk
    |-- src_path
    `-- interfaces
        `-- 0 -> ../../../instances/bar/devices/0

So the admin would instantiate this "vdisk" device and do:

'echo /path/to/my/exported/disk.dat > /sys/vbus/devices/foo/src_path'

to point the device at the file on the host that they want to present as
a vdisk. Any guest that has access to the particular bus that contains
this device would then see it as a standard "vdisk" ABI device (as if
there were such a thing, yet) and could talk to it using a
vdisk-specific driver.

A property of a vbus is that it is inherited by children. Today, I do
not have direct support in qemu for creating/configuring vbus devices.
Instead, I set up the vbus and devices from bash, and then launch
qemu-kvm so that it inherits the bus. Someday (soon, unless you guys
start telling me this whole idea is rubbish ;) I will add support so you
could do things like "-net nic,model=venet" and that would trigger qemu
to go out and create the container/device on its own. TBD.

I hope this helps to clarify!
-Greg
