Re: [kvm-devel] [PATCH 3/3] virtio PCI device

From: Anthony Liguori
Date: Mon Nov 26 2007 - 14:18:43 EST


Avi Kivity wrote:
rx and tx are closely related. You rarely have one without the other.

In fact, a tuned implementation should have zero kicks or interrupts for bulk transfers. The rx interrupt on the host will process new tx descriptors and fill the guest's rx queue; the guest's transmit function can also check the receive queue. I don't know if that's achievable for Linux guests currently, but we should aim to make it possible.

ATM, the net driver does a pretty good job of disabling kicks/interrupts unless they are needed. Checking for rx on tx and vice versa is a good idea and could further help there. I'll give it a try this week.

Another point is that virtio still has a lot of leading zeros in its mileage counter. We need to keep things flexible and learn from others as much as possible, especially when talking about the ABI.

Yes, after thinking about it over the holiday, I agree that we should at least introduce a virtio-pci feature bitmask. I'm not inclined to attempt to define a hypercall ABI or anything like that right now, but having the feature bitmask will at least make it possible to do such a thing in the future.

I'm wary of introducing the notion of hypercalls to this device because it makes the device VMM specific. Maybe we could have the device provide an option ROM that was treated as the device "BIOS" that we could use for kicking and interrupt acking? Any idea of how that would map to Windows? Are there real PCI devices that use the option ROM space to provide what's essentially firmware? Unfortunately, I don't think an option ROM BIOS would map well to other architectures.


The BIOS wouldn't work even on x86 because it isn't mapped into the guest address space (at least not consistently), and it doesn't know the guest's programming model (16, 32, or 64 bits? segmented or flat?).

Xen uses a hypercall page to abstract these details out. However, I'm not proposing that. Simply indicate that we support hypercalls, and use some layer below to actually send them. It is the responsibility of this layer to detect if hypercalls are present and how to call them.

Hey, I think the best place for it is in paravirt_ops. We can even patch the hypercall instruction inline, and the driver doesn't need to know about it.

Yes, paravirt_ops is attractive for abstracting the hypercall calling mechanism, but it's still necessary to figure out how hypercalls would be identified. I think it would be necessary to define a virtio-specific hypercall space and use the virtio device ID to claim subspaces.

For instance, the hypercall number could be (virtio_devid << 16) | (call number). How that translates into a hypercall would then be part of the paravirt_ops abstraction. In KVM, we may have a single virtio hypercall where we pass the virtio hypercall number as one of the arguments or something like that.
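A minimal sketch of that numbering scheme, purely for illustration. The helper names (virtio_hc_nr, virtio_hc_devid, virtio_hc_call) and the assumption that both the device ID and the call number fit in 16 bits are hypothetical; nothing like this is defined by virtio or KVM today.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: virtio device ID in the upper 16 bits,
 * per-device call number in the lower 16 bits. */
#define VIRTIO_HC_CALL_BITS 16u
#define VIRTIO_HC_CALL_MASK 0xffffu

/* Build a virtio hypercall number from a device ID and a call number. */
static inline uint32_t virtio_hc_nr(uint32_t devid, uint32_t call)
{
    /* Both fields are assumed to fit in 16 bits. */
    assert(devid <= 0xffffu && call <= VIRTIO_HC_CALL_MASK);
    return (devid << VIRTIO_HC_CALL_BITS) | call;
}

/* Recover the device ID that owns this hypercall subspace. */
static inline uint32_t virtio_hc_devid(uint32_t nr)
{
    return nr >> VIRTIO_HC_CALL_BITS;
}

/* Recover the per-device call number. */
static inline uint32_t virtio_hc_call(uint32_t nr)
{
    return nr & VIRTIO_HC_CALL_MASK;
}
```

A paravirt_ops backend could then pass the resulting number as an argument to whatever single hypercall the VMM provides, and the host side would use the decode helpers to route it to the right device.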

Not much of an argument, I know.


wrt. number of queues, 8 queues will consume 32 bytes of pci space if all you store is the ring pfn.
You also at least need a num argument, which takes you to 48 or 64 depending on whether you care about strange formatting. 8 queues may not be enough either. Eric and I have discussed whether the 9p virtio device should support multiple mounts per virtio device and, if so, whether each one should have its own queue. Any device that supports this sort of multiplexing will very quickly start using a lot of queues.
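The 32/48/64 figures can be checked with a rough sketch. The field widths here (a 32-bit ring pfn and a 16-bit num) and the struct names are assumptions chosen to match the arithmetic above, not an actual register layout; the packed attribute is a GCC/Clang extension.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { NR_QUEUES = 8 };

/* Only the ring pfn per queue: 4 bytes each. */
struct vq_pfn_only {
    uint32_t pfn;
};

/* pfn plus a 16-bit num, packed with no padding: 6 bytes each.
 * This is the "strange formatting" case (unaligned entries). */
struct vq_packed {
    uint32_t pfn;
    uint16_t num;
} __attribute__((packed));

/* pfn plus num, padded so every entry stays 4-byte aligned: 8 bytes each. */
struct vq_padded {
    uint32_t pfn;
    uint16_t num;
    uint16_t reserved;
};

/* Total config space consumed by NR_QUEUES entries of the given size. */
static inline size_t vq_space(size_t per_queue)
{
    assert(per_queue > 0);
    return per_queue * NR_QUEUES;
}
```

With these assumed widths, vq_space() gives 32 bytes for the pfn-only layout, 48 for the packed layout, and 64 for the padded one.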
Make it appear as a pci function? (though my feeling is that multiple mounts should be different devices; we can then hotplug mountpoints).

We may run out of PCI slots though :-/

Then we can start selling virtio extension chassis.

:-) Do you know if there is a hard limit on the number of devices on a PCI bus? My concern was that it was limited by something stupid like an 8-bit identifier.

Regards,

Anthony Liguori
