Re: [RFC PATCH 3/3] kvm: Add VFIO device for handling IOMMU cache coherency

From: Alexey Kardashevskiy
Date: Fri Sep 13 2013 - 04:49:24 EST


On 09/13/2013 07:23 AM, Alex Williamson wrote:
> So far we've succeeded at making KVM and VFIO mostly unaware of each
> other, but there's an important point where that breaks down. Intel
> VT-d hardware may or may not support snoop control. When snoop
> control is available, intel-iommu promotes No-Snoop transactions on
> PCIe to be cache coherent. That allows KVM to handle things like the
> x86 WBINVD opcode as a nop. When the hardware does not support this,
> KVM must implement a hardware visible WBINVD for the guest.
>
> We could simply let userspace tell KVM how to handle WBINVD, but it's
> privileged for a reason. Allowing an arbitrary user to enable
> physical WBINVD gives them more access to the hardware. Previously,
> this has only been enabled for guests supporting legacy PCI device
> assignment. In such cases it's necessary for proper guest execution.
> We therefore create a new KVM-VFIO virtual device. The user can add
> and remove VFIO groups to this device via file descriptors. KVM
> makes use of the VFIO external user interface to validate that the
> user has access to physical hardware and gets the coherency state of
> the IOMMU from VFIO. This provides equivalent functionality to
> legacy KVM assignment, while keeping (nearly) all the bits isolated.
>
> The one intrusion is the resulting flag indicating the coherency
> state. For this RFC it's placed on the x86 kvm_arch struct, however
> I know POWER has interest in using the VFIO external user interface,
> and I'm hoping we can share a common KVM-VFIO device. Perhaps they
> care about No-Snoop handling as well or the code can be #ifdef'd.


POWER (at least book3s, i.e. "server"; I am not sure about the others)
does not support this cache-non-coherent stuff at all.

Regarding reusing this device with the external API for POWER: I posted a
patch which introduces a KVM device to link KVM with the IOMMU. Besides
keeping the list of groups registered in KVM, it also provides a way to
find a group by LIOBN (logical bus number), which is used in the DMA
map/unmap hypercalls. So in my case the kvm_vfio_group struct would need an
LIOBN, and it would be nice to have window_size there too (for a quick
boundary check). I am not sure we want to mix everything here.

The patch is "[PATCH v10 12/13] KVM: PPC: Add support for IOMMU in-kernel
handling" if you are interested (see kvmppc_spapr_tce_iommu_device).
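
To illustrate the kind of per-group data I mean, here is a rough userspace
sketch, not the code from the patch. The struct name, field names, and both
helper functions are made up for this example, and it assumes the DMA window
starts at bus address 0, which may not hold in general:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-group record; names are illustrative, not from the patch. */
struct tce_group_entry {
	uint64_t liobn;        /* logical bus number passed in DMA hypercalls */
	uint64_t window_size;  /* DMA window size in bytes */
	int group_fd;          /* stand-in for a reference to the VFIO group */
};

/* Linear search for the group registered under a given LIOBN. */
const struct tce_group_entry *
find_group_by_liobn(const struct tce_group_entry *tbl, size_t n, uint64_t liobn)
{
	for (size_t i = 0; i < n; i++)
		if (tbl[i].liobn == liobn)
			return &tbl[i];
	return NULL;
}

/*
 * Quick boundary check: does the I/O bus address fall inside the window?
 * Assumes the window starts at 0 (an assumption made for this sketch).
 */
bool ioba_in_window(const struct tce_group_entry *e, uint64_t ioba)
{
	return ioba < e->window_size;
}
```

A map/unmap handler would look up the entry by LIOBN first and reject the
request early if the lookup fails or the address is outside the window.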



--
Alexey