Re: [patch V3 16/22] genirq/msi: Provide new domain id based interfaces for freeing interrupts

From: David Woodhouse
Date: Tue Jan 17 2023 - 03:23:56 EST


On Mon, 2023-01-16 at 20:49 +0100, Thomas Gleixner wrote:
> David!
>
> On Mon, Jan 16 2023 at 19:28, David Woodhouse wrote:
> > On Mon, 2023-01-16 at 20:22 +0100, Thomas Gleixner wrote:
> > > > Tested-by: David Woodhouse <dwmw@xxxxxxxxxxxx>
> > > >
> > > > Albeit only under qemu with
> > > > https://git.infradead.org/users/dwmw2/qemu.git/shortlog/refs/heads/xenfv
> > > > and not under real Xen.
> > >
> > > Five levels of emulation. What could possibly go wrong?
> >
> > It's the opposite — this is what happened when I threw my toys out of
> > the pram and said, "You're NOT doing that with nested virtualization!".
> >
> > One level of emulation. We host guests that think they're running on
> > Xen, directly in QEMU/KVM by handling the hypercalls and event
> > channels, grant tables, etc.
> >
> > We virtualised Xen itself :)
>
> Groan. Can we please agree on *one* hypervisor instead of growing
> emulators for all other hypervisors in each of them :)

Hey, we did work across KVM, Xen and even Hyper-V to make sure the
Extended Destination ID in MSI supports 32Ki vCPUs the *same* way on
each guest. Be thankful for small mercies!

And the code to support Xen guests natively in KVM is *fairly* minimal;
we allow userspace to catch hypercalls, and do a little bit of the fast
path of IRQ delivery because we really don't want to be bouncing out to
the userspace VMM for IPIs etc.

As for qemu, emulating environments that you may not have access to in
real hardware is its raison d'être, isn't it?

And agreeing on one hypervisor — that's what we're doing. But the
*administration* is the far more important part. We're allowing people
to standardise on KVM, and to focus on the administration and security
of only Linux and KVM.

But there are still huge numbers of of virtual machine images out there
which are configured to run on Xen. Their root disk is /dev/xvda, the
network device they have configured is vif0.

In some ways it's theoretically just as easy as telling all those folks
"well, you just need to install an NVMe driver and a new network card
driver". Except it isn't really, because that often ends up being
"rebuild it on a newer kernel and/or OS". And if the intern who set
this system up left three years ago and the company now depends on it
as critical infrastructure without really knowing it yet...

It isn't practical to tell people, "screw you, you can't run that any
more".

So we host them under Linux and they mostly look like native KVM guests
to the kernel, you stop breaking Xen guest mode, and everybody wins.

> > Now you have no more excuses for breaking Xen guest mode!
>
> No cookies, you spoilsport! :)

:)

Attachment: smime.p7s
Description: S/MIME cryptographic signature