Re: [PATCH 02/16] KVM: x86/mmu: Introduce a slot flag to zap only slot leafs on slot deletion

From: Sean Christopherson
Date: Wed May 15 2024 - 18:47:30 EST

Next message: Theodore Ts'o: "Re: KASAN: use-after-free in ext4_find_extent in v6.9"
Previous message: Dave Airlie: "Re: [git pull] drm for 6.10-rc1"
In reply to: Edgecombe, Rick P: "Re: [PATCH 02/16] KVM: x86/mmu: Introduce a slot flag to zap only slot leafs on slot deletion"
Next in thread: Huang, Kai: "Re: [PATCH 02/16] KVM: x86/mmu: Introduce a slot flag to zap only slot leafs on slot deletion"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Wed, May 15, 2024, Rick P Edgecombe wrote:
> On Wed, 2024-05-15 at 13:05 -0700, Sean Christopherson wrote:
> > On Wed, May 15, 2024, Rick P Edgecombe wrote:
> > > So rather than try to optimize zapping more someday and hit similar
> > > issues, let userspace decide how it wants it to be done. I'm not sure of
> > > the actual performance tradeoffs here, to be clear.
> >
> > ...unless someone is able to root cause the VFIO regression, we don't have
> > the luxury of letting userspace give KVM a hint as to whether it might be
> > better to do a precise zap versus a nuke-and-pave.
>
> Pedantry... I think it's not a regression if something requires a new flag. It
> is still a bug though.

Heh, pedantry denied. I was speaking in the past tense about the VFIO failure,
which was a regression as I changed KVM behavior without adding a flag.

> The thing I worry about on the bug is whether it might have been due to a guest
> having access to a page it shouldn't have. In which case we can't give the user
> the opportunity to create it.
>
> I didn't gather there was any proof of this. Did you have any hunch either way?

I doubt the guest was able to access memory it shouldn't have been able to access.
But that's a moot point, as the bigger problem is that, because we have no idea
what's at fault, KVM can't make any guarantees about the safety of such a flag.

TDX is a special case where we don't have a better option (we do have other options,
they're just horrible). In other words, the choice is essentially to either:

(a) cross our fingers and hope that the problem is limited to shared memory
with QEMU+VFIO, i.e. that it doesn't affect TDX private memory.

or

(b) don't merge TDX until the original regression is fully resolved.

FWIW, I would love to root cause and fix the failure, but I don't know how feasible
that is at this point.

> > And more importantly, it would be a _hint_, not the hard requirement that TDX
> > needs.
> >
> > > That said, a per-vm knob is easier for TDX purposes.
>
> If we don't want it to be a mandate from userspace, then we need to do some
> per-vm checking in TDX's case anyway. In which case we might as well go with
> the per-vm option for TDX.
>
> You had said up the thread, why not opt all non-normal VMs into the new
> behavior. It will work great for TDX. But why do SEV and others want this
> automatically?

Because I want flexibility in KVM, i.e. I want to take the opportunity to try and
break away from KVM's godawful ABI. It might be a pipe dream, as keying off the
VM type obviously has similar risks to giving userspace a memslot flag. The one
sliver of hope is that the VM types really are quite new (though less so for SEV
and SEV-ES), whereas a memslot flag would be easily applied to existing VMs.
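
As a rough illustration of the two approaches being weighed above, here is a
minimal standalone sketch. It is not code from the patch or from KVM: the enum,
the flag bit, the structs, and both helper names are hypothetical stand-ins
chosen only to show where the decision would live in each case.

```c
#include <stdbool.h>

/* Hypothetical VM types; stand-ins, not KVM's actual UAPI values. */
enum vm_type { VM_DEFAULT, VM_SEV, VM_SEV_ES, VM_TDX };

/* Hypothetical per-memslot flag bit set by userspace. */
#define SLOT_FLAG_ZAP_LEAFS_ONLY (1u << 0)

/* Hypothetical, heavily simplified VM and memslot descriptors. */
struct vm { enum vm_type type; };
struct memslot { unsigned int flags; };

/*
 * Approach 1 (memslot flag): userspace chooses per slot. Any existing VM
 * can set the flag, so the behavior becomes ABI for old and new VMs alike.
 */
static bool zap_leafs_only_by_flag(const struct memslot *slot)
{
	return (slot->flags & SLOT_FLAG_ZAP_LEAFS_ONLY) != 0;
}

/*
 * Approach 2 (VM type): the kernel decides. Default VMs keep the old
 * "zap everything" behavior; every non-default type gets the precise zap.
 */
static bool zap_leafs_only_by_vm_type(const struct vm *vm)
{
	return vm->type != VM_DEFAULT;
}
```

With the second helper, a default VM never gets the new behavior no matter what
userspace passes in, which is the property that limits exposure to the newer VM
types.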
