Re: [RFC PATCH 1/2] KVM: x86: Add a new system attribute for dynamic XSTATE component

From: Chang S. Bae
Date: Wed Aug 24 2022 - 18:49:26 EST


On 8/24/2022 2:42 PM, Sean Christopherson wrote:
On Tue, Aug 23, 2022, Chang S. Bae wrote:
== Background ==

A set of architecture-specific arch_prctl() options controls the use of
dynamic XSTATE components in VCPUs. Userspace VMMs may interact with the host
using ARCH_GET_XCOMP_GUEST_PERM and ARCH_REQ_XCOMP_GUEST_PERM.

However, they are separated from the KVM API. KVM may select features that
the host supports and advertise them through the KVM_X86_XCOMP_GUEST_SUPP
attribute.

== Problem ==

QEMU [1] queries the features through the KVM API instead of using the x86
arch_prctl() option. But it still needs to use arch_prctl() to request the
permission. This step may be fragile because nothing guarantees that the
request complies with the KVM policy.

But backdooring through KVM doesn't prevent userspace from walking in through
the front door (arch_prctl()), i.e. this doesn't protect the kernel in any way.

No, I don't think backdooring is established in this proposal. The body of the arch_prctl() support is encapsulated inside of the x86 core code. KVM is simply calling it like arch_prctl() does.

KVM needs to ensure that _KVM_ doesn't screw up and let userspace use features
that KVM doesn't support. The kernel's restrictions on using features go on
top, i.e. KVM must behave correctly irrespective of kernel restrictions.

Maybe this is a policy decision. I don't think that ARCH_REQ_XCOMP_GUEST_PERM goes away with this. Userspace may still use the arch_prctl() set. But then it would make more sense, and be more consistent, to use ARCH_GET_XCOMP_SUPP in the first place instead of KVM_X86_XCOMP_GUEST_SUPP, no?

If QEMU wants to assert that it didn't misconfigure itself, it can assert on the
config in any number of ways, e.g. assert that ARCH_GET_XCOMP_GUEST_PERM is a
subset of KVM_X86_XCOMP_GUEST_SUPP at the end of kvm_request_xsave_components().

Yes, but I guess the new attribute can make it simpler.
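
For reference, the subset check suggested above could look roughly like the sketch below. It is only an illustration, not code from QEMU or from this series: the helper names are hypothetical, the fallback values for ARCH_GET_XCOMP_GUEST_PERM and KVM_X86_XCOMP_GUEST_SUPP are assumed to match the uapi headers, and group 0 on the /dev/kvm fd is assumed to be where the attribute lives. The two query helpers are built only on Linux/x86-64.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* for syscall() */
#endif
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when every XSTATE bit set in perm is also set in supp. */
bool xcomp_perm_is_subset(uint64_t perm, uint64_t supp)
{
	return (perm & ~supp) == 0;
}

/* Bits that were permitted but are not advertised by KVM (0 if none). */
uint64_t xcomp_perm_excess(uint64_t perm, uint64_t supp)
{
	return perm & ~supp;
}

/* Hypothetical helper: abort if the permitted mask exceeds KVM's mask. */
void assert_guest_perm_within_kvm_supp(uint64_t perm, uint64_t supp)
{
	assert(xcomp_perm_is_subset(perm, supp));
}

#if defined(__linux__) && defined(__x86_64__)
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <linux/kvm.h>

/* Fallbacks for older headers; values assumed to match the uapi ones. */
#ifndef ARCH_GET_XCOMP_GUEST_PERM
#define ARCH_GET_XCOMP_GUEST_PERM 0x1024
#endif
#ifndef KVM_X86_XCOMP_GUEST_SUPP
#define KVM_X86_XCOMP_GUEST_SUPP 0
#endif

/* Read the guest permission bitmap of this process via arch_prctl(). */
int query_guest_perm(uint64_t *perm)
{
	return (int)syscall(SYS_arch_prctl, ARCH_GET_XCOMP_GUEST_PERM, perm);
}

/* Read KVM's supported guest XSTATE mask; kvm_fd is an open /dev/kvm. */
int query_kvm_guest_supp(int kvm_fd, uint64_t *supp)
{
	struct kvm_device_attr attr = {
		.group = 0,	/* assumption: system attribute group 0 */
		.attr = KVM_X86_XCOMP_GUEST_SUPP,
		.addr = (uint64_t)(uintptr_t)supp,
	};

	return ioctl(kvm_fd, KVM_GET_DEVICE_ATTR, &attr);
}
#endif
```

A VMM would call the two query helpers, check both return values, and then pass the results to assert_guest_perm_within_kvm_supp(). The check is a plain bitmask comparison, so it works the same whichever interface granted the permission.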

Thanks,
Chang