Re: [patch 5/6] x86/fpu: Provide fpu_update_guest_xcr0/xfd()

From: Paolo Bonzini
Date: Wed Dec 15 2021 - 05:27:55 EST


On 12/15/21 11:09, Thomas Gleixner wrote:
> Let's assume the restore order is XSTATE, XCR0, XFD:
>
> XSTATE has everything in init state, which means the default
> buffer is good enough
>
> XCR0 has everything enabled including AMX, so the buffer is
> expanded
>
> XFD has AMX disable set, which means the buffer expansion was
> pointless
>
> If we go there, then we can just use a full expanded buffer for KVM
> unconditionally and be done with it. That spares a lot of code.

If we decide to use a full expanded buffer as soon as KVM_SET_CPUID2 is done, that would work for me. Basically KVM_SET_CPUID2 would:

- check the bits from CPUID[0xD] against the permissions requested via the GUEST_PERM prctl

- return with -ENXIO or whatever if any dynamic bits were not requested

- otherwise call fpstate_realloc if there are any dynamic bits requested
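
To make the three steps above concrete, here is a minimal user-space sketch of the check, not actual kernel code. All sketch_* names are hypothetical, the fpstate_realloc stand-in only counts calls, and I am assuming AMX tile data (XCR0 bit 18) is the only dynamic feature:

```c
#include <errno.h>
#include <stdint.h>

/* Assumption: AMX tile data (XCR0 bit 18) is the only dynamic feature. */
#define SKETCH_XFEATURE_DYNAMIC (UINT64_C(1) << 18)

/* Hypothetical stand-in for fpstate_realloc(): it only counts the calls. */
static int sketch_realloc_calls;

static int sketch_fpstate_realloc(uint64_t xfeatures)
{
	(void)xfeatures;
	sketch_realloc_calls++;
	return 0;
}

/*
 * cpuid_xfeatures: feature bits the guest CPUID[0xD] exposes.
 * guest_perm:      bits previously requested with the GUEST_PERM prctl.
 */
static int sketch_set_cpuid2_check(uint64_t cpuid_xfeatures, uint64_t guest_perm)
{
	uint64_t dynamic = cpuid_xfeatures & SKETCH_XFEATURE_DYNAMIC;

	/* A dynamic bit is exposed to the guest but was never requested. */
	if (dynamic & ~guest_perm)
		return -ENXIO;

	/* Dynamic bits exposed and permitted: expand the buffer right away. */
	if (dynamic)
		return sketch_fpstate_realloc(cpuid_xfeatures);

	/* No dynamic bits: the default buffer is good enough. */
	return 0;
}
```

With this shape the buffer size is settled once at KVM_SET_CPUID2 time and never changes afterwards, which is what makes the later XCR0/XFD handling unnecessary.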

Considering that in practice all Linux guests with AMX would have XFD passthrough (because if there's no prctl, Linux keeps AMX disabled in XFD), this removes the need to do all the #NM handling too. Just make XFD passthrough if it can ever be set to a nonzero value. This costs an RDMSR per vmexit even if neither the host nor the guest ever use AMX.

That said, if we don't want to use a full expanded buffer, I don't expect any issue with requiring XFD first, then XCR0, then XSAVE. As Juan said, QEMU first gets everything from the migration stream and then restores it. So yes, the QEMU code is complicated and messy but we can change the order without breaking migration from old to new QEMU. QEMU also forbids migration if there's any CPUID feature that it does not understand, so the old versions that don't understand AMX won't migrate it (with no possibility to override).
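
A toy model of why the order matters, not kernel code. The model_* names are hypothetical, and I am assuming the buffer gets expanded whenever a dynamic feature is enabled in XCR0 and not disabled in XFD:

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: AMX tile data (XCR0 bit 18) is the only dynamic feature. */
#define MODEL_XFEATURE_DYNAMIC (UINT64_C(1) << 18)

/* Hypothetical per-vCPU state; all zero means the default buffer is in use. */
struct model_vcpu {
	uint64_t xcr0;
	uint64_t xfd;
	bool expanded;
};

/* Expand once a dynamic feature is enabled in XCR0 and not disabled in XFD. */
static void model_update(struct model_vcpu *v)
{
	if (v->xcr0 & ~v->xfd & MODEL_XFEATURE_DYNAMIC)
		v->expanded = true;
}

static void model_set_xfd(struct model_vcpu *v, uint64_t xfd)
{
	v->xfd = xfd;
	model_update(v);
}

static void model_set_xcr0(struct model_vcpu *v, uint64_t xcr0)
{
	v->xcr0 = xcr0;
	model_update(v);
}
```

In this model, restoring XFD (with the AMX bit set) before XCR0 leaves expanded false, while restoring XCR0 first expands the buffer before XFD gets a chance to say it was not needed.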

Paolo