Re: [PATCH] x86/kvm/mmu: fix switch between root and guest MMUs

From: Vitaly Kuznetsov
Date: Fri Feb 22 2019 - 13:54:35 EST


Paolo Bonzini <pbonzini@xxxxxxxxxx> writes:

> On 22/02/19 17:45, Vitaly Kuznetsov wrote:
>> Commit 14c07ad89f4d ("x86/kvm/mmu: introduce guest_mmu") brought one subtle
>> change: previously, when switching back from L2 to L1, we were resetting
>> MMU hooks (like mmu->get_cr3()) in kvm_init_mmu() called from
>> nested_vmx_load_cr3() and now we do that in nested_ept_uninit_mmu_context()
>> when we re-target vcpu->arch.mmu pointer.
>> The change itself looks logical: if nested_ept_init_mmu_context() changes
>> something than nested_ept_uninit_mmu_context() restores it back. There is,
>> however, one thing: the following call chain:
>>
>> nested_vmx_load_cr3()
>> kvm_mmu_new_cr3()
>> __kvm_mmu_new_cr3()
>> fast_cr3_switch()
>> cached_root_available()
>>
>> now happens with MMU hooks pointing to the new MMU (root MMU in our case)
>> while previously it was happening with the old one. cached_root_available()
>> tries to stash current root but it is incorrect to read current CR3 with
>> mmu->get_cr3(), we need to use old_mmu->get_cr3() which in case we're
>> switching from L2 to L1 is guest_mmu. (BTW, in shadow page tables case this
>> is a non-issue because we don't switch MMU).
>>
>> While we could've tried to guess that we're switching between MMUs and call
>> the right ->get_cr3() from cached_root_available() this seems to be overly
>> complicated. Instead, just stash the corresponding CR3 when setting
>> root_hpa and make cached_root_available() use the stashed value.
>>
>> Fixes: 14c07ad89f4d ("x86/kvm/mmu: introduce guest_mmu")
>> Signed-off-by: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx>
>
> Is the bug latent until the other patch? Or are they completely
> separate issues?
>

This one was reported as https://bugs.launchpad.net/qemu/+bug/1813165
(and I still don't completely understand why it is only SMM which is
broken - or is it?). While working on it I noticed that
fast_cr3_switch() actually doesn't work (even after this patch when we
stop putting incorrect values in the cache with
cached_root_available()). So in the end they seem to be fairly
independent.

--
Vitaly