Re: [PATCH 1/7] KVM: x86: Retry page fault if MMU reload is pending and root has no sp

From: Sean Christopherson
Date: Fri Dec 10 2021 - 12:16:01 EST


On Fri, Dec 10, 2021, Paolo Bonzini wrote:
> On 12/10/21 17:01, Sean Christopherson wrote:
> > > KVM_REQ_MMU_RELOAD is raised after kvm->arch.mmu_valid_gen is fixed (of
> > > course, otherwise the other CPU might just not see any obsoleted page
> > > from the legacy MMU), therefore any check on KVM_REQ_MMU_RELOAD is just
> > > advisory.
> >
> > I disagree. IMO, KVM should not be installing SPTEs into obsolete shadow pages,
> > which is what continuing on allows. I don't _think_ it's problematic, but I do
> > think it's wrong.
> >
> > [...] Eh, for all intents and purposes, KVM_REQ_MMU_RELOAD very much says
> > special roots are obsolete. The root will be unloaded, i.e. will no
> > longer be used, i.e. is obsolete.
>
> I understand that---but it takes some unspoken details to understand that.

Eh, it takes just as many unspoken details to understand why it's safe-ish to
install SPTEs into an obsolete shadow page.

> In particular that both kvm_reload_remote_mmus and is_page_fault_stale are
> called under mmu_lock write-lock, and that there's no unlock between
> updating mmu_valid_gen and calling kvm_reload_remote_mmus.
>
> (This also suggests, for the other six patches, keeping
> kvm_reload_remote_mmus and just moving it to arch/x86/kvm/mmu/mmu.c, with an
> assertion that the MMU lock is held for write).
>
> But since we have a way forward for having no special roots to worry about,
> it seems an unnecessary overload for 1) a patch that will last one or two
> releasees at most

Yeah, I don't disagree, which is why I'm not totally opposed to punting this and
naturally fixing it by allocating shadow pages for the special roots. But this
code needs to be modified by Jiangshan's series either way, so it's not like we're
saving anything meaningful.

> 2) a case that has been handled in the inefficient way forever.

I don't care about inefficiency, I'm worried about correctness. It's extremely
unlikely this fixes a true bug in the legacy MMU, but there's also no real
downside to adding the check.

Anyways, either way is fine.