On Wed, Jan 05, 2022, Lai Jiangshan wrote:
On Wed, Jan 5, 2022 at 5:54 AM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
default_pae_pdpte is needed because the CPU expects the PAE PDPTEs to be
present at VM-Enter.
That's incorrect. Neither Intel nor AMD requires PDPTEs to be present. Not-present
is perfectly ok; present with reserved bits set is what's not allowed.
Intel SDM:
A VM entry that checks the validity of the PDPTEs uses the same checks that are
used when CR3 is loaded with MOV to CR3 when PAE paging is in use[7]. If MOV to CR3
would cause a general-protection exception due to the PDPTEs that would be loaded
(e.g., because a reserved bit is set), the VM entry fails.
7. This implies that (1) bits 11:9 in each PDPTE are ignored; and (2) if bit 0
(present) is clear in one of the PDPTEs, bits 63:1 of that PDPTE are ignored.
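To make the rule above concrete, here is a rough sketch of the check as I read the SDM text. This is not KVM's actual code: pdpte_rsvd_mask() and pdpte_is_valid() are hypothetical names, and the reserved-bit layout (bits 2:1, bits 8:5, and bits 63:MAXPHYADDR) is taken from the SDM's PAE PDPTE format, with MAXPHYADDR passed in by the caller.

```c
#include <stdbool.h>
#include <stdint.h>

#define PDPTE_PRESENT	(1ULL << 0)

/* Mask with bits lo..hi (inclusive) set; empty if lo > hi. */
static uint64_t bits(int lo, int hi)
{
	if (lo > hi)
		return 0;
	return (~0ULL >> (63 - (hi - lo))) << lo;
}

/*
 * Reserved bits of a PAE PDPTE: 2:1, 8:5 and 63:MAXPHYADDR.
 * Bits 11:9 are ignored, so they are deliberately not in the mask.
 */
static uint64_t pdpte_rsvd_mask(int maxphyaddr)
{
	return bits(1, 2) | bits(5, 8) | bits(maxphyaddr, 63);
}

/*
 * A not-present PDPTE is always acceptable, bits 63:1 are ignored.
 * A present PDPTE is acceptable only if no reserved bit is set.
 */
static bool pdpte_is_valid(uint64_t pdpte, int maxphyaddr)
{
	if (!(pdpte & PDPTE_PRESENT))
		return true;

	return !(pdpte & pdpte_rsvd_mask(maxphyaddr));
}
```

With this, pdpte_is_valid(0, 46) is true (not present is fine), while a present PDPTE with bit 1 set, e.g. pdpte_is_valid(0x1003, 46), is false.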
But in practice, the VM entry fails if the present bit is not set in the
PDPTE for the linear address being accessed (at least when EPT is enabled). The
host KVM complains and dumps the VMCS state.
That doesn't make any sense. If EPT is enabled, KVM should never use a pae_root.
The vmcs.GUEST_PDPTRn fields are in play, but those shouldn't derive from KVM's
shadow page tables.
And I doubt there is a VMX ucode bug at play, as KVM currently uses '0' in its
shadow page tables for not-present PDPTEs.
If you can post/provide the patches that lead to VM-Fail, I'd be happy to help
debug.