[RFC PATCH 0/6] KVM: X86: Add and use shadow page with level promoted or acting as pae_root

From: Lai Jiangshan
Date: Fri Dec 10 2021 - 04:24:57 EST


From: Lai Jiangshan <laijs@xxxxxxxxxxxxxxxxx>

(Request for help with testing on an AMD machine with a 32 bit L1
hypervisor; see the information below.)

KVM handles root pages specially for these cases:

direct mmu (nonpaging for 32 bit guest):
    gCR0_PG=0
shadow mmu (shadow paging for 32 bit guest):
    gCR0_PG=1,gEFER_LMA=0,gCR4_PSE=0
    gCR0_PG=1,gEFER_LMA=0,gCR4_PSE=1
direct mmu (NPT for 32 bit host):
    hEFER_LMA=0
shadow nested NPT (for 32 bit L1 hypervisor):
    gCR0_PG=1,gEFER_LMA=0,gCR4_PSE=0,hEFER_LMA=0
    gCR0_PG=1,gEFER_LMA=0,gCR4_PSE=1,hEFER_LMA=0
    gCR0_PG=1,gEFER_LMA=0,gCR4_PSE={0|1},hEFER_LMA=1,hCR4_LA57={0|1}
shadow nested NPT (for 64 bit L1 hypervisor):
    gEFER_LMA=1,gCR4_LA57=0,hEFER_LMA=1,hCR4_LA57=1

These cases either use special roots, or match the condition
((mmu->shadow_root_level > mmu->root_level) && !mmu->direct_map)
(referred to as level promotion), or both.

All the cases use special roots except the last one.
Many cases do level promotion, including the last one.
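
As a rough illustration only (this is not code from the series), the
level promotion condition quoted above can be written as a small
predicate. The struct mmu_sketch and the helper is_level_promoted() are
hypothetical names made up for this sketch; only the three field names
mirror the ones used in the condition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the few MMU fields named in the condition.
 * This is an assumption for illustration, not the kernel's struct kvm_mmu.
 */
struct mmu_sketch {
	int root_level;        /* paging level of the guest page table */
	int shadow_root_level; /* paging level of the table KVM builds */
	bool direct_map;       /* true for direct (non-shadow) MMUs */
};

/* The condition from the cover letter, spelled out as a predicate. */
static inline bool is_level_promoted(const struct mmu_sketch *mmu)
{
	assert(mmu != NULL);
	return mmu->shadow_root_level > mmu->root_level && !mmu->direct_map;
}
```

Assuming the last case listed above maps to root_level = 4 (guest
without LA57) and shadow_root_level = 5 (host with LA57) with
direct_map = false, the predicate returns true. For any direct MMU it
returns false regardless of the two levels.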

When special roots are used, the root page is not backed by a
kvm_mmu_page, so it must be treated specially. Not all places take
this into account, and Sean is adding some code to check for these
special roots.

When level promotion is in effect, KVM always handles it silently.

These treatments cause problems or complications; see the changelog
of each patch.

I wrote these patches while reviewing all the uses of shadow_root_level
and root_level. Some of the resulting patches have already been sent
and accepted. Patches 3-6 are more complicated, so they were held back.
Patches 1 and 2 were sent earlier: patch 1 was rejected, but I still
think it is good; patch 2 was said to be accepted, but it does not show
up in kvm/queue. Patches 3-6 conflict with patches 1 and 2, so patches
1 and 2 are included here too.

Another reason patches 3-6 were held back is that they have not been
tested with the shadow NPT cases listed above, because I have neither
guest images that can act as a 32 bit L1 hypervisor nor access to an
AMD machine with 5 level paging. I'm a bit reluctant to ask for the
resources, so I am sending the patches in the hope that someone can
test and modify them. At the least, they offer some ideas and reveal
problems in the existing code and in the AMD cases.
(*Request For Help* here.)

These patches have been tested with all the cases except the shadow-NPT
cases. The code coverage is believed to be more than 95%: hundreds of
lines of code related to shadow-NPT are removed and replaced with
common role.pae_root and role.level_promoted code, only 8 lines of code
are added for shadow-NPT, and only 2 lines of code are not covered by
my tests.

Sean also found the problem with the last case listed above and asked
questions about it in a reply[1] to one of my emails. I hope this
patchset can serve as my reply to his questions about such a
complicated case.

If special roots are removed and the PAE page is write-protected, some
more cleanups become possible.

[1]: https://lore.kernel.org/lkml/YbFY533IT3XSIqAK@xxxxxxxxxx/

Lai Jiangshan (6):
  KVM: X86: Check root_level only in fast_pgd_switch()
  KVM: X86: Walk shadow page starting with shadow_root_level
  KVM: X86: Add arguement gfn and role to kvm_mmu_alloc_page()
  KVM: X86: Introduce role.level_promoted
  KVM: X86: Alloc pae_root shadow page
  KVM: X86: Use level_promoted and pae_root shadow page for 32bit guests

 arch/x86/include/asm/kvm_host.h |   9 +-
 arch/x86/kvm/mmu/mmu.c          | 440 ++++++++++----------------------
 arch/x86/kvm/mmu/mmu_audit.c    |  26 +-
 arch/x86/kvm/mmu/paging_tmpl.h  |  15 +-
 arch/x86/kvm/mmu/tdp_mmu.h      |   7 +-
 5 files changed, 164 insertions(+), 333 deletions(-)

--
2.19.1.6.gb485710b