Re: [PATCH RFC 0/4] 5-level EPT

From: Yu Zhang
Date: Fri Mar 10 2017 - 03:07:32 EST




On 3/9/2017 10:16 PM, Paolo Bonzini wrote:

On 17/01/2017 03:18, Li, Liang Z wrote:
On 29/12/2016 10:25, Liang Li wrote:
x86-64 currently limits the physical address width to 46 bits, which
can support 64 TiB of memory. Some vendors need to support more for
some use cases. Intel plans to extend the physical address width to
52 bits in some future products.

The current EPT implementation supports only 4-level page tables, which
cover at most 48 bits of physical address width, so the EPT needs to be
extended to 5 levels to support a 52-bit physical address width.
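
The widths quoted above follow from simple arithmetic. The sketch below is only an illustration, assuming 4 KiB pages (12 offset bits) and 9 index bits per table level; the helper names are hypothetical and do not come from the patchset.

```c
#include <stdint.h>

/* Each paging level (regular or EPT) indexes 9 address bits;
 * the low 12 bits are the offset within a 4 KiB page. */
enum { PAGE_OFFSET_BITS = 12, INDEX_BITS_PER_LEVEL = 9 };

/* Address bits translated by a page-table walk of the given depth
 * (hypothetical helper, for illustration only). */
static inline unsigned int walk_addr_bits(unsigned int levels)
{
    return PAGE_OFFSET_BITS + INDEX_BITS_PER_LEVEL * levels;
}

/* Bytes addressable with the given address width (requires bits < 64). */
static inline uint64_t addressable_bytes(unsigned int bits)
{
    return (uint64_t)1 << bits;
}
```

With these helpers, walk_addr_bits(4) is 48 and walk_addr_bits(5) is 57, so a 4-level EPT cannot map a 52-bit guest physical address while a 5-level EPT can. Likewise, addressable_bytes(46) is 64 TiB and addressable_bytes(52) is 4 PiB.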

This patchset has been tested in the SIMICS environment with a 5-level
paging guest, which was patched with Kirill's patchset for enabling
5-level page tables, with both EPT and shadow paging. I covered only
the boot process; the guest boots successfully.

Some parts of this patchset can be improved. Any comments on the
design or the patches would be appreciated.

I will review the patches. They seem fairly straightforward.

However, I am worried about the design of the 5-level page table feature
with respect to migration.

Processors that support the new LA57 mode can write linear addresses that
are 57-canonical but 48-noncanonical to some registers even when LA57 mode
is inactive. This is true even of unprivileged instructions, in particular
WRFSBASE/WRGSBASE.

This is fairly bad because, if a guest performs such a write (because of a bug
or because of malice), it will not be possible to migrate the virtual machine to
a machine that lacks LA57 mode.
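
To make "57-canonical/48-noncanonical" concrete, here is a minimal sketch of the canonicality test. An address is canonical for a given linear-address width when all bits from the top implemented bit up to bit 63 are identical. The function names are hypothetical and this is not code from KVM or from the patchset.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if bits 63..(vaddr_bits - 1) of la are all zeros or all ones,
 * i.e. la is canonical for that linear-address width.
 * Requires 1 <= vaddr_bits <= 64. */
static inline bool is_canonical(uint64_t la, unsigned int vaddr_bits)
{
    uint64_t upper = la >> (vaddr_bits - 1);
    uint64_t all_ones = UINT64_MAX >> (vaddr_bits - 1);

    return upper == 0 || upper == all_ones;
}

/* True for the problematic values described above: accepted on an
 * LA57-capable CPU, rejected on a CPU with 48-bit linear addresses
 * (hypothetical helper, for illustration only). */
static inline bool is_la57_only_canonical(uint64_t la)
{
    return is_canonical(la, 57) && !is_canonical(la, 48);
}
```

For example, 0x0000800000000000 has bit 47 set and bits 63..48 clear, so it passes the 57-bit check but fails the 48-bit one. A base register holding such a value could not be restored on a destination host without LA57.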

Ordinarily, hypervisors trap CPUID to hide features that are only present in
some processors of a heterogeneous cluster, and the hypervisor also traps
for example CR4 writes to prevent enabling features that were masked away.
In this case, however, the only way for the hypervisor to prevent the write
would be to run the guest with CR4.FSGSBASE=0 and trap all executions of
WRFSBASE/WRGSBASE. This might have negative effects on performance for
workloads that use the instructions.

Of course, this is a problem even without your patches. However, I think it
should be addressed first. I am seriously thinking of blacklisting FSGSBASE
completely on LA57 machines until the above is fixed in hardware.

Paolo

The issue has already been forwarded to the hardware guys; we are still waiting for feedback.

Going to review this now. Any news?

Thanks for your review, Paolo.
This is Yu Zhang from Intel. I'll pick up this 5-level EPT feature and will try to address your comments next. :-)
For now I am studying Liang's code and trying to bring up a VM with Kirill's native 5-level paging code integrated.

Yu
Paolo