Re: [PATCH Part2 RFC v2 10/37] x86/fault: Add support to handle the RMP fault for kernel address

From: Brijesh Singh
Date: Mon May 03 2021 - 11:49:45 EST



On 5/3/21 10:03 AM, Andy Lutomirski wrote:
> On Mon, May 3, 2021 at 7:44 AM Dave Hansen <dave.hansen@xxxxxxxxx> wrote:
>> On 4/30/21 5:37 AM, Brijesh Singh wrote:
>>> When SEV-SNP is enabled globally, a write from the host goes through the
>>> RMP check. When the host writes to pages, hardware checks the following
>>> conditions at the end of page walk:
>>>
>>> 1. Assigned bit in the RMP table is zero (i.e page is shared).
>>> 2. If the page table entry that gives the sPA indicates that the target
>>> page size is a large page, then all RMP entries for the 4KB
>>> constituting pages of the target must have the assigned bit 0.
>>> 3. Immutable bit in the RMP table is not zero.
>>>
>>> The hardware will raise page fault if one of the above conditions is not
>>> met. A host should not encounter the RMP fault in normal execution, but
>>> a malicious guest could trick the hypervisor into it. e.g., a guest does
>>> not make the GHCB page shared, on #VMGEXIT, the hypervisor will attempt
>>> to write to GHCB page.
>> Is that the only case which is left? If so, why don't you simply split
>> the direct map for GHCB pages before giving them to the guest? Or, map
>> them with vmap() so that the mapping is always 4k?
> If I read Brijesh's message right, this isn't about 4k. It's about
> the guest violating host expectations about the page type.
>
> I need to go and do a full read of all the relevant specs, but I think
> there's an analogous situation in TDX: if the host touches guest
> private memory, the TDX hardware will get extremely angry (more so
> than AMD hardware). And, if I have understood this patch correctly,
> it's fudging around the underlying bug by intentionally screwing up
> the RMP contents to avoid a page fault. Assuming I've understood
> everything correctly (a big if!), then I think this is backwards. The
> host kernel should not ever access guest memory without a plan in
> place to handle failure. We need real accessors, along the lines of
> copy_from_guest() and copy_to_guest().

You understood it correctly. It's an underlying bug in either the host or
the guest that may cause the host to access guest private pages. If that
happens, avoiding a host crash is much preferred (especially when it's a
guest kernel bug).