RE: [PATCH v2 7/8] x86/mm: only check uniform after calling mtrr_type_lookup()

From: Michael Kelley (LINUX)
Date: Wed Feb 15 2023 - 14:39:07 EST


From: Juergen Gross <jgross@xxxxxxxx> Sent: Wednesday, February 15, 2023 5:40 AM
>
> On 13.02.23 02:08, Michael Kelley (LINUX) wrote:
> > From: Juergen Gross <jgross@xxxxxxxx> Sent: Wednesday, February 8, 2023 11:22
> PM
> >>
> >> Today pud_set_huge() and pmd_set_huge() test for the MTRR type to be
> >> WB or INVALID after calling mtrr_type_lookup(). Those tests can be
> >> dropped, as the only reason to not use a large mapping would be
> >> uniform being 0. Any MTRR type can be accepted as long as it applies
> >> to the whole memory range covered by the mapping, as the alternative
> >> would only be to map the same region with smaller pages instead using
> >> the same PAT type as for the large mapping.
> >>
> >> Suggested-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
> >> Signed-off-by: Juergen Gross <jgross@xxxxxxxx>
> >> ---
> >> arch/x86/mm/pgtable.c | 6 ++----
> >> 1 file changed, 2 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
> >> index e4f499eb0f29..7b9c5443d176 100644
> >> --- a/arch/x86/mm/pgtable.c
> >> +++ b/arch/x86/mm/pgtable.c
> >> @@ -721,8 +721,7 @@ int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t
> prot)
> >> u8 mtrr, uniform;
> >>
> >> mtrr = mtrr_type_lookup(addr, addr + PUD_SIZE, &uniform);
> >> - if ((mtrr != MTRR_TYPE_INVALID) && (!uniform) &&
> >> - (mtrr != MTRR_TYPE_WRBACK))
> >> + if (!uniform)
> >> return 0;
> >>
> >> /* Bail out if we are we on a populated non-leaf entry: */
> >> @@ -748,8 +747,7 @@ int pmd_set_huge(pmd_t *pmd, phys_addr_t addr,
> pgprot_t prot)
> >> u8 mtrr, uniform;
> >>
> >> mtrr = mtrr_type_lookup(addr, addr + PMD_SIZE, &uniform);
> >> - if ((mtrr != MTRR_TYPE_INVALID) && (!uniform) &&
> >> - (mtrr != MTRR_TYPE_WRBACK)) {
> >> + if (!uniform) {
> >> pr_warn_once("%s: Cannot satisfy [mem %#010llx-%#010llx] with a
> huge-page mapping due to MTRR override.\n",
> >> __func__, addr, addr + PMD_SIZE);
> >
> > I'm seeing this warning trigger in a normal Hyper-V guest (i.e., *not* an
> > SEV-SNP Confidential VM). The original filtering here based on
> > MTRR_TYPE_WRBACK appears to be hiding a bug in mtrr_type_lookup_variable()
> > where it incorrectly thinks an address range matches two different variable
> > MTRRs, and hence clears "uniform".
> >
> > Here are the variable MTRRs in the normal Hyper-V guest with 32 GiBytes
> > of memory:
> >
> > [ 0.043592] MTRR variable ranges enabled:
> > [ 0.048308] 0 base 000000000000 mask FFFF00000000 write-back
> > [ 0.057450] 1 base 000100000000 mask FFF000000000 write-back
>
> I've read the SDM chapter for MTRRs again. Doesn't #1 violate the requirements
> for MTRR settings? The SDM says:
>
> For ranges greater than 4 KBytes, each range must be of length 2^n and its
> base address must be aligned on a 2^n boundary, where n is a value equal to
> or greater than 12. The base-address alignment value cannot be less than its
> length. For example, an 8-KByte range cannot be aligned on a 4-KByte boundary.
> It must be aligned on at least an 8-KByte boundary.
>
> This makes the reasoning below wrong.

Argh. It sure looks like you are right. I just assumed the MTRRs coming from
Hyper-V were good. Shame on me. :-(

I've ping'ed the Hyper-V team to see what they say. But it's hard to see how
they could argue that these MTRRs are correctly formed. The Intel spec is
unambiguous.

Even if Hyper-V agrees that the MTRRs are wrong, a fix will take time to
propagate. In the meantime, it seems like the Linux mitigations could be
any of the following:

1) Keep the test for WB in pud_set_huge() and pmd_set_huge()

2) Remove the test, but have "uniform" set to 1 when multiple MTRRs are
matched but all have the same caching type, which you proposed to
solve Rick Edgecombe's problem. This is likely to paper over the
problem I saw with the Hyper-V MTRRs because the incorrectly matching
MTRRs would all be WB.

3) In *all* Hyper-V VMs (not just Confidential VMs), disable X86_FEATURE_MTRR
and use the new override to set the default type to WB. Hopefully we don't
have to do this, but I can submit a separate patch if it becomes necessary.

Michael