Re: [PATCH v5 06/14] x86/ioremap: Support hypervisor specified range to map as encrypted

From: Sean Christopherson
Date: Fri Feb 10 2023 - 18:47:40 EST


On Fri, Feb 10, 2023, Michael Kelley (LINUX) wrote:
> From: Sean Christopherson <seanjc@xxxxxxxxxx> Sent: Friday, February 10, 2023 12:58 PM
> >
> > On Fri, Feb 10, 2023, Sean Christopherson wrote:
> > > On Fri, Feb 10, 2023, Dave Hansen wrote:
> > > > On 2/10/23 11:36, Borislav Petkov wrote:
> > > > >> One approach is to go with the individual device attributes for now.>> If the list
> > does grow significantly, there will probably be patterns
> > > > >> or groupings that we can't discern now. We could restructure into
> > > > >> larger buckets at that point based on those patterns/groupings.
> > > > > There's a reason the word "platform" is in cc_platform_has(). Initially
> > > > > we wanted to distinguish attributes of the different platforms. So even
> > > > > if y'all don't like CC_ATTR_PARAVISOR, that is what distinguishes this
> > > > > platform and it *is* one platform.
> > > > >
> > > > > So call it CC_ATTR_SEV_VTOM as it uses that technology or whatever. But
> > > > > call it like the platform, not to mean "I need this functionality".
> > > >
> > > > I can live with that. There's already a CC_ATTR_GUEST_SEV_SNP, so it
> > > > would at least not be too much of a break from what we already have.
> > >
> > > I'm fine with CC_ATTR_SEV_VTOM, assuming the proposal is to have something like:
> > >
> > > static inline bool is_address_range_private(resource_size_t addr)
> > > {
> > > if (cc_platform_has(CC_ATTR_SEV_VTOM))
> > > return is_address_below_vtom(addr);
> > >
> > > return false;
> > > }
> > >
> > > i.e. not have SEV_VTOM mean "I/O APIC and vTPM are private". Though I don't see
> > > the point in making it SEV vTOM specific or using a flag. Despite what any of us
> > > think about TDX paravisors, it's completely doable within the confines of TDX to
> > > have an emulated device reside in the private address space. E.g. why not
> > > something like this?
> > >
> > > static inline bool is_address_range_private(resource_size_t addr)
> > > {
> > > return addr < cc_platform_private_end;
> > > }
> > >
> > > where SEV fills in "cc_platform_private_end" when vTOM is enabled, and TDX does
> > > the same. Or wrap cc_platform_private_end in a helper, etc.
> >
> > Gah, forgot that the intent with TDX is to enumerate devices in their legacy
> > address spaces. So a TDX guest couldn't do this by default, but if/when Hyper-V
> > or some other hypervisor moves I/O APIC, vTPM, etc... into the TCB, the common
> > code would just work and only the hypervisor-specific paravirt code would need
> > to change.
> >
> > Probably need a more specific name than is_address_range_private() though, e.g.
> > is_mmio_address_range_private()?
>
> Maybe I'm not understanding what you are proposing, but in an SEV-SNP
> VM using vTOM, devices like the IO-APIC and TPM live at their usual guest
> physical addresses.

Ah, so as the cover letter says, the intent really is to treat vTOM as an
attribute bit. Sorry, I got confused by Boris's comment:

: What happens if the next silly HV guest scheme comes along and they do
: need more and different ones?

Based on that comment, I assumed the proposal to use CC_ATTR_SEV_VTOM was intended
to be a generic range-based thing, but it sounds like that's not the case.

IMO, using CC_ATTR_SEV_VTOM to infer anything about the state of I/O APIC or vTPM
is wrong. vTOM as a platform feature effectively says "physical address bit X
controls private vs. shared" (ignoring weird usage of vTOM). vTOM does not mean
I/O APIC and vTPM are private, that's very much a property of Hyper-V's current
generation of vTOM-based VMs.

Hardcoding this in the guest feels wrong. Ideally, we would have a way to enumerate
that a device is private/trusted, e.g. through ACPI. I'm guessing we already
missed the boat on that, so the next best thing is to do something like Michael
originally proposed in this patch and shove the "which devices are private" logic
into hypervisor-specific code, i.e. let Hyper-V figure out how to enumerate to its
guests which devices are shared.

I agree with Boris' comment that a one-off "other encrypted range" is a hack, but
that's just an API problem. The kernel already has hypervisor specific hooks (and
for SEV-ES even), why not expand that? That way figuring out which devices are
private is wholly contained in Hyper-V code, at least until there's a generic
solution for enumerating private devices, though that seems unlikely to happen
and will be a happy problem to solve if it does come about.

diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index a868b76cd3d4..08f65ed439d9 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -2682,11 +2682,16 @@ static void io_apic_set_fixmap(enum fixed_addresses idx, phys_addr_t phys)
{
pgprot_t flags = FIXMAP_PAGE_NOCACHE;

- /*
- * Ensure fixmaps for IOAPIC MMIO respect memory encryption pgprot
- * bits, just like normal ioremap():
- */
- flags = pgprot_decrypted(flags);
+ if (cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT)) {
+ /*
+ * Ensure fixmaps for IOAPIC MMIO respect memory encryption pgprot
+ * bits, just like normal ioremap():
+ */
+ if (x86_platform.hyper.is_private_mmio(phys))
+ flags = pgprot_encrypted(flags);
+ else
+ flags = pgprot_decrypted(flags);
+ }

__set_fixmap(idx, phys, flags);
}
diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index 6453fbaedb08..0baec766b921 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -116,6 +116,9 @@ static void __ioremap_check_other(resource_size_t addr, struct ioremap_desc *des
if (!cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT))
return;

+ if (x86_platform.hyper.is_private_mmio(addr))
+ desc->flags |= IORES_MAP_ENCRYPTED;
+
if (!IS_ENABLED(CONFIG_EFI))
return;