RE: [PATCH v5 06/14] x86/ioremap: Support hypervisor specified range to map as encrypted

From: Michael Kelley (LINUX)
Date: Tue Feb 14 2023 - 02:45:58 EST


From: Sean Christopherson <seanjc@xxxxxxxxxx> Sent: Friday, February 10, 2023 3:47 PM
>
> On Fri, Feb 10, 2023, Michael Kelley (LINUX) wrote:
> > From: Sean Christopherson <seanjc@xxxxxxxxxx> Sent: Friday, February 10, 2023
> 12:58 PM
> > >
> > > On Fri, Feb 10, 2023, Sean Christopherson wrote:
> > > > On Fri, Feb 10, 2023, Dave Hansen wrote:
> > > > > On 2/10/23 11:36, Borislav Petkov wrote:
> > > > > >> One approach is to go with the individual device attributes for now.>> If the list
> > > does grow significantly, there will probably be patterns
> > > > > >> or groupings that we can't discern now. We could restructure into
> > > > > >> larger buckets at that point based on those patterns/groupings.
> > > > > > There's a reason the word "platform" is in cc_platform_has(). Initially
> > > > > > we wanted to distinguish attributes of the different platforms. So even
> > > > > > if y'all don't like CC_ATTR_PARAVISOR, that is what distinguishes this
> > > > > > platform and it *is* one platform.
> > > > > >
> > > > > > So call it CC_ATTR_SEV_VTOM as it uses that technology or whatever. But
> > > > > > call it like the platform, not to mean "I need this functionality".
> > > > >
> > > > > I can live with that. There's already a CC_ATTR_GUEST_SEV_SNP, so it
> > > > > would at least not be too much of a break from what we already have.
> > > >
> > > > I'm fine with CC_ATTR_SEV_VTOM, assuming the proposal is to have something
> like:
> > > >
> > > > static inline bool is_address_range_private(resource_size_t addr)
> > > > {
> > > > if (cc_platform_has(CC_ATTR_SEV_VTOM))
> > > > return is_address_below_vtom(addr);
> > > >
> > > > return false;
> > > > }
> > > >
> > > > i.e. not have SEV_VTOM mean "I/O APIC and vTPM are private". Though I don't
> see
> > > > the point in making it SEV vTOM specific or using a flag. Despite what any of us
> > > > think about TDX paravisors, it's completely doable within the confines of TDX to
> > > > have an emulated device reside in the private address space. E.g. why not
> > > > something like this?
> > > >
> > > > static inline bool is_address_range_private(resource_size_t addr)
> > > > {
> > > > return addr < cc_platform_private_end;
> > > > }
> > > >
> > > > where SEV fills in "cc_platform_private_end" when vTOM is enabled, and TDX does
> > > > the same. Or wrap cc_platform_private_end in a helper, etc.
> > >
> > > Gah, forgot that the intent with TDX is to enumerate devices in their legacy
> > > address spaces. So a TDX guest couldn't do this by default, but if/when Hyper-V
> > > or some other hypervisor moves I/O APIC, vTPM, etc... into the TCB, the common
> > > code would just work and only the hypervisor-specific paravirt code would need
> > > to change.
> > >
> > > Probably need a more specific name than is_address_range_private() though, e.g.
> > > is_mmio_address_range_private()?
> >
> > Maybe I'm not understanding what you are proposing, but in an SEV-SNP
> > VM using vTOM, devices like the IO-APIC and TPM live at their usual guest
> > physical addresses.
>
> Ah, so as the cover letter says, the intent really is to treat vTOM as an
> attribute bit. Sorry, I got confused by Boris's comment:
>
> : What happens if the next silly HV guest scheme comes along and they do
> : need more and different ones?
>
> Based on that comment, I assumed the proposal to use CC_ATTR_SEV_VTOM was
> intended
> to be a generic range-based thing, but it sounds like that's not the case.
>
> IMO, using CC_ATTR_SEV_VTOM to infer anything about the state of I/O APIC or vTPM
> is wrong. vTOM as a platform feature effectively says "physical address bit X
> controls private vs. shared" (ignoring weird usage of vTOM). vTOM does not mean
> I/O APIC and vTPM are private, that's very much a property of Hyper-V's current
> generation of vTOM-based VMs.
>
> Hardcoding this in the guest feels wrong. Ideally, we would have a way to enumerate
> that a device is private/trusted, e.g. through ACPI. I'm guessing we already
> missed the boat on that, so the next best thing is to do something like Michael
> originally proposed in this patch and shove the "which devices are private" logic
> into hypervisor-specific code, i.e. let Hyper-V figure out how to enumerate to its
> guests which devices are shared.
>
> I agree with Boris' comment that a one-off "other encrypted range" is a hack, but
> that's just an API problem. The kernel already has hypervisor specific hooks (and
> for SEV-ES even), why not expand that? That way figuring out which devices are
> private is wholly contained in Hyper-V code, at least until there's a generic
> solution for enumerating private devices, though that seems unlikely to happen
> and will be a happy problem to solve if it does come about.

Yes, this is definitely a cleaner way to implement what I was originally proposing.
I can make it work if there's agreement to take this approach.

Michael

>
> diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
> index a868b76cd3d4..08f65ed439d9 100644
> --- a/arch/x86/kernel/apic/io_apic.c
> +++ b/arch/x86/kernel/apic/io_apic.c
> @@ -2682,11 +2682,16 @@ static void io_apic_set_fixmap(enum fixed_addresses idx,
> phys_addr_t phys)
> {
> pgprot_t flags = FIXMAP_PAGE_NOCACHE;
>
> - /*
> - * Ensure fixmaps for IOAPIC MMIO respect memory encryption pgprot
> - * bits, just like normal ioremap():
> - */
> - flags = pgprot_decrypted(flags);
> + if (cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT)) {
> + /*
> + * Ensure fixmaps for IOAPIC MMIO respect memory encryption pgprot
> + * bits, just like normal ioremap():
> + */
> + if (x86_platform.hyper.is_private_mmio(phys))
> + flags = pgprot_encrypted(flags);
> + else
> + flags = pgprot_decrypted(flags);
> + }
>
> __set_fixmap(idx, phys, flags);
> }
> diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
> index 6453fbaedb08..0baec766b921 100644
> --- a/arch/x86/mm/ioremap.c
> +++ b/arch/x86/mm/ioremap.c
> @@ -116,6 +116,9 @@ static void __ioremap_check_other(resource_size_t addr, struct
> ioremap_desc *des
> if (!cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT))
> return;
>
> + if (x86_platform.hyper.is_private_mmio(addr))
> + desc->flags |= IORES_MAP_ENCRYPTED;
> +
> if (!IS_ENABLED(CONFIG_EFI))
> return;
>