RE: [PATCHv3 0/3] x86/tdx: Fix one more load_unaligned_zeropad() issue

From: Michael Kelley (LINUX)
Date: Thu Jul 13 2023 - 10:43:51 EST


From: Kirill A. Shutemov <kirill@xxxxxxxxxxxxx> Sent: Saturday, July 8, 2023 11:09 PM
>
> On Sat, Jul 08, 2023 at 11:53:08PM +0000, Michael Kelley (LINUX) wrote:
> > From: Kirill A. Shutemov <kirill@xxxxxxxxxxxxx> Sent: Friday, July 7, 2023 7:07 AM
> > >
> > > On Thu, Jul 06, 2023 at 04:48:32PM +0000, Michael Kelley (LINUX) wrote:
> > > > From: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx> Sent: Tuesday, June 6, 2023 2:56 AM
> >
> > [snip]
> >
> > >
> > > It only addresses the problem that happens on transition, but
> > > load_unaligned_zeropad() is still a problem for the shared mappings in
> > > general, after transition is complete. Like if load_unaligned_zeropad()
> > > steps from private to shared mapping and shared mapping triggers #VE,
> > > kernel should be able to handle it.
> >
> > I'm showing my ignorance of TDX architectural details, but what's the
> > situation where shared mappings in general can trigger a #VE? How
> > do such situations get handled for references that aren't from
> > load_unaligned_zeropad()?
> >
>
> Shared mappings are under host/VMM control. It can just not map the page
> in shared-ept and trigger ept-violation #VE.

I know you are out on vacation, but let me follow up now for further
discussion when you are back.

Isn't the scenario you are describing a malfunctioning or malicious
host/VMM? Would what you are describing be done as part of normal
operation? Kernel code must have switched the page from private to
shared for some purpose. As soon as that code (which presumably
does not have any entry in the exception table) touches the page, it
would take the #VE and then enter the die path because there's no fixup.
So is there value in having load_unaligned_zeropad() handle the #VE and
succeed where a normal reference would fail?
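
To make the distinction concrete, here is a rough userspace sketch of the
zeropad semantics as I understand them on x86. It is not kernel code:
zeropad_load_model() is a made-up name, the "page" is just a byte array,
and any byte at or beyond page_len is assumed to fault. When the 8-byte
load would cross into the faulting page, the model does what the fixup
does: reload the aligned word that contains the address and shift it so
the missing high bytes read as zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical userspace model of load_unaligned_zeropad() semantics.
 * Bytes page[0..page_len-1] are accessible; anything past page_len is
 * assumed to live on the next page and to fault when touched.
 */
static uint64_t zeropad_load_model(const uint8_t *page, size_t page_len,
				   size_t off)
{
	uint64_t v = 0;
	size_t base, i;

	assert(page_len % 8 == 0);	/* page size is a multiple of the word size */
	assert(off < page_len);		/* the first byte is always accessible */

	if (off + 8 <= page_len) {
		/* Normal path: the whole word is on the accessible page. */
		base = off;
		for (i = 0; i < 8; i++)
			v |= (uint64_t)page[base + i] << (8 * i);
		return v;
	}

	/*
	 * Fixup path: the load crossed into the faulting page. Read the
	 * aligned word containing 'off' (entirely on the accessible page)
	 * and shift right, so the bytes that would have come from the
	 * next page are zero. Byte order is little-endian, as on x86.
	 */
	base = off & ~(size_t)7;
	for (i = 0; i < 8; i++)
		v |= (uint64_t)page[base + i] << (8 * i);
	return v >> (8 * (off & 7));
}
```

A normal reference has no equivalent of the second path, which is the
point of my question: only the load_unaligned_zeropad() caller gets a
well-defined result when the tail of the word is not accessible.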

I'd still like to see the private <-> shared transition code mark the pages
as invalid during the transition, and avoid the possibility of #VE and
similar cases with SEV-SNP. Such an approach reduces (eliminates?)
entanglement between CoCo-specific exceptions and
load_unaligned_zeropad(). It also greatly simplifies TD Partition cases
and SEV-SNP cases where a paravisor is used.

But maybe I'm still missing a case where code must handle the #VE
for load_unaligned_zeropad() outside of private <-> shared transitions.

Michael

>
> > > Any comments?
> >
> > This looks good to me. I applied the diff to a TDX VM running on
> > Hyper-V. When a load_unaligned_zeropad() occurs on a page that is
> > transitioning between private and shared, the zeropad fixup is now
> > done correctly via the #VE handler. (This is *without* my RFC patch to
> > mark the pages invalid during a transition.)
>
> Great.
>
> I am on vacation for the next two weeks. I will prepare a proper patch
> when I am back. Feel free to make a patch yourself if you feel it is urgent.
>
> --
> Kiryl Shutsemau / Kirill A. Shutemov