Re: [PATCHv2 3/3] x86/tdx: Handle load_unaligned_zeropad() page-cross to a shared page

From: Sean Christopherson
Date: Fri May 20 2022 - 15:01:08 EST


On Fri, May 20, 2022, Kirill A. Shutemov wrote:
> On Fri, May 20, 2022 at 05:47:30PM +0000, Sean Christopherson wrote:
> > On Fri, May 20, 2022, Kirill A. Shutemov wrote:
> > > @@ -299,6 +301,24 @@ static int handle_mmio(struct pt_regs *regs, struct ve_info *ve)
> > > if (WARN_ON_ONCE(user_mode(regs)))
> > > return -EFAULT;
> > >
> > > + /*
> > > + * load_unaligned_zeropad() relies on exception fixups in case of the
> > > + * word being a page-crosser and the second page is not accessible.
> > > + *
> > > + * In TDX guests, the second page can be a shared page and VMM may
> > > + * configure it to trigger #VE.
> > > + *
> > > + * Kernel assumes that #VE on a shared page is MMIO access and tries to
> > > + * decode instruction to handle it. In case of load_unaligned_zeropad()
> > > + * it may result in confusion as it is not MMIO access.
> >
> > The guest kernel can't know that it's not "MMIO", e.g. nothing prevents the host
> > from manually serving accesses to some chunk of shared memory instead of backing
> > the shared chunk with host DRAM.
>
> It would require the guest to access shared memory only with instructions
> that we can deal with. I don't think we have such guarantee.

Ya, it's purely theoretical behavior. But panicking if the kernel can't decode
the instruction is really all the guest can do.

> > > + * Check fixup table before trying to handle MMIO.
> >
> > This ordering is wrong, fixup should be done if and only if the instruction truly
> > "faults". E.g. if there's an MMIO access lurking in the kernel that is wrapped in
> > exception fixup, then this will break that usage and provide garbage data on a read
> > and drop any write.
>
> When I tried to trigger the bug, the #VE actually succeeded, because
> load_unaligned_zeropad() uses an instruction we can decode. But due to
> misalignment, the part that came from the non-shared page got overwritten
> with data that came from the VMM.

That's a bug in the emulation then. I.e. it needs to deal with page splits.

> I guess we can try to detect misaligned accesses and handle them
> correctly. But it gets complicated and easier to screw up.

At a minimum, it should reject EPT violation #VEs that split pages (on either side).
That's needed irrespective of fixup, e.g. if there's a bug in the kernel that
results in splitting an MMIO region, then panicking is better than data corruption.

Then the post-failure fixup will work, i.e. the load_unaligned_zeropad() will work
like you intend here, without risking spurious fixup.
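
E.g. a rough userspace model of the ordering described above, not actual kernel
code. All names (sketch_handle_ve(), struct ve_access, etc.) are made up for
illustration, and 4KiB pages are assumed. MMIO emulation is tried first and
refuses anything that splits a page; exception fixup is consulted only after
emulation has failed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL	/* assumption: 4KiB pages */

enum ve_result { VE_EMULATED, VE_FIXED_UP, VE_PANIC };

/* Hypothetical summary of the access that triggered the #VE. */
struct ve_access {
	uint64_t vaddr;		/* virtual address of the first byte accessed */
	size_t size;		/* access width in bytes */
	bool decodable;		/* instruction decoder understood the insn */
	bool has_fixup;		/* faulting IP has an exception table entry */
};

/* True if [vaddr, vaddr + size) touches more than one page. */
static bool access_splits_page(uint64_t vaddr, size_t size)
{
	if (size == 0)
		return false;

	return vaddr / SKETCH_PAGE_SIZE !=
	       (vaddr + size - 1) / SKETCH_PAGE_SIZE;
}

/* Stand-in for MMIO emulation: 0 on success, -1 if it must not emulate. */
static int sketch_handle_mmio(const struct ve_access *a)
{
	if (!a->decodable)
		return -1;

	/*
	 * Reject page splits no matter which side the #VE was reported on;
	 * emulating them would mix VMM data with private memory.
	 */
	if (access_splits_page(a->vaddr, a->size))
		return -1;

	return 0;
}

/* Fixup runs if and only if the access truly "faults". */
static enum ve_result sketch_handle_ve(const struct ve_access *a)
{
	if (sketch_handle_mmio(a) == 0)
		return VE_EMULATED;

	if (a->has_fixup)
		return VE_FIXED_UP;

	return VE_PANIC;
}
```

With that ordering, an 8-byte load at offset 0xffc of a page takes the fixup
path if it has an extable entry and panics otherwise, while an aligned MMIO
access that happens to be wrapped in fixup is still emulated.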

> Do we ever use exception fixups for MMIO accesses to justify the
> complication?

It's essentially impossible to prove because identifying all the MMIO accesses in
the kernel (and drivers!) is extremely difficult, e.g. see the I/O APIC code which
uses a struct to overlay MMIO.
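
To illustrate the overlay pattern, a minimal sketch with a hypothetical struct
loosely modeled on an I/O APIC style index/data register window. The names and
the helper are made up and this is not the kernel's definition. The point is
only that the register access looks like an ordinary struct member access, so
it is hard to find by searching for MMIO accessors.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register layout overlaid on an MMIO window. */
struct sketch_ioapic_regs {
	uint32_t index;		/* offset 0x00: register select */
	uint32_t unused[3];
	uint32_t data;		/* offset 0x10: data window */
	uint32_t unused2[11];
	uint32_t eoi;		/* offset 0x40 */
};

/*
 * Select a register, then read it through the data window. In a real
 * kernel the pointer would come from an MMIO mapping; here it can point
 * at ordinary memory.
 */
static uint32_t sketch_ioapic_read(volatile struct sketch_ioapic_regs *regs,
				   uint32_t reg)
{
	regs->index = reg;
	return regs->data;
}
```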