Re: TDX #VE in SYSCALL gap (was: [RFD] x86: Curing the exception and syscall trainwreck in hardware)

From: Andy Lutomirski
Date: Sun Aug 30 2020 - 11:37:58 EST


On Wed, Aug 26, 2020 at 12:16 PM Sean Christopherson
<sean.j.christopherson@xxxxxxxxx> wrote:
>
> On Tue, Aug 25, 2020 at 10:28:53AM -0700, Andy Lutomirski wrote:
> > On Tue, Aug 25, 2020 at 10:19 AM Sean Christopherson
> > <sean.j.christopherson@xxxxxxxxx> wrote:
> > > One thought would be to have the TDX module (thing that runs in SEAM and
> > > sits between the VMM and the guest) provide a TDCALL (hypercall from guest
> > > to TDX module) to the guest that would allow the guest to specify a very
> > > limited number of GPAs that must never generate a #VE, e.g. go straight to
> > > guest shutdown if a disallowed GPA would go pending. That seems doable
> > > from a TDX perspective without incurring noticeable overhead (assuming the
> > > list of GPAs is very small) and should be easy to support in the guest,
> > > e.g. make a TDCALL/hypercall or two during boot to protect the SYSCALL
> > > page and its scratch data.
> >
> > I guess you could do that, but this is getting gross. The x86
> > architecture has really gone off the rails here.
>
> Does it suck less than using an IST? Honest question.
>
> I will add my voice to the "fix SYSCALL" train, but the odds of that getting
> a proper fix in time to intercept TDX are not good. On the other hand,
> "fixing" the SYSCALL issue in the TDX module is much more feasible, but only
> if we see real value in such an approach as opposed to just using an IST. I
> personally like the idea of a TDX module solution as I think it would be
> simpler for the kernel to implement/support, and would mean we wouldn't need
> to roll back IST usage for #VE if the heavens should part and bestow upon us
> a sane SYSCALL.

There's no such thing as "just" using an IST. Using IST opens a huge
can of worms due to its recursion issues.
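
As a rough illustration of the recursion problem, here is a toy
user-space model. It is not kernel code, and every name in it is made
up for the example. The one hardware fact it relies on is that an IST
entry always switches to the same fixed stack top, so if the same vector
fires again before the first handler has moved off the IST stack, the
second frame lands on top of the first. A non-IST entry from kernel mode
just pushes below the current stack pointer.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_WORDS 64

/* Toy exception stack (hypothetical); grows downward from STACK_WORDS. */
struct toy_stack {
    uint64_t slot[STACK_WORDS];
    size_t sp; /* index of the last pushed slot; STACK_WORDS when empty */
};

static void toy_stack_init(struct toy_stack *s)
{
    s->sp = STACK_WORDS;
}

/* IST-style delivery: the stack pointer is reset to the fixed top
 * unconditionally, no matter what is already on the stack. */
static size_t deliver_ist(struct toy_stack *s, uint64_t return_rip)
{
    s->sp = STACK_WORDS;
    s->slot[--s->sp] = return_rip;
    return s->sp;
}

/* Non-IST delivery while already in the kernel: the new frame is
 * pushed below whatever is currently on the stack. */
static size_t deliver_nested(struct toy_stack *s, uint64_t return_rip)
{
    s->slot[--s->sp] = return_rip;
    return s->sp;
}

/* Deliver two exceptions back to back, the second arriving while the
 * first handler is still on the stack, and report whether the first
 * handler's saved return address is still intact afterwards. */
static bool first_frame_survives(bool use_ist)
{
    const uint64_t outer_rip = 0x1111, inner_rip = 0x2222;
    struct toy_stack s;
    size_t outer;

    toy_stack_init(&s);
    outer = use_ist ? deliver_ist(&s, outer_rip)
                    : deliver_nested(&s, outer_rip);
    if (use_ist)
        deliver_ist(&s, inner_rip);
    else
        deliver_nested(&s, inner_rip);

    return s.slot[outer] == outer_rip;
}
```

In this model first_frame_survives(true) returns false, because the
second IST entry overwrites the first frame, while
first_frame_survives(false) returns true.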

The TDX module solution is utterly gross but may well suck less than
using an IST.