Re: [PATCH v5 28/34] x86/fred: fixup fault on ERETU by jumping to fred_entrypoint_user

From: H. Peter Anvin
Date: Fri Mar 17 2023 - 17:25:06 EST


On March 17, 2023 6:02:52 AM PDT, Lai Jiangshan <jiangshanlai@xxxxxxxxx> wrote:
>On Fri, Mar 17, 2023 at 5:56 PM <andrew.cooper3@xxxxxxxxxx> wrote:
>>
>> On 17/03/2023 9:39 am, Lai Jiangshan wrote:
>> >> +#ifdef CONFIG_X86_FRED
>> >> +static bool ex_handler_eretu(const struct exception_table_entry *fixup,
>> >> + struct pt_regs *regs, unsigned long error_code)
>> >> +{
>> >> + struct pt_regs *uregs = (struct pt_regs *)(regs->sp - offsetof(struct pt_regs, ip));
>> >> + unsigned short ss = uregs->ss;
>> >> + unsigned short cs = uregs->cs;
>> >> +
>> >> + fred_info(uregs)->edata = fred_event_data(regs);
>> >> + uregs->ssx = regs->ssx;
>> >> + uregs->ss = ss;
>> >> + uregs->csx = regs->csx;
>> >> + uregs->current_stack_level = 0;
>> >> + uregs->cs = cs;
>> > Hello
>> >
>> > If the ERETU instruction had tried to return from NMI to ring3 and just faulted,
>> > is NMI still blocked?
>> >
>> > We know that IRET unconditionally enables NMI, but I can't find any clue in the
>> > FRED's manual.
>> >
>> > In the pseudocode of ERETU in the manual, it seems that NMI is only enabled when
>> > ERETU succeeds with bit28 in csx set. If so, this code will fail to reenable
>> > NMI if bit28 is not explicitly re-set in csx.
>>
>> IRET clearing NMI blocking is the source of an immense amount of grief,
>> and ultimately the reason why Linux and others can't use supervisor
>> shadow stacks at the moment.
>>
>> Changing this property, so NMIs only get unblocked on successful
>> execution of an ERET{S,U}, was a key demand of the FRED spec.
>>
>> i.e. until you have successfully ERET*'d, you're still logically in the
>> NMI handler and NMIs need to remain blocked even when handling the #GP
>> from a bad ERET.
>>
>
>Handling of the #GP for a bad ERETU can be rescheduled. It is not
>OK to reschedule with NMI blocked.
>
>I think "regs->nmi = 1;" (not uregs->nmi) can fix the problem.
>

You are quite correct, since what we want here is to emulate having taken the fault in user space – which meant that NMI would have been re-enabled by the never-executed return.

I think the "best" solution is:

regs->nmi = uregs->nmi;
uregs->nmi = 0;

... as enabling NMI is expected to have a performance penalty (being the less common case, an implementation which has a performance difference at all would want to optimize the non-NMI case), and I believe the compiler should be able to at least mostly fold those operations into ones it is doing anyway.