Re: [PATCH v9 12/26] x86/fpu/xstate: Use feature disable (XFD) to protect dynamic user state

From: Thiago Macieira
Date: Tue Aug 31 2021 - 18:39:26 EST


On Tuesday, 31 August 2021 15:15:55 PDT Len Brown wrote:
> Indeed, I believe that there is universal agreement that a synchronous
> return code from a system call is a far superior programming model than
> decoding the location of a failure in a system call. (no, the IP isn't
> random -- it is always the 1st instruction in that thread to touch a TMM
> register).

That instruction is actually likely going to be a memory load, probably an
LDTILECFG. So the developer will see a crashing instruction with a pointer and
will spend time trying to figure out why that pointer was wrong, when there
was nothing wrong with it.

That's why I suggested (and Chang implemented) raising SIGILL when #NM is
received and the arch_prctl() hasn't been called beforehand. The OOM
condition, if the extra state is dynamically allocated, was meant to stay a
SIGSEGV, but maybe it should change to SIGKILL.

On the other hand, if it's allocated at the syscall, then the kernel can
return -ENOMEM for it (which would allow for graceful degradation) or for a
later clone() syscall starting a new thread (which I don't expect to ever
gracefully degrade).
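
As a rough sketch of what that graceful degradation could look like from
userspace (this is not code from the patch series): the request constants
below are the ones from the interface that was eventually merged in Linux
5.16, which postdates this thread, and the amx_decide() policy and its enum
are hypothetical names made up for illustration. Whether ENOMEM can actually
come back from this call depends on the allocation happening at syscall time,
which is exactly the "if" above.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Values from the interface merged in Linux 5.16 (asm/prctl.h and the
 * xfeature numbering); assumed here, since this thread predates them. */
#ifndef ARCH_REQ_XCOMP_PERM
#define ARCH_REQ_XCOMP_PERM 0x1023
#endif
#define XFEATURE_XTILEDATA 18

/* Hypothetical policy outcome, for illustration only. */
enum amx_decision {
    AMX_USE,                 /* permission granted, AMX code path is safe */
    AMX_FALLBACK_RETRYABLE,  /* out of memory: use non-AMX path, may retry */
    AMX_FALLBACK_PERMANENT   /* unsupported or refused: never use AMX */
};

/* Map the syscall result to a decision. Kept separate from the syscall
 * itself so the policy can be exercised without AMX hardware. */
static enum amx_decision amx_decide(long rc, int err)
{
    if (rc == 0)
        return AMX_USE;
    if (err == ENOMEM)
        return AMX_FALLBACK_RETRYABLE;
    return AMX_FALLBACK_PERMANENT;
}

/* Ask the kernel for permission to use AMX tile data. The caller gets a
 * synchronous answer instead of a signal at the first tile instruction. */
static enum amx_decision amx_request(void)
{
#ifdef SYS_arch_prctl
    long rc = syscall(SYS_arch_prctl, ARCH_REQ_XCOMP_PERM,
                      XFEATURE_XTILEDATA);
    return amx_decide(rc, rc ? errno : 0);
#else
    /* Not an x86-64 Linux build: there is nothing to request. */
    return amx_decide(-1, ENOSYS);
#endif
}
```

A library would call amx_request() once during initialization and select its
AVX-512 or scalar code path for anything other than AMX_USE. Nothing
comparable is available for the clone() case, since the caller there is
usually a threading runtime that has no idea AMX is involved.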

> decoding the location of the failure in a *signal handler*

That's a separate problem.

We can't be sure that the portion of the userspace doing the alt-stack crash
handler is aware of the portion using AMX. There's no way to enforce this. The
prctl() is a good indication, but I have no clue how high the correlation will
be.

--
Thiago Macieira - thiago.macieira (AT) intel.com
Software Architect - Intel DPG Cloud Engineering