Re: How should we handle illegal task FPU state?

From: Andy Lutomirski
Date: Thu Oct 01 2020 - 18:05:03 EST


On Thu, Oct 1, 2020 at 2:50 PM Dave Hansen <dave.hansen@xxxxxxxxx> wrote:
>
> On 10/1/20 1:58 PM, Sean Christopherson wrote:
> > One thought for a lowish effort approach to pave the way for CET would be to
> > try XRSTORS multiple times in switch_fpu_return(). If the first try fails,
> > then WARN, init non-supervisor state and try a second time, and if _that_ fails
> > then kill the task. I.e. do the minimum effort to play nice with bad FPU
> > state, but don't let anything "accidentally" turn off CET.
>
> I'm not sure we should ever keep running userspace after an XRSTOR*
> failure. For MPX, this might have provided a nice, additional vector
> for an attacker to turn off MPX. Same for pkeys if we didn't correctly
> differentiate between the hardware init state versus the "software init"
> state that we keep in init_task.
>
> What's the advantage of letting userspace keep running after we init its
> state? That it _might_ be able to recover?

I suppose we can kill userspace and change that behavior only if
someone complains. I still think it would be polite to try to dump
core, but that could be tricky with the current code structure. I'll
try to whip up a patch. Maybe I'll add a debugfs file to trash MXCSR
for testing.
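
For reference, the two policies discussed above (retry after
re-initializing user state, versus killing the task on the first failure)
can be sketched as a small userspace model. This is only an illustration,
not kernel code: struct fake_fpstate, fake_xrstors(), restore_with_retry()
and restore_or_kill() are hypothetical names invented for this sketch, and
the "restore" is simulated by checking MXCSR against the architectural
default MXCSR_MASK (0xFFBF) rather than executing XRSTORS. The
supervisor_bad flag is likewise an assumption, standing in for any
corruption that re-initializing user state cannot repair.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Architectural defaults: reserved MXCSR bits lie outside this mask. */
#define MXCSR_DEFAULT_MASK 0x0000FFBFu
#define MXCSR_INIT         0x00001F80u

enum restore_result { RESTORE_OK, RESTORE_OK_AFTER_REINIT, RESTORE_KILL };

/* Hypothetical stand-in for a task's saved FPU state (not the kernel's). */
struct fake_fpstate {
    uint32_t mxcsr;          /* user (non-supervisor) state */
    bool     supervisor_bad; /* models unrecoverable supervisor-state damage */
    bool     cet_enabled;    /* models supervisor state such as CET */
};

/* Simulated XRSTORS: fails where real hardware would raise #GP. */
static bool fake_xrstors(const struct fake_fpstate *st)
{
    assert(st != NULL);
    if (st->supervisor_bad)
        return false;
    return (st->mxcsr & ~MXCSR_DEFAULT_MASK) == 0;
}

/* Policy 1: warn, init only the user state, retry once, then kill. */
static enum restore_result restore_with_retry(struct fake_fpstate *st)
{
    if (fake_xrstors(st))
        return RESTORE_OK;

    fprintf(stderr, "WARN: illegal FPU state, reinitializing user state\n");
    st->mxcsr = MXCSR_INIT; /* cet_enabled is deliberately left untouched */

    if (fake_xrstors(st))
        return RESTORE_OK_AFTER_REINIT;
    return RESTORE_KILL;
}

/* Policy 2: never keep running userspace after a failed restore. */
static enum restore_result restore_or_kill(const struct fake_fpstate *st)
{
    return fake_xrstors(st) ? RESTORE_OK : RESTORE_KILL;
}
```

In this model, a task whose MXCSR has reserved bits set survives under the
first policy with CET still enabled, and is killed under the second. A
task with damaged supervisor state is killed under both.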