Re: [PATCH v6] x86: load FPU registers on return to userland

From: Dave Hansen
Date: Tue Jan 15 2019 - 14:46:20 EST


On 1/15/19 4:44 AM, David Laight wrote:
> Once this is done it might be worth while adding a parameter to
> kernel_fpu_begin() to request the registers only when they don't
> need saving.
> This would benefit code paths where the gains are reasonable but not massive.
>
> The return value from kernel_fpu_begin() ought to indicate which
> registers are available - none, SSE, SSE2, AVX, AVX512 etc.
> So code can use an appropriate implementation.
> (I've not looked to see if this is already the case!)

Yeah, it would be sane to have a mask both passed in and returned, say:

	got = kernel_fpu_begin(XFEATURE_MASK_AVX512, NO_XSAVE_ALLOWED);

	if (got == XFEATURE_MASK_AVX512)
		do_avx_512_goo();
	else
		do_integer_goo();

	kernel_fpu_end(got);

Then, kernel_fpu_begin() can actually work without even *doing* an XSAVE:

	/* Do we have to save state for anything in 'ask_mask'? */
	if (all_states_are_init(ask_mask))
		return ask_mask;
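
To make the ask/return contract concrete, here is a rough userspace model of that fast path. It is only a sketch: model_inuse, model_fpu_begin() and the MODEL_MASK_* constants are hypothetical names made up for illustration, and the behaviour when a save would be needed (return 0 if the caller passed the no-XSAVE flag) is my assumption about what NO_XSAVE_ALLOWED would mean, not something any existing kernel API does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Model-only feature bits (hypothetical names, not the kernel's macros). */
#define MODEL_MASK_SSE    (1ULL << 1)
#define MODEL_MASK_AVX512 (7ULL << 5)

/* Stand-in for XINUSE: a set bit means that component holds live, non-init state. */
uint64_t model_inuse;

/* True if nothing in ask_mask would need saving. */
bool all_states_are_init(uint64_t ask_mask)
{
	return (model_inuse & ask_mask) == 0;
}

/*
 * Returns the mask of features the caller may use.
 * Assumption: with no_xsave_allowed set, a request that would need a
 * save is refused (returns 0) so the caller can fall back.
 */
uint64_t model_fpu_begin(uint64_t ask_mask, bool no_xsave_allowed)
{
	assert(ask_mask != 0);	/* asking for nothing is treated as a caller bug here */

	if (all_states_are_init(ask_mask))
		return ask_mask;	/* fast path: no save needed */

	if (no_xsave_allowed)
		return 0;

	/* A real implementation would save the live state here; not modeled. */
	return ask_mask;
}
```

With model_inuse == 0 a request for MODEL_MASK_AVX512 is granted in full; once those bits are marked in use, the same request with the no-XSAVE flag comes back as 0 and the caller takes the integer path.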

Then kernel_fpu_end() just needs to zero out (re-init) the state, which
it can do with XRSTORS and a careful combination of XSTATE_BV and the
requested feature bitmap (RFBM).
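
For reference, the per-component rule that makes this work can be modeled in a few lines. This is a toy sketch of the documented XRSTOR/XRSTORS behaviour, not kernel code: a component outside the RFBM is left alone, a component inside the RFBM with its XSTATE_BV bit set is loaded from the save area, and one with its XSTATE_BV bit clear is put back into its init configuration. The enum and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* What a restore does to one state component in this model. */
enum model_action {
	MODEL_UNTOUCHED,	/* component not in RFBM */
	MODEL_LOADED,		/* in RFBM, XSTATE_BV bit set: load from memory */
	MODEL_INITIALIZED	/* in RFBM, XSTATE_BV bit clear: reset to init state */
};

/* Decide the action for a single component number (0..63). */
enum model_action model_xrstor_action(uint64_t rfbm, uint64_t xstate_bv,
				      unsigned int component)
{
	uint64_t bit;

	assert(component < 64);
	bit = 1ULL << component;

	if (!(rfbm & bit))
		return MODEL_UNTOUCHED;

	return (xstate_bv & bit) ? MODEL_LOADED : MODEL_INITIALIZED;
}
```

So a kernel_fpu_end(got) along these lines would presumably issue the restore with RFBM covering 'got' and the corresponding XSTATE_BV bits clear, which re-inits exactly the components the kernel borrowed and touches nothing else.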

This is all just optimization, though.