RE: Lazy FPU restoration / moving kernel_fpu_end() to context switch

From: David Laight
Date: Tue Jun 19 2018 - 07:42:51 EST


From: Andy Lutomirski
> Sent: 15 June 2018 19:54
> On Fri, Jun 15, 2018 at 11:50 AM Dave Hansen
> <dave.hansen@xxxxxxxxxxxxxxx> wrote:
> >
> > On 06/15/2018 11:31 AM, Andy Lutomirski wrote:
> > > > for (thing) {
> > > >         kernel_fpu_begin();
> > > >         encrypt(thing);
> > > >         kernel_fpu_end();
> > > > }
> >
> > Don't forget that the processor has optimizations for this, too. The
> > "modified optimization" will notice that between:
> >
> > kernel_fpu_end(); -> XRSTOR
> > and
> > kernel_fpu_begin(); -> XSAVE(S|OPT)
> >
> > the processor has not modified the states. It'll skip doing any writes
> > of the state. Doing what Andy is describing is still way better than
> > letting the processor do it, but you should just know up front that this
> > may not be as much of a win as you would expect.
>
> Even with the modified optimization, kernel_fpu_end() still needs to
> reload the state that was trashed by the kernel FPU use. If the
> kernel is using something like AVX512 state, then kernel_fpu_end()
> will transfer an enormous amount of data no matter how clever the CPU
> is. And I think I once measured XSAVEOPT taking a hundred cycles or
> so even when RFBM==0, so it's not exactly super fast.
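
To put rough numbers on the loop quoted above, here is a minimal userspace
model (not kernel code) that only counts state transfers. Everything in it
is made up for illustration: struct fpu_model, the model_*() helpers and the
'lazy' flag are hypothetical names, and the model ignores the modified
optimization, so every save is counted as a full operation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model, not the kernel API: it only counts transfers. */
struct fpu_model {
	bool user_state_live;	/* user FPU state is currently in registers */
	bool in_kernel_fpu;	/* inside a begin/end section */
	bool lazy;		/* defer the restore to return-to-user */
	unsigned int saves;
	unsigned int restores;
};

static void model_init(struct fpu_model *m, bool lazy)
{
	m->user_state_live = true;
	m->in_kernel_fpu = false;
	m->lazy = lazy;
	m->saves = 0;
	m->restores = 0;
}

/* Stand-in for kernel_fpu_begin(): save user state only if still live. */
static void model_fpu_begin(struct fpu_model *m)
{
	assert(!m->in_kernel_fpu);
	if (m->user_state_live) {
		m->saves++;
		m->user_state_live = false;
	}
	m->in_kernel_fpu = true;
}

/* Stand-in for kernel_fpu_end(): the eager variant restores immediately. */
static void model_fpu_end(struct fpu_model *m)
{
	assert(m->in_kernel_fpu);
	m->in_kernel_fpu = false;
	if (!m->lazy) {
		m->restores++;
		m->user_state_live = true;
	}
}

/* Stand-in for the exit path: restore once if the state is still stale. */
static void model_return_to_user(struct fpu_model *m)
{
	if (!m->user_state_live) {
		m->restores++;
		m->user_state_live = true;
	}
}

/* Run the quoted loop and return the total number of saves + restores. */
static unsigned int model_run_loop(bool lazy, unsigned int iterations)
{
	struct fpu_model m;
	unsigned int i;

	model_init(&m, lazy);
	for (i = 0; i < iterations; i++) {
		model_fpu_begin(&m);
		/* encrypt(thing) would run here */
		model_fpu_end(&m);
	}
	model_return_to_user(&m);
	return m.saves + m.restores;
}
```

In this model, model_run_loop(false, 8) counts one save and one restore per
iteration, while model_run_loop(true, 8) counts a single save and a single
restore for the whole loop. It says nothing about how expensive each
transfer is on real hardware.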

If the kernel was entered by a system call, do you need to save the AVX512
state at all?
IIRC the registers are all defined as 'caller saved', so userspace has no
expectation that they will be preserved across the syscall wrapper function call.
All you need to do is ensure that 'kernel' values aren't passed back to userspace.
There is a single instruction to zero all the AVX512 registers.
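
A minimal sketch of that idea, again as a userspace model rather than kernel
code. The register file is just a byte array, and struct vec_regs_model, the
model_*() helpers and the 'zero_on_exit' flag are all hypothetical names.
The memset() stands in for whatever zeroing sequence the hardware needs;
as far as I know VZEROALL only clears registers 0-15, so the upper sixteen
AVX512 registers would need separate handling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MODEL_NR_VEC_REGS	32	/* zmm0..zmm31 */
#define MODEL_VEC_BYTES		64	/* 512 bits per register */

/* Hypothetical stand-in for the vector register file. */
struct vec_regs_model {
	unsigned char reg[MODEL_NR_VEC_REGS][MODEL_VEC_BYTES];
};

/* Kernel code overwrites the registers with its own ('secret') data. */
static void model_kernel_vector_use(struct vec_regs_model *r,
				    unsigned char secret)
{
	assert(r != NULL);
	memset(r->reg, secret, sizeof(r->reg));
}

/* Instead of restoring the user values, just clear the registers. */
static void model_zero_before_return(struct vec_regs_model *r)
{
	memset(r->reg, 0, sizeof(r->reg));
}

static bool model_regs_all_zero(const struct vec_regs_model *r)
{
	const unsigned char *p = &r->reg[0][0];
	size_t i;

	for (i = 0; i < sizeof(r->reg); i++)
		if (p[i] != 0)
			return false;
	return true;
}

/*
 * Model one syscall. The user values present on entry are never saved,
 * because a caller-saved register may be clobbered by the call anyway.
 * Returns true if non-zero kernel data would be visible to userspace.
 */
static bool model_syscall_leaks(unsigned char secret, bool zero_on_exit)
{
	struct vec_regs_model regs;

	memset(&regs, 0xAA, sizeof(regs));	/* user values on entry */
	model_kernel_vector_use(&regs, secret);
	if (zero_on_exit)
		model_zero_before_return(&regs);
	return !model_regs_all_zero(&regs);
}
```

With zero_on_exit set, nothing is saved or restored and userspace only ever
sees zeros; without it, the kernel's values would be returned. The model only
covers the syscall path, where the caller-saved argument applies.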

David