Re: [PATCH 5/7] x86/fpu: Change fpu->fpregs_active users to fpu->fpstate_active

From: Ingo Molnar
Date: Thu Jan 26 2017 - 10:16:53 EST



* Rik van Riel <riel@xxxxxxxxxx> wrote:

> On Thu, 2017-01-26 at 12:26 +0100, Ingo Molnar wrote:
> > We want to simplify the FPU state machine by eliminating
> > fpu->fpregs_active, and we can do that because the two state flags
> > (::fpregs_active and ::fpstate_active) are set essentially together.
> >
> > The old lazy FPU switching code used to make a distinction - but
> > there's no lazy switching code anymore, we always switch in an
> > 'eager' fashion.
>
> I've been working for a while now to fix that for
> KVM VCPU threads.
>
> Currently when we switch to a VCPU thread, we first
> load that thread's userspace FPU context, and then
> soon after we save that, and load the guest side FPU
> context.
>
> When a VCPU thread goes idle, we also go through
> two FPU context transitions.
>
> In order to skip the unnecessary FPU context switches
> for VCPU threads, I have been relying on separate
> fpstate_active and fpregs_active states.
>
> Do you have any ideas on how I could implement that
> kind of change without separate fpstate_active and
> fpregs_active states?

So the vCPU threads have host side FPU (user-space) state - whatever FPU state
Qemu has?

One solution to that overhead, without complicating the FPU state machine in any
way, would be to add a facility to drop/reacquire that FPU state.

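That should automatically result in zero FPU state switching AFAICS: with its
FPU state dropped, a vCPU thread looks like a kernel thread to the context
switch code, and kernel threads don't do FPU state switching either.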

The vCPU threads sometimes do return to user-space, when they get some deep
exception that needs to be handled by Qemu, right? This aspect shouldn't be a big
problem either, because the regular calling convention is to call (synchronous)
system calls without holding FPU state, right?

I.e. the vCPU /dev/kvm ioctl() could drop/re-map the FPU state with very little
overhead (i.e. no full save/restore required in that code path either), when it
enters/exits vCPU mode.
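
To make the idea concrete, here is a toy user-space model of it. It is only a
sketch, not kernel code: every name in it (toy_fpu, toy_fpu_drop(),
toy_fpu_reacquire(), ...) is made up for illustration, and what happens to the
register contents themselves on drop/reacquire is deliberately not modeled. It
only shows that a single 'holds FPU state' flag is enough for the context
switch code to skip the save/restore while a thread is in vCPU mode:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names exist in the kernel. */
struct toy_fpu {
	bool fpstate_active;		/* task holds FPU state to be switched */
	unsigned int transitions;	/* simulated saves + restores */
};

/* Eager switching: save/restore only if the task holds FPU state. */
static void toy_switch_out(struct toy_fpu *fpu)
{
	if (fpu->fpstate_active)
		fpu->transitions++;	/* simulated save */
}

static void toy_switch_in(struct toy_fpu *fpu)
{
	if (fpu->fpstate_active)
		fpu->transitions++;	/* simulated restore */
}

/* Hypothetical drop/reacquire pair, called on vCPU mode entry/exit. */
static void toy_fpu_drop(struct toy_fpu *fpu)
{
	assert(fpu->fpstate_active);
	fpu->fpstate_active = false;
}

static void toy_fpu_reacquire(struct toy_fpu *fpu)
{
	assert(!fpu->fpstate_active);
	fpu->fpstate_active = true;
}

/*
 * Count FPU transitions for a thread that gets scheduled out and back in
 * nr_switches times while in vCPU mode, with or without dropping its state.
 */
static unsigned int toy_vcpu_transitions(unsigned int nr_switches, bool drop)
{
	struct toy_fpu fpu = { .fpstate_active = true, .transitions = 0 };
	unsigned int i;

	if (drop)
		toy_fpu_drop(&fpu);

	for (i = 0; i < nr_switches; i++) {
		toy_switch_out(&fpu);
		toy_switch_in(&fpu);
	}

	if (drop)
		toy_fpu_reacquire(&fpu);

	return fpu.transitions;
}
```

In this model toy_vcpu_transitions(5, true) counts no transitions at all, while
toy_vcpu_transitions(5, false) counts ten (one save and one restore per
switch).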

Thanks,

Ingo