Re: [PATCH RESEND] KVM: X86: Fix load bad host fpu state

From: Paolo Bonzini
Date: Tue Dec 12 2017 - 11:16:44 EST


On 12/12/2017 06:40, Wanpeng Li wrote:
> 2017-12-12 11:36 GMT+08:00 Peter Xu <peterx@xxxxxxxxxx>:
>> On Tue, Dec 12, 2017 at 05:51:26AM +0800, Wanpeng Li wrote:
>>> 2017-12-12 4:48 GMT+08:00 David Hildenbrand <david@xxxxxxxxxx>:
>>>> On 10.12.2017 22:44, Wanpeng Li wrote:
>>>>> From: Wanpeng Li <wanpeng.li@xxxxxxxxxxx>
>>>>>
>>>>> ------------[ cut here ]------------
>>>>> Bad FPU state detected at kvm_put_guest_fpu+0xd8/0x2d0 [kvm], reinitializing FPU registers.
>>>>> WARNING: CPU: 1 PID: 4594 at arch/x86/mm/extable.c:103 ex_handler_fprestore+0x88/0x90
>>>>> CPU: 1 PID: 4594 Comm: qemu-system-x86 Tainted: G B OE 4.15.0-rc2+ #10
>>>>> RIP: 0010:ex_handler_fprestore+0x88/0x90
>>>>> Call Trace:
>>>>> fixup_exception+0x4e/0x60
>>>>> do_general_protection+0xff/0x270
>>>>> general_protection+0x22/0x30
>>>>> RIP: 0010:kvm_put_guest_fpu+0xd8/0x2d0 [kvm]
>>>>> RSP: 0018:ffff8803d5627810 EFLAGS: 00010246
>>>>> kvm_vcpu_reset+0x3b4/0x3c0 [kvm]
>>>>> kvm_apic_accept_events+0x1c0/0x240 [kvm]
>>>>> kvm_arch_vcpu_ioctl_run+0x1658/0x2fb0 [kvm]
>>>>> kvm_vcpu_ioctl+0x479/0x880 [kvm]
>>>>> do_vfs_ioctl+0x142/0x9a0
>>>>> SyS_ioctl+0x74/0x80
>>>>> do_syscall_64+0x15f/0x600
>>>>>
>>>>> This can be reproduced by running any testcase in kvm-unit-tests since
>>>>> the qemu userspace FPU context is not initialized, which results in the
>>>>> init path from kvm_apic_accept_events() will load/put qemu userspace
>>>>> FPU context w/o initialized. In addition, w/o this splatting we still
>>>>> should initialize vcpu->arch.user_fpu instead of current->thread.fpu.
>>>>> This patch fixes it by initializing qemu user FPU context if it is
>>>>> uninitialized before KVM_RUN.
>>>>>
>>>>> Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx>
>>>>> Cc: Radim KrÄmÃÅ <rkrcmar@xxxxxxxxxx>
>>>>> Cc: Rik van Riel <riel@xxxxxxxxxx>
>>>>> Cc: stable@xxxxxxxxxxxxxxx
>>>>> Fixes: f775b13eedee (x86,kvm: move qemu/guest FPU switching out to vcpu_run)
>>>>> Signed-off-by: Wanpeng Li <wanpeng.li@xxxxxxxxxxx>
>>>>> ---
>>>>> arch/x86/kvm/x86.c | 7 +++++--
>>>>> 1 file changed, 5 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>>>>> index a92b22f..063a643 100644
>>>>> --- a/arch/x86/kvm/x86.c
>>>>> +++ b/arch/x86/kvm/x86.c
>>>>> @@ -7273,10 +7273,13 @@ static int complete_emulated_mmio(struct kvm_vcpu *vcpu)
>>>>>
>>>>> int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
>>>>> {
>>>>> - struct fpu *fpu = &current->thread.fpu;
>>>>> + struct fpu *fpu = &vcpu->arch.user_fpu;
>>>>> int r;
>>>>>
>>>>> - fpu__initialize(fpu);
>>>>> + if (!fpu->initialized) {
>>>>> + fpstate_init(&fpu->state);
>>>>> + fpu->initialized = 1;
>>>>> + }
>>>>
>>>> Is there a chance of keeping using fpu__initialize() ? Duplicating the
>>>> code is ugly.
>>>
>>> There is a warning in fpu__initialize() which results in just
>>> current->thread.fpu can take advantage of.
>>>
>>>>
>>>> E.g. can't we simply initialize that in kvm_load_guest_fpu?
>>>
>>> We still miss to initialize qemu user FPU context for the above calltrace.
>>
>> IMHO we should not really init the user FPU since we should always
>> load FPU then put FPU. The problem now is that for vcpus that with
>> vcpu_id>1 we'll first put the FPU before loading it. So, instead how
>> about we make sure we load the FPU first even for non-bootstrap vcpus?
>> And we can actually drop fpu__initialize() call, like:
>
> It will introduce extra overhead for all the cases which can't enter
> into vcpu_run(), I think move
> fpstate_init(&vcpu->arch.user_fpu.state); to fx_init() is better.

Those cases with a sleeping AP are so rare that they don't matter. They
will occur only a few times per boot. Peter's solution is right, I've
queued it.

Thanks to both!

Paolo

Paolo