Re: [PATCH v2 0/4] implement vcpu preempted check

From: Wanpeng Li
Date: Wed Jul 06 2016 - 09:03:09 EST


2016-07-06 20:28 GMT+08:00 Paolo Bonzini <pbonzini@xxxxxxxxxx>:
>
>
> On 06/07/2016 14:08, Wanpeng Li wrote:
>> 2016-07-06 18:44 GMT+08:00 Paolo Bonzini <pbonzini@xxxxxxxxxx>:
>>>
>>>
>>> On 06/07/2016 08:52, Peter Zijlstra wrote:
>>>> On Tue, Jun 28, 2016 at 10:43:07AM -0400, Pan Xinhui wrote:
>>>>> changes from v1:
>>>>> a simpler definition of default vcpu_is_preempted
>>>>> skip machine type check on ppc, and add config. remove dedicated macro.
>>>>> add one patch to drop overload of rwsem_spin_on_owner and mutex_spin_on_owner.
>>>>> add more comments
>>>>> thanks to Boqun and Peter for their suggestions.
>>>>>
>>>>> This patch set aims to fix lock holder preemption issues.
>>>>>
>>>>> test-case:
>>>>> perf record -a perf bench sched messaging -g 400 -p && perf report
>>>>>
>>>>> 18.09% sched-messaging [kernel.vmlinux] [k] osq_lock
>>>>> 12.28% sched-messaging [kernel.vmlinux] [k] rwsem_spin_on_owner
>>>>> 5.27% sched-messaging [kernel.vmlinux] [k] mutex_unlock
>>>>> 3.89% sched-messaging [kernel.vmlinux] [k] wait_consider_task
>>>>> 3.64% sched-messaging [kernel.vmlinux] [k] _raw_write_lock_irq
>>>>> 3.41% sched-messaging [kernel.vmlinux] [k] mutex_spin_on_owner.is
>>>>> 2.49% sched-messaging [kernel.vmlinux] [k] system_call
>>>>>
>>>>> We introduce the interface bool vcpu_is_preempted(int cpu) and use it in some spin
>>>>> loops of osq_lock, rwsem_spin_on_owner and mutex_spin_on_owner.
>>>>> These spin_on_owner variants also cause RCU stalls before we apply this patch set.
>>>>
>>>> Paolo, could you help out with an (x86) KVM interface for this?
>>>
>>> If it's just for spin loops, you can check if the version field in the
>>> steal time structure has changed.
>>
>> Steal time is not updated until just before the next vmentry, except
>> on a wrmsr to MSR_KVM_STEAL_TIME. So it can't show that the VCPU is
>> preempted right now, right?
>
> Hmm, you're right. We can use bit 0 of struct kvm_steal_time's flags to
> indicate that pad[0] is a "VCPU preempted" field; if pad[0] is 1, the
> VCPU has been scheduled out since the last time the guest reset the bit.
> The guest can use an xchg to test-and-clear it. The bit can be
> accessed at any time, independent of the version field.

I will try to implement it tomorrow, thanks for your proposal. :)
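
For illustration only, here is a minimal userspace sketch of the
test-and-clear idea proposed above. It is not the KVM patch. The struct
name, the flag macro, and both helper names are hypothetical. The field
layout (steal, version, flags, pad[12]) is assumed to mirror struct
kvm_steal_time, and C11 atomic_exchange stands in for the guest's xchg.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name for bit 0 of flags: "pad[0] is a preempted field". */
#define STEAL_FLAG_PREEMPTED_VALID (1u << 0)

/* Userspace model only; layout assumed to mirror struct kvm_steal_time. */
struct steal_time_model {
	uint64_t steal;
	uint32_t version;
	uint32_t flags;
	_Atomic uint32_t pad[12];
};

static_assert(sizeof(struct steal_time_model) == 64,
	      "model should keep the 64-byte layout");

/* Host side (hypothetical helper): record that the VCPU was scheduled out. */
void host_mark_preempted(struct steal_time_model *st)
{
	atomic_store(&st->pad[0], 1);
}

/*
 * Guest side (hypothetical helper): atomically read and reset the field.
 * Returns true if the VCPU was scheduled out since the last reset.
 * The version field is not consulted at all.
 */
bool guest_test_and_clear_preempted(struct steal_time_model *st)
{
	/* Without the flag bit, pad[0] carries no meaning. */
	if (!(st->flags & STEAL_FLAG_PREEMPTED_VALID))
		return false;
	return atomic_exchange(&st->pad[0], 0) == 1;
}
```

In this model a second call to guest_test_and_clear_preempted() returns
false until the host marks the VCPU as preempted again, which is the
"since the last time the guest reset the bit" behaviour described above.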

Regards,
Wanpeng Li