Re: [PATCH] time,virt: resync steal time when guest & host lose sync

From: Wanpeng Li
Date: Thu Aug 18 2016 - 04:38:52 EST


2016-08-13 16:42 GMT+08:00 Ingo Molnar <mingo@xxxxxxxxxx>:
>
> * Rik van Riel <riel@xxxxxxxxxx> wrote:
>
>> On Wed, 10 Aug 2016 07:39:08 +0800
>> Wanpeng Li <kernellwp@xxxxxxxxx> wrote:
>>
>> > The regression is caused by your commit "sched,time: Count actually
>> > elapsed irq & softirq time".
>>
>> Wanpeng, does this patch fix your issue?
>>
>> Paolo, what is your opinion on this issue?
>>
>> I can think of all kinds of ways in which guest and host might lose
>> sync with steal time, from uninitialized values at boot, to guest
>> pause, followed by save to disk, and reload, to live migration, to...
>>
>> ---8<---
>>
>> Subject: time,virt: resync steal time when guest & host lose sync
>>
>> When guest and host wildly disagree on steal time, a guest can
>> do several things:
>> 1) Quickly account all the steal time at once (the kernel did this before
>> 57430218317e ("sched/cputime: Count actually elapsed irq & softirq time"),
>> when steal_account_process_ticks got ULONG_MAX as its maximum value).
>> 2) Stay out of sync for an indeterminate amount of time. This is what the
>> system does today.
>> 3) Sync up the guest value to the host-provided value, without accounting
>> an absurdly large value in the cpu time statistics.
>>
>> This patch makes the kernel do (3), which seems like the right thing
>> to do.
>>
>> The exact value of the threshold used probably does not matter too much,
>> as long as it is long enough to cover all the timer ticks that passed
>> during an idle period, because (irqtime_)account_idle_ticks can process
>> a large amount of time all at once.
>>
>> Signed-off-by: Rik van Riel <riel@xxxxxxxxxx>
>> Reported-by: Wanpeng Li <kernellwp@xxxxxxxxx>
>> ---
>> kernel/sched/cputime.c | 12 +++++++++++-
>> 1 file changed, 11 insertions(+), 1 deletion(-)
>
> fails to build on x86 allnoconfig:
>
> kernel/sched/cputime.c:524:10: error: too many arguments to function ‘steal_account_process_time’

Please try this one. https://lkml.org/lkml/2016/8/16/931

Regards,
Wanpeng Li
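
For readers following the thread, option (3) from the quoted commit message can be illustrated with a small userspace sketch. This is not the code from the patch: the struct, the function name account_steal(), and the threshold value are hypothetical and chosen only for illustration. The idea it shows is that a steal-time delta above some threshold is treated as lost sync, so the guest adopts the host value without charging it to the cpu time statistics, while smaller deltas are accounted up to a caller-supplied maximum.

```c
#include <stdint.h>

/* Hypothetical threshold: a steal delta above this is treated as lost sync.
 * The value is an assumption for illustration, not taken from the patch. */
#define STEAL_RESYNC_THRESHOLD_NS (32ULL * 1000000000ULL)

/* Hypothetical per-CPU bookkeeping for this sketch. */
struct steal_state {
	uint64_t prev_steal_ns;  /* last host steal clock value the guest synced to */
	uint64_t accounted_ns;   /* steal time charged to the cpu time statistics */
};

/* Returns the number of nanoseconds of steal time accounted by this call. */
static uint64_t account_steal(struct steal_state *s, uint64_t host_steal_ns,
			      uint64_t max_ns)
{
	uint64_t delta = host_steal_ns - s->prev_steal_ns;

	if (delta > STEAL_RESYNC_THRESHOLD_NS) {
		/* Option (3): resync to the host value, account nothing. */
		s->prev_steal_ns = host_steal_ns;
		return 0;
	}

	/* Normal case: account at most max_ns and carry the rest forward. */
	if (delta > max_ns)
		delta = max_ns;
	s->prev_steal_ns += delta;
	s->accounted_ns += delta;
	return delta;
}
```

With this shape, a jump of 40 seconds in the host value is absorbed silently, while a 30000 ns delta with max_ns of 10000 accounts 10000 ns now and leaves the remainder for later calls.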