Re: [PATCH] time,virt: resync steal time when guest & host lose sync

From: Rik van Riel
Date: Fri Aug 12 2016 - 11:58:12 EST


On Fri, 12 Aug 2016 15:09:00 +0800
Wanpeng Li <kernellwp@xxxxxxxxx> wrote:
> 2016-08-12 10:44 GMT+08:00 Rik van Riel <riel@xxxxxxxxxx>:

> > If you pass ULONG_MAX as the maxtime argument to
> > steal_account_process_time(), does the steal time
> > get accounted properly at 75%?
>
> Yes.

I talked with Paolo this morning, and it turns out that if a guest
misses several timer ticks in a row, they will simply get lost.

That means the functions calling steal_account_process_time may not
know how much CPU time has passed since the last time it was called,
but steal_account_process_time will get a good idea on how much time
the host spent running something else.

Removing the limit, and documenting why, seems like the right way to
fix this bug.

Wanpeng, does the patch below work for you?

Everybody else, does this patch look acceptable?

---8<---
Subject: time,virt: do not limit steal_account_process_time

When a guest is interrupted for a longer amount of time, missed clock
ticks are not redelivered later. Because of that, we should not limit
the amount of steal time accounted to the amount of time that the
calling functions think have passed.

Instead, simply let steal_account_process_time account however much
steal time the host told us elapsed. This can make up timer ticks
that were missed when the host scheduled somebody else.

Signed-off-by: Rik van Riel <riel@xxxxxxxxxx>
Reported-by: Wanpeng Li <kernellwp@xxxxxxxxx>
---
kernel/sched/cputime.c | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index 1934f658c036..6f15274940fb 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -263,7 +263,12 @@ void account_idle_time(cputime_t cputime)
cpustat[CPUTIME_IDLE] += (__force u64) cputime;
}

-static __always_inline cputime_t steal_account_process_time(cputime_t maxtime)
+/*
+ * When a guest is interrupted for a longer amount of time, missed clock
+ * ticks are not redelivered later. Due to that, this function may on
+ * occasion account more time than the calling functions think elapsed.
+ */
+static __always_inline cputime_t steal_account_process_time(void)
{
#ifdef CONFIG_PARAVIRT
if (static_key_false(&paravirt_steal_enabled)) {
@@ -273,7 +278,7 @@ static __always_inline cputime_t steal_account_process_time(cputime_t maxtime)
steal = paravirt_steal_clock(smp_processor_id());
steal -= this_rq()->prev_steal_time;

- steal_cputime = min(nsecs_to_cputime(steal), maxtime);
+ steal_cputime = nsecs_to_cputime(steal);
account_steal_time(steal_cputime);
this_rq()->prev_steal_time += cputime_to_nsecs(steal_cputime);

@@ -290,7 +295,7 @@ static inline cputime_t account_other_time(cputime_t max)
{
cputime_t accounted;

- accounted = steal_account_process_time(max);
+ accounted = steal_account_process_time();

if (accounted < max)
accounted += irqtime_account_hi_update(max - accounted);
@@ -486,7 +491,7 @@ void account_process_tick(struct task_struct *p, int user_tick)
}

cputime = cputime_one_jiffy;
- steal = steal_account_process_time(cputime);
+ steal = steal_account_process_time();

if (steal >= cputime)
return;