Re: [PATCH] time,virt: resync steal time when guest & host lose sync

From: Wanpeng Li
Date: Mon Aug 15 2016 - 21:31:47 EST


2016-08-15 23:00 GMT+08:00 Rik van Riel <riel@xxxxxxxxxx>:
> On Mon, 2016-08-15 at 16:53 +0800, Wanpeng Li wrote:
>> 2016-08-12 23:58 GMT+08:00 Rik van Riel <riel@xxxxxxxxxx>:
>> [...]
>> > Wanpeng, does the patch below work for you?
>>
>> It will break steal time for full dynticks guest, and there is a
>> calltrace of thread_group_cputime_adjusted call stack, RIP is
>> cputime_adjust+0xff/0x130.
>
> How? This patch is equivalent to passing ULONG_MAX to
> steal_account_process_time, which you tried to no ill
> effect before.

In https://lkml.org/lkml/2016/6/8/404/ Paolo originally suggested
adding the max cputime limit to vtime. When a CPU runs in nohz full
mode with the tick stopped, the guest updates jiffies from the
clocksource rather than from the clock event device (see the
tick_nohz_update_jiffies() call site and ktime_get()), so jiffies are
not affected by lost clock ticks. My patch keeps the limit for vtime
and removes it for non-vtime. Your patch, however, removes the limit
in both cases, which results in the call trace below for vtime.
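
To make "the limit" concrete, here is a minimal userspace sketch of
capped steal accounting. It is only a model under stated assumptions,
not the kernel code: struct steal_state and account_steal_capped() are
hypothetical names, and plain uint64_t stands in for cputime_t. Passing
UINT64_MAX as maxtime corresponds to having no limit at all.

```c
#include <stdint.h>

/* Hypothetical model of per-CPU steal bookkeeping (not a kernel struct). */
struct steal_state {
    uint64_t prev_steal;  /* steal time already consumed from the steal clock */
    uint64_t accounted;   /* total steal time charged so far */
};

/*
 * Charge the steal time that accumulated since the last call, but never
 * more than maxtime in one go. Whatever exceeds the cap stays pending and
 * is picked up by a later call, because prev_steal only advances by the
 * amount actually charged.
 */
static uint64_t account_steal_capped(struct steal_state *st,
                                     uint64_t steal_clock,
                                     uint64_t maxtime)
{
    uint64_t delta = steal_clock - st->prev_steal;
    uint64_t charged = delta < maxtime ? delta : maxtime;

    st->accounted += charged;
    st->prev_steal += charged;
    return charged;
}
```

With a cap of 30 and 100 units pending, one call charges 30 and leaves
70 for later; with maxtime set to UINT64_MAX the whole pending amount
is charged at once.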

>
> Do you have the full call trace?

[ 6.929856] divide error: 0000 [#1] SMP
[ 6.934217] Modules linked in:
[ 6.937759] CPU: 3 PID: 57 Comm: kworker/u8:1 Not tainted 4.7.0+ #36
[ 6.946105] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
[ 6.953951] Workqueue: events_unbound call_usermodehelper_exec_work
[ 6.965726] task: ffff8e22b9785040 task.stack: ffff8e22b8b64000
[ 6.970820] RIP: 0010:[<ffffffff870c7b4f>] [<ffffffff870c7b4f>] cputime_adjust+0xff/0x130
[ 6.981841] RSP: 0000:ffff8e22b8b67b78 EFLAGS: 00010887
[ 6.985946] RAX: a528afff5ad75000 RBX: ffff8e222e243c18 RCX: ffff8e22b8b67c28
[ 7.001166] RDX: 0000000000000000 RSI: 0000000000000296 RDI: 0000000000000000
[ 7.008758] RBP: ffff8e22b8b67ba8 R08: 00000000ffffffff R09: 00000000a528b000
[ 7.015653] R10: 0000000000000000 R11: 0000000000000000 R12: 000000000014a516
[ 7.021376] R13: ffff8e22b8b67bb8 R14: ffff8e222e243c28 R15: ffff8e22b8b67c20
[ 7.035498] FS: 0000000000000000(0000) GS:ffff8e22bac00000(0000) knlGS:0000000000000000
[ 7.054809] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 7.066571] CR2: 00000000ffffffff CR3: 000000007ae06000 CR4: 00000000001406e0
[ 7.075162] Stack:
[ 7.090141] ffff8e22b8b67c28 ffff8e222e371ac0 ffff8e22b8b67c20 ffff8e22b8b67c28
[ 7.108512] ffff8e222e371ac0 ffff8e22b8b67cc0 ffff8e22b8b67be8 ffffffff870c8c01
[ 7.123025] 00000000000e0471 fffffffffffdcf32 000000000014a516 ffff8e22b9785040
[ 7.140622] Call Trace:
[ 7.153076] [<ffffffff870c8c01>] thread_group_cputime_adjusted+0x41/0x50
[ 7.160807] [<ffffffff870913bf>] wait_consider_task+0xa4f/0xff0
[ 7.176449] [<ffffffff87090fc1>] ? wait_consider_task+0x651/0xff0
[ 7.186281] [<ffffffff87091a3f>] ? do_wait+0xdf/0x320
[ 7.226606] [<ffffffff87091a7b>] do_wait+0x11b/0x320
[ 7.239670] [<ffffffff87093014>] SyS_wait4+0x64/0xc0
[ 7.245385] [<ffffffff87090180>] ? task_stopped_code+0x50/0x50
[ 7.255924] [<ffffffff870a8470>] call_usermodehelper_exec_work+0x70/0xb0
[ 7.263011] [<ffffffff870acbd0>] process_one_work+0x1e0/0x670
[ 7.273051] [<ffffffff870acb51>] ? process_one_work+0x161/0x670
[ 7.277991] [<ffffffff870ad18b>] worker_thread+0x12b/0x4a0
[ 7.286920] [<ffffffff870ad060>] ? process_one_work+0x670/0x670
[ 7.291745] [<ffffffff870b4011>] kthread+0x101/0x120
[ 7.296878] [<ffffffff878c94cf>] ret_from_fork+0x1f/0x40
[ 7.306511] [<ffffffff870b3f10>] ? kthread_create_on_node+0x250/0x250
[ 7.311985] Code: 4d 39 c8 76 c1 4c 89 d0 48 c1 e8 20 48 85 c0 74 ca 4c 89 c0 49 d1 ea 4d 89 c8 48 d1 e8 49 89 c1 eb 9f 44 89 c8 31 d2 49 0f af c0 <49> f7 f2 4d 89 e2 48 39 f8 48 0f 42 c7 49 29 c2 4d 39 d3 76 0b
[ 7.357565] RIP [<ffffffff870c7b4f>] cputime_adjust+0xff/0x130
[ 7.364633] RSP <ffff8e22b8b67b78>
[ 7.373247] ---[ end trace 76ca7475a22c5d43 ]---