Re: [PATCH v3 10/11] xen: Update sched clock offset to avoid system instability in hibernation

From: boris . ostrovsky
Date: Sun Sep 13 2020 - 13:55:30 EST



On 8/21/20 6:30 PM, Anchal Agarwal wrote:
> Save/restore xen_sched_clock_offset in syscore suspend/resume during PM
> hibernation. Commit '867cefb4cb1012: ("xen: Fix x86 sched_clock() interface
> for xen")' fixes xen guest time handling during migration. A similar issue
> is seen during PM hibernation when system runs CPU intensive workload.
> Post resume pvclock resets the value to 0 however, xen sched_clock_offset
> is never updated. System instability is seen during resume from hibernation
> when system is under heavy CPU load. Since xen_sched_clock_offset is not
> updated, system does not see the monotonic clock value and the scheduler
> would then think that heavy CPU hog tasks need more time in CPU, causing
> the system to freeze


I don't think you need to explain why non-monotonic clocks are bad.
(And, in fact, the same applies to the commit message in patch 8.)


>
> Signed-off-by: Anchal Agarwal <anchalag@xxxxxxxxxx>
> ---
> arch/x86/xen/suspend.c | 9 ++++++++-
> 1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/xen/suspend.c b/arch/x86/xen/suspend.c
> index b12db6966af6..a62e08a11681 100644
> --- a/arch/x86/xen/suspend.c
> +++ b/arch/x86/xen/suspend.c
> @@ -98,8 +98,9 @@ static int xen_syscore_suspend(void)
> return 0;
>
> gnttab_suspend();
> -
> xen_manage_runstate_time(-1);
> + xen_save_sched_clock_offset();
> +
> xrfp.domid = DOMID_SELF;
> xrfp.gpfn = __pa(HYPERVISOR_shared_info) >> PAGE_SHIFT;
>
> @@ -120,6 +121,12 @@ static void xen_syscore_resume(void)
> xen_hvm_map_shared_info();
>
> pvclock_resume();
> +
> + /*
> + * Restore xen_sched_clock_offset during resume to maintain
> + * monotonic clock value
> + */


I'd drop this comment; we know what the call does.


-boris


> + xen_restore_sched_clock_offset();
> xen_manage_runstate_time(0);
> gnttab_resume();
> }
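
For readers following the thread, the save/restore pairing discussed above can
be modelled in user space roughly as follows. This is only a sketch under
stated assumptions, not the code from this series: every model_* name and the
three static variables are hypothetical stand-ins, and the clock reset to 0 on
resume is taken from the commit message rather than verified here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for kernel state; not the real definitions. */
static uint64_t raw_clock_ns;        /* models the raw pvclock reading */
static uint64_t sched_clock_offset;  /* models xen_sched_clock_offset */
static uint64_t clock_value_saved;   /* sched clock value captured at suspend */

/* The scheduler clock is the raw reading minus the offset. */
static uint64_t model_sched_clock(void)
{
	return raw_clock_ns - sched_clock_offset;
}

/* At suspend: remember the current scheduler clock value. */
static void model_save_sched_clock_offset(void)
{
	clock_value_saved = raw_clock_ns - sched_clock_offset;
}

/* At resume: pick an offset so the scheduler clock continues from the saved value. */
static void model_restore_sched_clock_offset(void)
{
	sched_clock_offset = raw_clock_ns - clock_value_saved;
}

/*
 * Simulate one hibernation cycle and return the scheduler clock shortly
 * after resume. With fix_offset == 0 the offset is left stale.
 */
static uint64_t model_hibernate_cycle(int fix_offset)
{
	raw_clock_ns = 5000;
	sched_clock_offset = 1000;
	assert(model_sched_clock() == 4000);  /* value seen before suspend */

	if (fix_offset)
		model_save_sched_clock_offset();

	raw_clock_ns = 0;  /* assumed: raw clock restarts from 0 after resume */

	if (fix_offset)
		model_restore_sched_clock_offset();

	raw_clock_ns += 10;  /* 10 ns elapse after resume */
	return model_sched_clock();
}
```

In this model, model_hibernate_cycle(1) returns 4010, so the scheduler clock
continues from its pre-suspend value, while model_hibernate_cycle(0) returns a
value that has nothing to do with the pre-suspend 4000 because the stale offset
is subtracted from the reset raw clock. The unsigned wraparound in the restore
step is intentional and well defined in C.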