Re: [PATCH] timers/nohz: Update nohz load even if tick already stopped

From: Scott Wood
Date: Fri Nov 01 2019 - 01:11:45 EST


On Wed, 2019-10-30 at 14:31 +0100, Peter Zijlstra wrote:
> On Wed, Oct 30, 2019 at 03:48:26AM -0500, Scott Wood wrote:
> > On Tue, 2019-10-29 at 11:05 +0100, Peter Zijlstra wrote:
> > > @@ -3686,6 +3688,7 @@ static void sched_tick_remote(struct work_struct *work)
> > > curr->sched_class->task_tick(rq, curr, 0);
> > >
> > > out_unlock:
> > > + calc_load_nohz_remote(cpu);
> > > rq_unlock_irq(rq, &rf);
> >
> > This gets skipped when the cpu is idle, so it still misses the update.
>
> Oh argh! that's a bit radical of the remote tick. The normal tick runs
> just fine on idle CPUs, so let's mirror that.
>
> How's this then?
>
> ---
> diff --git a/include/linux/sched/nohz.h b/include/linux/sched/nohz.h
> index 1abe91ff6e4a..6d67e9a5af6b 100644
> --- a/include/linux/sched/nohz.h
> +++ b/include/linux/sched/nohz.h
> @@ -15,9 +15,11 @@ static inline void nohz_balance_enter_idle(int cpu) { }
>
> #ifdef CONFIG_NO_HZ_COMMON
> void calc_load_nohz_start(void);
> +void calc_load_nohz_remote(struct rq *rq);
> void calc_load_nohz_stop(void);
> #else
> static inline void calc_load_nohz_start(void) { }
> +static inline void calc_load_nohz_remote(struct rq *rq) { }
> static inline void calc_load_nohz_stop(void) { }
> #endif /* CONFIG_NO_HZ_COMMON */
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index eb42b71faab9..d02d1b8f40af 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -3660,21 +3660,17 @@ static void sched_tick_remote(struct work_struct *work)
> u64 delta;
> int os;
>
> - /*
> - * Handle the tick only if it appears the remote CPU is running in full
> - * dynticks mode. The check is racy by nature, but missing a tick or
> - * having one too much is no big deal because the scheduler tick updates
> - * statistics and checks timeslices in a time-independent way, regardless
> - * of when exactly it is running.
> - */
> - if (idle_cpu(cpu) || !tick_nohz_tick_stopped_cpu(cpu))
> + if (!tick_nohz_tick_stopped_cpu(cpu))
> goto out_requeue;
>
> rq_lock_irq(rq, &rf);
> - curr = rq->curr;
> - if (is_idle_task(curr) || cpu_is_offline(cpu))
> + /*
> + * We must not call calc_load_nohz_remote() when not in NOHZ mode.
> + */
> + if (cpu_is_offline(cpu) || !tick_nohz_tick_stopped(cpu))
> goto out_unlock;

This needs to be tick_nohz_tick_stopped_cpu(cpu).
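
For illustration only, here is a userspace sketch of why the per-CPU variant matters. sched_tick_remote() runs from an unbound kworker, so a query about the local CPU would describe the worker's CPU rather than the remote one being ticked. Every name below (the model_* helpers, MODEL_NR_CPUS) is a hypothetical stand-in and not kernel API. The real tick_nohz_tick_stopped() takes no argument at all.

```c
#include <stdbool.h>

#define MODEL_NR_CPUS 4   /* hypothetical CPU count for this model */

/* Hypothetical per-CPU "tick stopped" flags standing in for per-CPU tick state. */
static bool model_tick_stopped[MODEL_NR_CPUS];

/* CPU the caller is running on; in this model, the kworker's CPU. */
static int model_this_cpu;

/* Local query: takes no argument and answers for the calling CPU only. */
static bool model_tick_stopped_local(void)
{
	return model_tick_stopped[model_this_cpu];
}

/* Per-CPU query: answers for whichever CPU is passed in. */
static bool model_tick_stopped_cpu(int cpu)
{
	return model_tick_stopped[cpu];
}
```

With the worker on CPU 0 (tick running) and CPU 3 marked as stopped, model_tick_stopped_local() reports false while model_tick_stopped_cpu(3) reports true. Only the latter says anything about the remote CPU.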

After fixing that, I get:

[ 7.439068] WARNING: CPU: 20 PID: 7 at /home/root/linux/kernel/sched/core.c:3681 sched_tick_remote+0x132/0x150
[ 7.439068] Modules linked in:
[ 7.439068] CPU: 20 PID: 7 Comm: kworker/u209:0 Not tainted 5.4.0-rc5.std+ #15
[ 7.439068] Hardware name: Intel Corporation S2600BT/S2600BT, BIOS SE5C620.86B.01.00.0763.022420181017 02/24/2018
[ 7.439068] Workqueue: events_unbound sched_tick_remote
[ 7.446308] pci_bus 0000:9f: resource 1 [mem 0xe6a00000-0xe6bfffff]
[ 7.455068] RIP: 0010:sched_tick_remote+0x132/0x150
[ 7.455068] Code: 00 e9 b2 fd fe ff 0f 0b e9 46 ff ff ff 83 f8 02 89 c2 74 d3 8d 4a ff 89 d0 f0 0f b1 0e 0f 94 c1 84 c9 0f 85 23 ff ff ff eb e3 <0f> 0b eb 9a 80 3d 9c d6 2c 01 00 0f 1f 00 0f 85 71 ff ff ff e8 05
[ 7.455068] RSP: 0000:ffffc9000c683e58 EFLAGS: 00010002
[ 7.455068] RAX: 00000000e7061da1 RBX: ffff8897e026e688 RCX: 0000000181f93295
[ 7.455068] RDX: 00000000b2d05e00 RSI: ffff8897e0269e50 RDI: 0000000000000004
[ 7.455068] RBP: ffff8881004c0000 R08: ffff8e8191a2b423 R09: 0000000000000000
[ 7.455068] R10: 0000000000000010 R11: 0000000000000018 R12: ffff8897e0269240
[ 7.455068] R13: ffff8897e0240000 R14: 0000000000000000 R15: ffff888107edc2e8
[ 7.455068] FS: 0000000000000000(0000) GS:ffff8897e0700000(0000) knlGS:0000000000000000
[ 7.455068] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 7.455068] CR2: 0000000000000000 CR3: 000000303e60a001 CR4: 00000000007606e0
[ 7.455068] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 7.459338] pci_bus 0000:9f: resource 2 [mem 0x3a0000000000-0x3a00001fffff 64bit pref]
[ 7.465068] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 7.465068] PKRU: 55555554
[ 7.465068] Call Trace:
[ 7.465068] process_one_work+0x165/0x3c0
[ 7.465068] worker_thread+0x46/0x3d0
[ 7.465068] kthread+0xf8/0x130
[ 7.465068] ? process_one_work+0x3c0/0x3c0
[ 7.476788] pci_bus 0000:a0: resource 1 [mem 0xe6c00000-0xe6dfffff]
[ 7.465068] ? kthread_bind+0x10/0x10
[ 7.465068] ret_from_fork+0x35/0x40

-Scott