Re: [PATCH 2/2] sched: Offload wakee task activation if it the wakee is descheduling

From: Pavan Kondeti
Date: Wed May 27 2020 - 06:11:10 EST


On Sun, May 24, 2020 at 09:29:56PM +0100, Mel Gorman wrote:
> The patch "sched: Optimize ttwu() spinning on p->on_cpu" avoids spinning
> on p->on_rq when the task is descheduling but only if the wakee is on
> a CPU that does not share cache with the waker. This patch offloads the
> activation of the wakee to the CPU that is about to go idle if the task
> is the only one on the runqueue. This potentially allows the waker task
> to continue making progress when the wakeup is not strictly synchronous.
>
> This is very obvious with netperf UDP_STREAM running on localhost. The
> waker is sending packets as quickly as possible without waiting for any
> reply. It frequently wakes the server for the processing of packets and
> when netserver is using local memory, it quickly completes the processing
> and goes back to idle. The waker often observes that netserver is on_rq
> and spins excessively leading to a drop in throughput.
>
> This is a comparison of 5.7-rc6 against "sched: Optimize ttwu() spinning
> on p->on_cpu" and against this patch labeled vanilla, optttwu-v1r1 and
> localwakelist-v1r2 respectively.
>
> 5.7.0-rc6 5.7.0-rc6 5.7.0-rc6
> vanilla optttwu-v1r1 localwakelist-v1r2
> Hmean send-64 251.49 ( 0.00%) 258.05 * 2.61%* 305.59 * 21.51%*
> Hmean send-128 497.86 ( 0.00%) 519.89 * 4.43%* 600.25 * 20.57%*
> Hmean send-256 944.90 ( 0.00%) 997.45 * 5.56%* 1140.19 * 20.67%*
> Hmean send-1024 3779.03 ( 0.00%) 3859.18 * 2.12%* 4518.19 * 19.56%*
> Hmean send-2048 7030.81 ( 0.00%) 7315.99 * 4.06%* 8683.01 * 23.50%*
> Hmean send-3312 10847.44 ( 0.00%) 11149.43 * 2.78%* 12896.71 * 18.89%*
> Hmean send-4096 13436.19 ( 0.00%) 13614.09 ( 1.32%) 15041.09 * 11.94%*
> Hmean send-8192 22624.49 ( 0.00%) 23265.32 * 2.83%* 24534.96 * 8.44%*
> Hmean send-16384 34441.87 ( 0.00%) 36457.15 * 5.85%* 35986.21 * 4.48%*
>
> Note that this benefit is not universal to all wakeups, it only applies
> to the case where the waker often spins on p->on_rq.
>
Thanks for the detailed description. I think you meant p->on_cpu here not
p->on_rq. I know this patch is included, if possible, please make this edit
here and below.

> /*
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index db3a57675ccf..06297d1142a0 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -1688,7 +1688,8 @@ static inline int task_on_rq_migrating(struct task_struct *p)
> */
> #define WF_SYNC 0x01 /* Waker goes to sleep after wakeup */
> #define WF_FORK 0x02 /* Child wakeup after fork */
> -#define WF_MIGRATED 0x4 /* Internal use, task got migrated */
> +#define WF_MIGRATED 0x04 /* Internal use, task got migrated */
> +#define WF_ON_RQ 0x08 /* Wakee is on_rq */
>
should this be WF_ON_CPU? There is a different check for p->on_rq in
try_to_wake_up.

Thanks,
Pavan

--
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.