Re: [PATCH v2 1/8] sched/fair: Update affine statistics when needed

From: Valentin Schneider
Date: Tue May 11 2021 - 07:51:58 EST


On 07/05/21 22:35, Srikar Dronamraju wrote:
> * Valentin Schneider <valentin.schneider@xxxxxxx> [2021-05-07 17:08:17]:
>
>> On 06/05/21 22:15, Srikar Dronamraju wrote:
>> > wake_affine_idle() can return prev_cpu. Even in such a scenario,
>> > scheduler was going ahead and updating schedstats related to wake
>> > affine. i.e even if the task is not moved across LLC domains,
>> > schedstats would have accounted.
>> >
>> > Hence add a check before updating schedstats.
>> >
>
> Thanks Valentin for taking a look at the patch.
>
>>
>> I briefly glanced at the git history but didn't find any proper description
>> of that stat. As it stands, it counts the number of times wake_affine()
>> purposedly steered a task towards a particular CPU (waker or wakee's prev),
>> so nr_wakeups_affine / nr_wakeups_affine_attempts is your wake_affine()
>> "success rate" - how often could it make a choice with the available data.
>>
>> I could see a point in only incrementing the count if wake_affine() steers
>> towards the waker rather than the wakee (i.e. don't increment if choice is
>> prev), but then that has no link with LLC spans
>
> Lets say if prev CPU and this CPU were part of the same LLC, and the prev
> CPU was busy (or busier than this CPU), should consider this as a wake
> affine? If prev was idle, we would have surely consider prev CPU. Also since
> both are part of same LLC, we cant say this CPU is more affine than prev
> CPU. Or may be I am confusing wake_affine with cache_affine.
>

SD_WAKE_AFFINE says: "Consider waking task on waking CPU.", with that I
read wake_affine() as: "should I place the wakee close to the waker or
close to its previous CPU?". This can be yes or no even if both are in the
same LLC.

>>
>> > Cc: LKML <linux-kernel@xxxxxxxxxxxxxxx>
>> > Cc: Gautham R Shenoy <ego@xxxxxxxxxxxxxxxxxx>
>> > Cc: Parth Shah <parth@xxxxxxxxxxxxx>
>> > Cc: Ingo Molnar <mingo@xxxxxxxxxx>
>> > Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
>> > Cc: Valentin Schneider <valentin.schneider@xxxxxxx>
>> > Cc: Dietmar Eggemann <dietmar.eggemann@xxxxxxx>
>> > Cc: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
>> > Cc: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
>> > Cc: Rik van Riel <riel@xxxxxxxxxxx>
>> > Signed-off-by: Srikar Dronamraju <srikar@xxxxxxxxxxxxxxxxxx>
>> > ---
>> > kernel/sched/fair.c | 6 ++++--
>> > 1 file changed, 4 insertions(+), 2 deletions(-)
>> >
>> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> > index 794c2cb945f8..a258a84cfdfd 100644
>> > --- a/kernel/sched/fair.c
>> > +++ b/kernel/sched/fair.c
>> > @@ -5884,8 +5884,10 @@ static int wake_affine(struct sched_domain *sd, struct task_struct *p,
>> > if (target == nr_cpumask_bits)
>> > return prev_cpu;
>> >
>> > - schedstat_inc(sd->ttwu_move_affine);
>> > - schedstat_inc(p->se.statistics.nr_wakeups_affine);
>> > + if (!cpus_share_cache(prev_cpu, target)) {
>>
>> Per the above, why? Why not just if(target == this_cpu) ?
>
> We could use target == this_cpu. However if prev CPU and this CPU share the
> same LLC, then should we consider moving to this_cpu as an affine wakeup?
>

It would make sense if it's a sync wakeup, which wake_affine() does try to
do ATM (regardless of LLC actually, if I'm reading it correctly).