Re: [PATCH] sched/fair: do not scan twice in detach_tasks()
From: Valentin Schneider
Date: Thu Jul 17 2025 - 05:50:10 EST
On 17/07/25 10:56, Shijie Huang wrote:
> On 2025/7/16 23:08, Valentin Schneider wrote:
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index b9b4bbbf0af6f..32ae24aa377ca 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -11687,7 +11687,7 @@ static int sched_balance_rq(int this_cpu, struct rq *this_rq,
>> * still unbalanced. ld_moved simply stays zero, so it is
>> * correctly treated as an imbalance.
>> */
>> - env.loop_max = min(sysctl_sched_nr_migrate, busiest->nr_running);
>> + env.loop_max = min(sysctl_sched_nr_migrate, busiest->cfs.h_nr_queued);
>
> I tested this patch; it did not work. I can still catch lots of
> occurrences of this issue in the SPECjbb test.
>
>
> IMHO, the root cause of this issue is that env.loop_max is set outside
> the rq's lock.
>
> Even if we set env.loop_max to busiest->cfs.h_nr_queued, the real task
> list length can still shrink elsewhere.
>
>
Ah right, and updating the max in detach_tasks() itself isn't a complete
solution if we re-enter it due to LBF_NEED_BREAK. Never mind then :-)
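
For illustration only, the stale-snapshot problem can be sketched with a
userspace toy model. This is not kernel code: every toy_* name is
hypothetical, the queue is reduced to a counter, and the model assumes that
no task can be migrated and that each skipped task is rotated to the other
end of the list, so a scan longer than the list wraps around. It covers a
single pass only and does not model re-entry via LBF_NEED_BREAK.

```c
#include <assert.h>

#define TOY_MAX_TASKS 64

/*
 * Toy model, NOT kernel code: examine up to loop_max entries of a queue
 * holding nr_queued tasks, none of which can be migrated. A skipped task
 * is rotated to the other end of the queue, modelled as a circular index.
 * Returns the highest number of times any single task was examined.
 */
static unsigned int toy_scan(unsigned int nr_queued, unsigned int loop_max)
{
	unsigned int visits[TOY_MAX_TASKS] = { 0 };
	unsigned int worst = 0;
	unsigned int loop;

	assert(nr_queued <= TOY_MAX_TASKS);
	if (nr_queued == 0)
		return 0;

	for (loop = 0; loop < loop_max; loop++) {
		unsigned int idx = loop % nr_queued;

		visits[idx]++;
		if (visits[idx] > worst)
			worst = visits[idx];
	}
	return worst;
}

static unsigned int toy_min(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* loop_max is snapshotted before the lock; the queue shrinks afterwards. */
static unsigned int toy_stale_snapshot(unsigned int nr_migrate,
				       unsigned int nr_at_snapshot,
				       unsigned int nr_at_scan)
{
	unsigned int loop_max = toy_min(nr_migrate, nr_at_snapshot);

	return toy_scan(nr_at_scan, loop_max);
}

/* Same snapshot, but clamped again to the queue length seen under the lock. */
static unsigned int toy_clamped_under_lock(unsigned int nr_migrate,
					   unsigned int nr_at_snapshot,
					   unsigned int nr_at_scan)
{
	unsigned int loop_max = toy_min(nr_migrate, nr_at_snapshot);

	loop_max = toy_min(loop_max, nr_at_scan);
	return toy_scan(nr_at_scan, loop_max);
}
```

With a limit of 32, a snapshot of 8 tasks and only 5 left at scan time,
toy_stale_snapshot(32, 8, 5) examines some tasks twice, while
toy_clamped_under_lock(32, 8, 5) examines each task once. The clamp only
helps within one pass, which is why it is not a complete fix once the lock
is dropped and the scan is re-entered.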