Re: [PATCH v2] sched/eevdf: Prevent vlag from going out of bounds when reweight_eevdf

From: Xuewen Yan
Date: Mon Apr 22 2024 - 09:12:39 EST


Hi Peter,

On Mon, Apr 22, 2024 at 7:17 PM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Mon, Apr 22, 2024 at 07:07:25PM +0800, Xuewen Yan wrote:
> > On Mon, Apr 22, 2024 at 5:42 PM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> > >
> > > On Mon, Apr 22, 2024 at 04:33:37PM +0800, Xuewen Yan wrote:
> > >
> > > > On the Android system, the nice value of a task will change very
> > > > frequently. The limit can also be exceeded.
> > > > Maybe the !on_rq case is still necessary.
> > > > So I'm planning to propose another patch for !on_rq case later after
> > > > careful testing locally.
> > >
> > > So the scaling is: vlag = vlag * old_Weight / weight
> > >
> > > But given that integer division is truncating, you could expect repeated
> > > application of such scaling would eventually decrease the vlag instead
> > > of grow it.
> > >
> > > Is there perhaps an invocation of reweight_task() missing? Looking at
> >
> > Is it necessary to add reweight_task in the prio_changed_fair()?
>
> I think that's the wrong place. Note how __setscheduler_params() already
> has set_load_weight(). And all other callers of ->prio_changed() already
> seem to do set_load_weight() as well.
>
> But that idle policy thing there still looks wrong, that sets the weight
> very low but doesn't re-adjust anything.

By adding a log to observe weight changes in reweight_entity, I found
that calc_group_shares() often causes new_weight to become very small:

Hardware name: Unisoc UMS-base Board (DT)
Call trace:
dump_backtrace+0xec/0x138
show_stack+0x18/0x24
dump_stack_lvl+0x60/0x84
dump_stack+0x18/0x24
reweight_entity+0x3e8/0x5f4
dequeue_task_fair+0x448/0x948
dequeue_task+0xc4/0x398
deactivate_task+0x1c/0x28
pull_tasks+0x200/0x334
newidle_balance+0x3cc/0x438
pick_next_task_fair+0x58/0x670
__schedule+0x204/0x9a0
schedule+0x128/0x1a8
schedule_timeout+0x44/0x1c8
__skb_wait_for_more_packets+0xd0/0x17c
__unix_dgram_recvmsg+0xdc/0x3a8
unix_seqpacket_recvmsg+0x64/0x74
__sys_recvfrom+0x14c/0x1e4
__arm64_sys_recvfrom+0x24/0x38
invoke_syscall+0x58/0x114
el0_svc_common+0xac/0xe0
do_el0_svc+0x1c/0x28
el0_svc+0x3c/0x70
el0t_64_sync_handler+0x68/0xbc
el0t_64_sync+0x1a8/0x1ac
reweight_entity: the lag=-831088603030 vruntime=2086205903 limit=3071999998 old_weight=237238 new_weight=2
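
To put rough numbers on the scaling discussed above (vlag = vlag * old_weight / weight), here is a minimal userspace sketch. It is not the kernel code: scale_vlag() and round_trip() are hypothetical helper names, and any input lag fed to them is an assumed value, not one taken from the trace. It only illustrates two points: a single reweight from a large weight to a very small one multiplies the lag by a large factor, while truncating division makes a back-and-forth reweight shrink the magnitude slightly.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not a kernel function: applies the scaling
 * vlag * old_weight / new_weight with plain C integer division,
 * which truncates toward zero.
 */
static int64_t scale_vlag(int64_t vlag, int64_t old_weight, int64_t new_weight)
{
	return vlag * old_weight / new_weight;
}

/*
 * Hypothetical helper: reweight from w_a to w_b and back to w_a.
 * Because each division truncates, the result can be smaller in
 * magnitude than the starting vlag, but never larger.
 */
static int64_t round_trip(int64_t vlag, int64_t w_a, int64_t w_b)
{
	int64_t scaled = scale_vlag(vlag, w_a, w_b);

	return scale_vlag(scaled, w_b, w_a);
}
```

With old_weight=237238 and new_weight=2 as in the log, an assumed input lag of -7000000 gives scale_vlag(-7000000, 237238, 2) == -830333000000, far beyond limit=3071999998. By contrast, round_trip(1000, 1024, 3) returns 999, so repeated scaling alone only loses magnitude; the growth comes from a single large weight ratio.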