Re: [PATCH V7 01/11] sched: Scheduler time slice extension

From: Sebastian Andrzej Siewior
Date: Fri Aug 08 2025 - 06:00:32 EST


On 2025-08-07 16:56:33 [+0000], Prakash Sangappa wrote:
> >>> + if (__rseq_delay_resched()) {
> >>> + clear_tsk_need_resched(current);
> >>
> >> Why has this to be inline and is not done in __rseq_delay_resched()?
> >
> > A SCHED_OTHER wake up sets _TIF_NEED_RESCHED_LAZY so
> > clear_tsk_need_resched() will revoke this granting an extension.
> >
> > The RT/DL wake up will set _TIF_NEED_RESCHED and
> > clear_tsk_need_resched() will also clear it. However this one
> > additionally sets set_preempt_need_resched() so the next preempt
> > disable/enable combo will lead to a scheduling event. A remote wakeup
> > will trigger an IPI (scheduler_ipi()) which also does
> > set_preempt_need_resched().
> >
> > If I understand this correctly then a RT/DL wake up while the task is in
> > kernel-mode should lead to a scheduling event assuming we pass a
> > spinlock_t (ignoring the irq argument).
> > Should the task be in user-mode then we return to user mode with the TIF
> > flag cleared and the NEED-RESCHED flag folded into the preemption
> > counter.
> >
> > I am once again asking to limit this to _TIF_NEED_RESCHED_LAZY.
>
> Would the proposal(patches 7-11) to have an API/Mechanism, as Thomas suggested,
> for RT threads to indicate not to be delayed address the concern?
> Also there is the proposal to have a kernel parameter to disable delaying
> RT threads in general, when granting extra time to the running task.

While I appreciate the effort, I don't see the need for this
functionality atm. I would say just get the basic infrastructure in,
focusing on LAZY preempt, and ignore the wake ups of tasks with elevated
priority. If this works reliably and people indeed ask for delayed
wake ups of RT threads then this can be added, assuming you have enough
flexibility in the API to allow it. Then you would also have a use-case
showing how it should be implemented.

Looking at 07/11, you set a task_sched::sched_nodelay if this is
requested. In 09/11 you set TIF_NEED_RESCHED_NODELAY if that flag is
set. In 08/11 you use that flag additionally for wake ups and propagate
it for the architecture. Phew.
If a task needs to set this flag first in order to be excluded from the
delayed wake ups then I don't see how this can work for kernel threads
such as the threaded interrupts or a user thread which is PI-boosted and
inherits the RT priority.

On the other hand let's assume you check and clear only
TIF_NEED_RESCHED_LAZY. Let's say people ask to extend the delayed wakes
to certain userland RT threads. Then you could add a prctl() to turn
TIF_NEED_RESCHED into TIF_NEED_RESCHED_LAZY for the "marked" threads,
in effect saying "I don't mind if this particular thread gets delayed".
If this is needed for all threads in the system you could do a system
wide sysctl and so on.
You would get all this without another TIF bit and tracing would keep
showing reliably a N or L flag.
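
To make the idea above concrete, here is a rough userspace model of it.
This is only a sketch and not kernel code: the MODEL_* bits, the structs,
the allow_delay opt-in and the model_sysctl_delay_all_rt knob are all
made-up names for illustration and do not exist in the patch series or
in the kernel. The point it shows is that the extension path only ever
looks at and clears the LAZY bit, while the wake-up side decides which
bit to raise.

```c
#include <assert.h>   /* only needed by callers that want to check results */
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values for this model; not the kernel's TIF definitions. */
#define MODEL_NEED_RESCHED      (1u << 0)
#define MODEL_NEED_RESCHED_LAZY (1u << 1)

/* The task that is currently running and may ask for more time. */
struct model_task {
	uint32_t tif;          /* pending reschedule requests */
	bool delay_requested;  /* task asked for a slice extension */
};

/* The thread being woken up. */
struct model_wakee {
	bool rt;           /* wakee has RT/DL priority */
	bool allow_delay;  /* hypothetical prctl()-style "I may be delayed" mark */
};

/* Hypothetical system-wide knob: treat every RT wake up as delayable. */
static bool model_sysctl_delay_all_rt;

/* Wake-up side: choose which flag is raised on the running task. */
static void model_wake(struct model_task *curr, const struct model_wakee *w)
{
	if (w->rt && !w->allow_delay && !model_sysctl_delay_all_rt)
		curr->tif |= MODEL_NEED_RESCHED;       /* must not be delayed */
	else
		curr->tif |= MODEL_NEED_RESCHED_LAZY;  /* may be delayed */
}

/*
 * Extension side: grant extra time only if LAZY is pending and the
 * non-lazy flag is not. Only the LAZY bit is ever cleared here.
 */
static bool model_try_grant_extension(struct model_task *curr)
{
	if (!curr->delay_requested)
		return false;
	if (curr->tif & MODEL_NEED_RESCHED)
		return false;
	if (!(curr->tif & MODEL_NEED_RESCHED_LAZY))
		return false;

	curr->tif &= ~MODEL_NEED_RESCHED_LAZY;
	return true;
}
```

In this model a SCHED_OTHER wake up can be deferred, an unmarked RT wake
up never is, and a "marked" RT thread behaves like the SCHED_OTHER case,
all with the same two bits.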

> Thanks,
> -Prakash
>
Sebastian