Re: [PATCH V7 01/11] sched: Scheduler time slice extension
From: Prakash Sangappa
Date: Fri Aug 08 2025 - 13:01:11 EST
> On Aug 8, 2025, at 2:59 AM, Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx> wrote:
>
> On 2025-08-07 16:56:33 [+0000], Prakash Sangappa wrote:
>>>>> + if (__rseq_delay_resched()) {
>>>>> + clear_tsk_need_resched(current);
>>>>
>>>> Why has this to be inline and is not done in __rseq_delay_resched()?
>>>
>>> A SCHED_OTHER wake up sets _TIF_NEED_RESCHED_LAZY so
>>> clear_tsk_need_resched() will revoke this granting an extension.
>>>
>>> The RT/DL wake up will set _TIF_NEED_RESCHED and
>>> clear_tsk_need_resched() will also clear it. However this one
>>> additionally sets set_preempt_need_resched() so the next preempt
>>> disable/ enable combo will lead to a scheduling event. A remote wakeup
>>> will trigger an IPI (scheduler_ipi()) which also does
>>> set_preempt_need_resched().
>>>
>>> If I understand this correctly then a RT/DL wake up while the task is in
>>> kernel-mode should lead to a scheduling event assuming we pass a
>>> spinlock_t (ignoring the irq argument).
>>> Should the task be in user-mode then we return to user mode with the TIF
>>> flag cleared and the NEED-RESCHED flag folded into the preemption
>>> counter.
>>>
>>> I am once again asking to limit this to _TIF_NEED_RESCHED_LAZY.
>>
>> Would the proposal (patches 7-11) to have an API/mechanism, as Thomas suggested,
>> for RT threads to indicate not to be delayed address the concern?
>> Also there is the proposal to have a kernel parameter to disable delaying
>> RT threads in general, when granting extra time to the running task.
>
> While I appreciate the effort I don't see the need for this
> functionality atm. I would say just get the basic infrastructure
> focusing on LAZY preempt and ignore the wakes for tasks with elevated
> priority. If this works reliably and people indeed ask for delayed
> wakes for RT threads then this can be added assuming you have enough
> flexibility in the API to allow it. Then you would also have a use-case
> on how to implement it.
>
> Looking at 07/11, you set a task_sched::sched_nodelay if this is
> requested. In 09/11 you set TIF_NEED_RESCHED_NODELAY if that flag is
> set. In 08/11 you use that flag additionally for wake ups and propagate
> it for the architecture. Puh.
> If a task needs to set this flag first in order to be excluded from the
> delayed wake ups then I don't see how this can work for kernel threads
> such as the threaded interrupts or a user thread which is PI-boosted and
> inherits the RT priority.
>
> On the other hand let's assume you check and clear only
> TIF_NEED_RESCHED_LAZY. Let's say people ask to extend the delayed wakes
> to certain userland RT threads. Then you could add a prctl() to turn
> TIF_NEED_RESCHED into TIF_NEED_RESCHED_LAZY for the "marked" threads.
> Saying I don't mind if this particular thread gets delayed.
> If this is needed for all threads in the system you could do a system wide
> sysctl and so on.
> You would get all this without another TIF bit and tracing would keep
> showing reliably a N or L flag.
Ok, will drop these patches in the next round.
Should we just consider adding a sysctl to choose whether to delay when
TIF_NEED_RESCHED is set?
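
To make the question concrete, here is a rough userspace sketch of the
decision, not kernel code. The flag values, the sketch_* names and the
sysctl variable are all made up for illustration. It does not model the
set_preempt_need_resched() folding mentioned above, which a real change
would have to deal with when clearing TIF_NEED_RESCHED.

```c
#include <stdbool.h>

/* Illustrative stand-ins only; these are NOT the kernel's bit values. */
#define SKETCH_TIF_NEED_RESCHED       (1u << 0)
#define SKETCH_TIF_NEED_RESCHED_LAZY  (1u << 1)
#define SKETCH_RESCHED_MASK \
	(SKETCH_TIF_NEED_RESCHED | SKETCH_TIF_NEED_RESCHED_LAZY)

/* Hypothetical sysctl: if true, a pending NEED_RESCHED may be delayed too. */
static bool sketch_sysctl_delay_resched_all = false;

/*
 * Returns true and clears the honoured flag(s) if the extension is
 * granted. Returns false and leaves the flags untouched otherwise.
 * Not modelled: the NEED_RESCHED state folded into the preempt counter.
 */
static bool sketch_try_delay_resched(unsigned int *ti_flags,
				     bool task_requested)
{
	/* Default policy: only the LAZY flag may be delayed. */
	unsigned int allowed = SKETCH_TIF_NEED_RESCHED_LAZY;
	unsigned int pending = *ti_flags & SKETCH_RESCHED_MASK;

	if (!task_requested || !pending)
		return false;

	if (sketch_sysctl_delay_resched_all)
		allowed |= SKETCH_TIF_NEED_RESCHED;

	/* A pending flag outside the allowed set means: schedule now. */
	if (pending & ~allowed)
		return false;

	*ti_flags &= ~pending;
	return true;
}
```

With the knob off, this only ever clears the LAZY flag, which is the
behaviour you asked for. With the knob on, a pending NEED_RESCHED would
be delayed as well.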
-Prakash
>
>> Thanks,
>> -Prakash
>>
> Sebastian