Re: [PATCH] sched/deadline: Derive root domain from active cpu in task's cpus_ptr
From: Pierre Gondois
Date: Fri Oct 10 2025 - 12:26:41 EST
On 10/6/25 14:12, Juri Lelli wrote:
> On 06/10/25 12:13, Pierre Gondois wrote:
>> On 9/30/25 11:04, Peter Zijlstra wrote:
>>> On Tue, Sep 30, 2025 at 08:20:06AM +0100, Juri Lelli wrote:
>>>> I actually wonder if we shouldn't make cppc_fie a "special" DEADLINE
>>>> tasks (like schedutil [1]). IIUC that is how it is thought to behave
>>>> already [2], but, since it's missing the SCHED_FLAG_SUGOV flag(/hack),
>>>> it is not "transparent" from a bandwidth tracking point of view.
>>>>
>>>> 1 - https://elixir.bootlin.com/linux/v6.17/source/kernel/sched/cpufreq_schedutil.c#L661
>>>> 2 - https://elixir.bootlin.com/linux/v6.17/source/drivers/cpufreq/cppc_cpufreq.c#L198
>>>
>>> Right, I remember that hack. Bit sad its spreading, but this CPPC thing
>>> is very much like the schedutil one, so might as well do that I suppose.
>>
>> IIUC, the sugov thread was switched to deadline to allow frequency updates
>> when deadline tasks start to run. I.e. there should be no point updating the
>> freq. after the deadline task finished running, cf [1] and [2]
>>
>> The CPPC FIE worker should not require to run that quickly as it seems to be
>> more like a freq. maintenance work (the call comes from the sched tick)
>>
>> sched_tick()
>> \-arch_scale_freq_tick() / topology_scale_freq_tick()
>>   \-set_freq_scale() / cppc_scale_freq_tick()
>>     \-irq_work_queue()
>
> OK, but how much bandwidth is enough for it (on different platforms)?
> Also, I am not sure the worker follows cpusets/root domain changes.

To share some additional information, I could reproduce the issue by
creating as many deadline tasks with a huge bandwidth as the platform
allows:
chrt -d -T 1000000 -P 1000000 0 yes > /dev/null &
Then kexec to another kernel. The available bandwidth of the root domain
gradually decreases as CPUs are unplugged. At some point there is not
enough bandwidth left and an overflow is detected (same call stack as in
the original message).

So I'm not sure this is really related to the cppc_fie thread. I think it
is more a matter of checking the available bandwidth in a context where
the check is not appropriate. Deadline bandwidth might be lacking while
the platform is being reset, but this should not matter much.
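
To make the arithmetic behind the reproducer concrete, here is a rough
user-space sketch of the accounting. It is only loosely modelled on the
kernel's to_ratio() and __dl_overflow() and is not the actual scheduler
code: struct rd_model, rd_overflow() and rd_fill() are hypothetical names,
every CPU is assumed to have full capacity (1024), and the limit is assumed
to be the default 95% per CPU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BW_SHIFT		20	/* fixed-point shift for bandwidth ratios */
#define SCHED_CAPACITY_SHIFT	10	/* a full-capacity CPU counts as 1024 */
#define SCHED_CAPACITY_SCALE	(1ULL << SCHED_CAPACITY_SHIFT)

/* runtime/period expressed as a BW_SHIFT fixed-point ratio */
static uint64_t to_ratio(uint64_t period_ns, uint64_t runtime_ns)
{
	assert(period_ns > 0);
	return (runtime_ns << BW_SHIFT) / period_ns;
}

/* Hypothetical, simplified model of a root domain's deadline bandwidth. */
struct rd_model {
	uint64_t bw;		/* allowed bandwidth per full-capacity CPU */
	uint64_t total_bw;	/* sum of the bandwidth of admitted tasks */
	unsigned int nr_cpus;	/* active CPUs (assumed identical) */
};

/* True if adding new_bw exceeds what the active CPUs can provide. */
static bool rd_overflow(const struct rd_model *rd, uint64_t new_bw)
{
	uint64_t cap = rd->nr_cpus * SCHED_CAPACITY_SCALE;
	uint64_t limit = (rd->bw * cap) >> SCHED_CAPACITY_SHIFT;

	return limit < rd->total_bw + new_bw;
}

/* Admit identical tasks until the next one would overflow; return the count. */
static unsigned int rd_fill(struct rd_model *rd, uint64_t task_bw)
{
	unsigned int admitted = 0;

	assert(task_bw > 0);
	while (!rd_overflow(rd, task_bw)) {
		rd->total_bw += task_bw;
		admitted++;
	}
	return admitted;
}
```

In this model, with an assumed 8 CPUs and tasks using runtime = period =
1000000 ns (as in the chrt command above), rd_fill() admits 7 tasks.
Dropping nr_cpus to 7 then makes rd_overflow(&rd, 0) return true, since
the already admitted bandwidth no longer fits in the remaining capacity.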
---
Question:
Since the cppc_fie worker doesn't have the SCHED_FLAG_SUGOV flag,
is this comment actually correct?

/*
 * Fake (unused) bandwidth; workaround to "fix"
 * priority inheritance.
 */
---
On a non-deadline related topic, the CPPC driver creates a cppc_fie worker in
case the CPPC counters used to estimate the current frequency are in PCC
channels. Accessing these channels requires going through sleeping sections,
which is why a worker is used.
However, the CPPC counters might be accessed through FFH, which doesn't go
through sleeping sections. In that case, the cppc_fie worker is never used and
never removed, so it would be nice to remove it.