Re: [BUG] Bw accounting warning on fair-servers' parameters change

From: Juri Lelli
Date: Mon Jul 21 2025 - 09:35:50 EST


On 21/07/25 11:34, Yuri Andriaccio wrote:
> Hi,
>
> On 18/07/25 16:53, Juri Lelli wrote:
> > On 18/07/25 16:22, Juri Lelli wrote:
> > > Hi,
> > >
> > > Thanks for reporting.
> > >
> > > On 18/07/25 13:38, Yuri Andriaccio wrote:
> > > > Hi,
> > > >
> > > > I've lately been working on fair-servers and dl_servers for some patches and
> > > > I've come across a bandwidth accounting warning on the latest tip/master (as of
> > > > 2025-07-18, git sha ed0272f0675f). The warning is triggered by simply starting
> > > > the machine, mounting debugfs and then just zeroing any fair-server's runtime.
> > > >
> > > >
> > > > The warning:
> > > >
> > > > WARNING: kernel/sched/deadline.c:266 at dl_rq_change_utilization+0x208/0x230
> > > >
> > > > static inline void __sub_rq_bw(u64 dl_bw, struct dl_rq *dl_rq) {
> > > >         ...
> > > >         WARN_ON_ONCE(dl_rq->running_bw > dl_rq->this_bw);
> > > > }
> > > >
> > > > Steps to reproduce:
> > > >
> > > > mount -t debugfs none /sys/kernel/debug
> > > > echo 0 > /sys/kernel/debug/sched/fair_server/cpu0/runtime
> > > >
> > > >
> > > > It does not happen at every machine boot, but happens on most. Could it possibly
> > > > be related to some of the deadline timers?
> > >
> > > I took a quick first look and currently suspect cccb45d7c4295
> > > ("sched/deadline: Less agressive dl_server handling") could be playing a
> > > role in this as it delays actual server stop.
> > >
> > > Could you please try to repro after having reverted such commit?
> >
> > After that (w/o the revert), could you please try to see if the
> > following helps?
>
> I've been performing some tests as you asked and indeed the culprit seems to be
> cccb45d7c4295 ("sched/deadline: Less agressive dl_server handling"), as
> reverting it on the current tip removes the issue.
>
> I've also tested the fix you posted (w/o the reverted commit), and I can confirm
> that the warning does not seem to be triggered anymore.

Thanks!

I've sent out a cleaned-up version of the fix:

https://lore.kernel.org/lkml/20250721-upstream-fix-dlserver-lessaggressive-b4-v1-1-4ebc10c87e40@xxxxxxxxxx/