Re: [PATCH] sched/ext: Suppress warning in __this_cpu_write() by disabling preemption

From: Peter Zijlstra
Date: Wed Jul 16 2025 - 09:33:41 EST


On Wed, Jul 16, 2025 at 03:20:33PM +0200, Andrea Righi wrote:
> On Wed, Jul 16, 2025 at 03:06:52PM +0200, Peter Zijlstra wrote:
> > On Wed, Jul 16, 2025 at 08:54:47AM -0400, Steven Rostedt wrote:
> > > On Wed, 16 Jul 2025 05:46:15 -0700
> > > Breno Leitao <leitao@xxxxxxxxxx> wrote:
> > >
> > > > __this_cpu_write() emits a warning if used with preemption enabled.
> > > >
> > > > Function update_locked_rq() might be called with preemption enabled,
> > > > which causes the following warning:
> > > >
> > > > BUG: using __this_cpu_write() in preemptible [00000000] code: scx_layered_6-9/68770
> > > >
> > > > Disable preemption around the __this_cpu_write() call in
> > > > update_locked_rq() to suppress the warning, without affecting behavior.
> > > >
> > > > If preemption triggers a jump to another CPU during the callback it's
> > > > fine, since we would track the rq state on the other CPU with its own
> > > > local variable.
> > > >
> > > > Suggested-by: Andrea Righi <arighi@xxxxxxxxxx>
> > > > Signed-off-by: Breno Leitao <leitao@xxxxxxxxxx>
> > > > Fixes: 18853ba782bef ("sched_ext: Track currently locked rq")
> > > > Acked-by: Andrea Righi <arighi@xxxxxxxxxx>
> > > > ---
> > > > kernel/sched/ext.c | 7 +++++++
> > > > 1 file changed, 7 insertions(+)
> > > >
> > > > diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
> > > > index b498d867ba210..24fcbd7331f73 100644
> > > > --- a/kernel/sched/ext.c
> > > > +++ b/kernel/sched/ext.c
> > > > @@ -1258,7 +1258,14 @@ static inline void update_locked_rq(struct rq *rq)
> > > >  	 */
> > > >  	if (rq)
> > > >  		lockdep_assert_rq_held(rq);
> > >
> > > <blink>
> > >
> > > If an rq lock is expected to be held, there had better be no preemption
> > > enabled. How is this OK?
> >
> > The rq=NULL case; but from the usage I've seen that also happens with
> > rq lock held.
> >
> > Specifically I think the check ought to be:
> >
> > 	if (rq)
> > 		lockdep_assert_rq_held(rq);
> > 	else
> > 		lockdep_assert_rq_held(__this_cpu_read(locked_rq));
>
> Hm... but if the same CPU invokes two "unlocked" callbacks in a row,
> locked_rq would be NULL during the second call and we would check rq_held
> against NULL.

Current usage in SCX_CALL_OP*() does not seem to generate this pattern.
The updates always come as an rq, NULL pair, in that order.

Ooh, there are a few SCX_CALL_OP*() instances where rq is NULL, which
messes this up.

Changing them to:

	if (rq)
		update_locked_rq(rq);
	...
	if (rq)
		update_locked_rq(NULL);

might help.
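
To make the effect of that guard concrete, here is a minimal userspace
sketch. It is not the kernel code: struct rq is reduced to a stub, the
per-CPU locked_rq is a plain static (a single "CPU" is assumed), and
call_op(), record_op(), nr_updates and seen_by_op are hypothetical names
invented for the illustration. It only shows that, with the guard, a
call made with rq == NULL never reaches update_locked_rq(), while a call
with an rq sets the tracked pointer and clears it again afterwards.

```c
#include <stddef.h>

/* Stub standing in for the kernel's struct rq (assumption). */
struct rq { int cpu; };

/* Stand-in for the per-CPU locked_rq; one "CPU" only (assumption). */
static struct rq *locked_rq;

/* Counts how often the tracking state is written (illustration only). */
static int nr_updates;

static void update_locked_rq(struct rq *rq)
{
	locked_rq = rq;
	nr_updates++;
}

typedef void (*op_fn)(void);

/*
 * Hypothetical stand-in for an SCX_CALL_OP*() style wrapper: the
 * tracking state is only touched when the caller passes an rq.
 */
static void call_op(struct rq *rq, op_fn op)
{
	if (rq)
		update_locked_rq(rq);
	op();
	if (rq)
		update_locked_rq(NULL);
}

/* Records what the callback observed as the tracked rq. */
static struct rq *seen_by_op;

static void record_op(void)
{
	seen_by_op = locked_rq;
}
```

With this shape, two "unlocked" calls in a row (rq == NULL) leave
nr_updates at zero, so nothing is written to the tracking variable on
that path. Whether that is sufficient for the real call sites is a
separate question that this sketch does not answer.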