Re: [RFC][PATCH 5/5] sched: Add ttwu_queue support for delayed tasks
From: Peter Zijlstra
Date: Fri Jun 13 2025 - 06:47:04 EST

On Fri, Jun 13, 2025 at 11:51:19AM +0200, Peter Zijlstra wrote:
> On Fri, Jun 13, 2025 at 09:34:22AM +0200, Dietmar Eggemann wrote:
> > On 20/05/2025 11:45, Peter Zijlstra wrote:
> >
> > [...]
> >
> > > @@ -3830,12 +3859,41 @@ void sched_ttwu_pending(void *arg)
> > >  	update_rq_clock(rq);
> > >
> > >  	llist_for_each_entry_safe(p, t, llist, wake_entry.llist) {
> > > +		struct rq *p_rq = task_rq(p);
> > > +		int ret;
> > > +
> > > +		/*
> > > +		 * This is the ttwu_runnable() case. Notably it is possible for
> > > +		 * on-rq entities to get migrated -- even sched_delayed ones.
> > > +		 */
> > > +		if (unlikely(p_rq != rq)) {
> > > +			rq_unlock(rq, &rf);
> > > +			p_rq = __task_rq_lock(p, &rf);
> >
> > I always get this fairly early with TTWU_QUEUE_DELAYED enabled, related
> > to p->pi_lock not held in wakeup from interrupt.
> >
> > [ 36.175285] WARNING: CPU: 0 PID: 162 at kernel/sched/core.c:679 __task_rq_lock+0xf8/0x128
>
> Thanks, let me go have a look.

I'm thinking this should cure things.

--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -677,7 +677,12 @@ struct rq *__task_rq_lock(struct task_st
 {
 	struct rq *rq;
 
-	lockdep_assert_held(&p->pi_lock);
+	/*
+	 * TASK_WAKING is used to serialize the remote end of wakeup, rather
+	 * than p->pi_lock.
+	 */
+	lockdep_assert(p->__state == TASK_WAKING ||
+		       lockdep_is_held(&p->pi_lock) != LOCK_STATE_NOT_HELD);
 
 	for (;;) {
 		rq = task_rq(p);
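
For illustration only, the relaxed condition in the hunk above can be modelled in plain userspace C. This is a sketch, not kernel code: struct model_task, the MODEL_* values and model_rq_lock_allowed() are hypothetical names invented here, and the pi_lock_held flag merely stands in for what lockdep_is_held(&p->pi_lock) would report.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for task states; the values are arbitrary. */
#define MODEL_TASK_RUNNING	0u
#define MODEL_TASK_WAKING	1u

/* Hypothetical model of the two facts the assertion looks at. */
struct model_task {
	unsigned int state;	/* stands in for p->__state */
	bool pi_lock_held;	/* stands in for lockdep_is_held(&p->pi_lock) */
};

/*
 * Returns true when the relaxed assertion would pass: either the task
 * is in the waking state, or the caller holds the task's pi_lock.
 */
static bool model_rq_lock_allowed(const struct model_task *p)
{
	return p->state == MODEL_TASK_WAKING || p->pi_lock_held;
}

/* Walks the three interesting combinations. */
static void model_self_check(void)
{
	struct model_task remote_wakeup = { MODEL_TASK_WAKING, false };
	struct model_task locked_caller = { MODEL_TASK_RUNNING, true };
	struct model_task unserialized = { MODEL_TASK_RUNNING, false };

	assert(model_rq_lock_allowed(&remote_wakeup));	/* new case */
	assert(model_rq_lock_allowed(&locked_caller));	/* old case */
	assert(!model_rq_lock_allowed(&unserialized));	/* still warns */
}
```

The first case corresponds to the wakeup-from-interrupt path in the report, where p->pi_lock is not held; the last one is the combination the assertion is still meant to catch.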