Re: [PATCH] sched,tracing: Correct trace_sched_pi_setprio() for deboosting

From: Sebastian Andrzej Siewior
Date: Thu May 24 2018 - 02:51:25 EST


On 2018-05-23 19:28:19 [+0200], Peter Zijlstra wrote:
> On Wed, May 23, 2018 at 04:11:07PM +0200, Sebastian Andrzej Siewior wrote:
>
> > Since that commit I see this when a task is deboosted:
> > |futex sched_pi_setprio: comm=futex_requeue_p pid=2234 oldprio=98 newprio=98
> > |futex sched_switch: prev_comm=futex_requeue_p prev_pid=2234 prev_prio=120
> >
> > and after the revert, the `newprio' shows the correct value again:
> >
> > |futex sched_pi_setprio: comm=futex_requeue_p pid=2220 oldprio=98 newprio=120
> > |futex sched_switch: prev_comm=futex_requeue_p prev_pid=2220 prev_prio=120
>
> > @@ -435,7 +435,7 @@ TRACE_EVENT(sched_pi_setprio,
> > memcpy(__entry->comm, tsk->comm, TASK_COMM_LEN);
> > __entry->pid = tsk->pid;
> > __entry->oldprio = tsk->prio;
> > - __entry->newprio = pi_task ? pi_task->prio : tsk->prio;
> > + __entry->newprio = new_prio;
> > /* XXX SCHED_DEADLINE bits missing */
> > ),
> >
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index 092f7c4de903..888df643b99b 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -3823,7 +3823,7 @@ void rt_mutex_setprio(struct task_struct *p, struct task_struct *pi_task)
> > goto out_unlock;
> > }
> >
> > - trace_sched_pi_setprio(p, pi_task);
> > + trace_sched_pi_setprio(p, prio);
>
> at this point:
>
> prio = pi_task ? min(p->normal_prio, pi_task->prio) : p->normal_prio;
>
> (aka __rt_effective_prio)
>
> Should we put that in the tracepoint instead?

I don't see the point in open-coding __rt_effective_prio() in the
tracepoint and recomputing a value we already have. I'm also a little
worried that if the computation of `prio' ever changes, the tracepoint
would miss it and we would only notice later while debugging.
However, if there are reasons to do it that way, such as not breaking
the trace API for $tools, I can update it.
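
To illustrate the difference, here is a minimal userspace sketch, not
kernel code. `struct task' is a hypothetical stand-in for task_struct
with only the two fields that matter here, and the helper names are made
up for the example. The first helper mirrors the expression quoted
above; the second mirrors what the tracepoint recorded before the patch.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct task_struct (illustration only). */
struct task {
	int prio;        /* current effective priority, lower value wins */
	int normal_prio; /* priority without any PI boost */
};

static int min_int(int a, int b)
{
	return a < b ? a : b;
}

/* Mirrors the quoted expression (aka __rt_effective_prio). */
static int effective_prio(const struct task *p, const struct task *pi_task)
{
	return pi_task ? min_int(p->normal_prio, pi_task->prio)
		       : p->normal_prio;
}

/* What the tracepoint stored in newprio before the patch. */
static int traced_newprio_before(const struct task *tsk,
				 const struct task *pi_task)
{
	return pi_task ? pi_task->prio : tsk->prio;
}
```

Assuming a task that is still boosted to prio=98 with normal_prio=120
(values taken from the trace above, normal_prio inferred from
prev_prio), a deboost passes pi_task == NULL: effective_prio() yields
120, while traced_newprio_before() yields 98, which matches the bogus
newprio=98 in the first trace.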

Sebastian