Re: [PATCH v4 3/3] sched: optimize migration by forcing rmb() and updating to be called once

From: Peter Zijlstra
Date: Tue Nov 10 2015 - 07:17:08 EST


On Tue, Nov 10, 2015 at 10:09:05AM +0900, Byungchul Park wrote:
> On Mon, Nov 09, 2015 at 02:29:14PM +0100, Peter Zijlstra wrote:
> > On Sat, Oct 24, 2015 at 01:16:21AM +0900, byungchul.park@xxxxxxx wrote:
> > > +++ b/kernel/sched/core.c
> > > @@ -1264,6 +1264,8 @@ EXPORT_SYMBOL_GPL(set_cpus_allowed_ptr);
> > >
> > >  void set_task_cpu(struct task_struct *p, unsigned int new_cpu)
> > >  {
> > > +	unsigned int prev_cpu = task_cpu(p);
> > > +
> > >  #ifdef CONFIG_SCHED_DEBUG
> > >  	/*
> > >  	 * We should never call set_task_cpu() on a blocked task,
> > > @@ -1289,15 +1291,14 @@ void set_task_cpu(struct task_struct *p, unsigned int new_cpu)
> > >  #endif
> > >
> > >  	trace_sched_migrate_task(p, new_cpu);
> > > +	__set_task_cpu(p, new_cpu);
> > >
> > > -	if (task_cpu(p) != new_cpu) {
> > > +	if (prev_cpu != new_cpu) {
> > >  		if (p->sched_class->migrate_task_rq)
> > > -			p->sched_class->migrate_task_rq(p, new_cpu);
> > > +			p->sched_class->migrate_task_rq(p, prev_cpu);
> > >  		p->se.nr_migrations++;
> > >  		perf_event_task_migrate(p);
> > >  	}
> > > -
> > > -	__set_task_cpu(p, new_cpu);
> > >  }
> >
> > I don't think this is safe, see the comment in __set_task_cpu(). We want
> > that to be last.
>
> I am sorry but I don't understand what you said. I checked the comment in
> __set_task_cpu().
>
> /*
> * After ->cpu is set up to a new value, task_rq_lock(p, ...) can be
> * successfully executed on another CPU. We must ensure that updates of
> * per-task data have been completed by this moment.
> */
>
> Of course, ->cpu has to be set to the new value before task_rq_lock() can
> succeed on another CPU. Is that the case you mean? Is there something
> I missed? I think it would be OK if task->pi_lock works correctly within
> the "if" statement in set_task_cpu(). Is there a problem with doing that?

So the problem is that as soon as that ->cpu store becomes visible,
another CPU can take the other rq->lock for this task, even though we
might still hold our rq->lock thinking we're serialized.
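
To make the "other rq->lock" side concrete, here is a minimal user-space
sketch of a task_rq_lock()-style retry loop. It is a model, not the kernel
code: struct task, struct rq, runqueues[] and the *_sketch names are
hypothetical, an atomic_flag spinlock stands in for rq->lock, and the
p->pi_lock and TASK_ON_RQ_MIGRATING handling is left out. The point is only
that the locker picks its lock by reading ->cpu, so whatever ->cpu says
decides which rq->lock serializes the task.

```c
#include <assert.h>
#include <stdatomic.h>

#define NR_CPUS 2

/* Hypothetical stand-ins for the kernel's struct rq / struct task_struct. */
struct rq { atomic_flag lock; };
struct task { atomic_uint cpu; };

static struct rq runqueues[NR_CPUS] = {
	{ ATOMIC_FLAG_INIT }, { ATOMIC_FLAG_INIT },
};

/* Toy spinlock standing in for raw_spin_lock(&rq->lock). */
static void rq_lock(struct rq *rq)
{
	while (atomic_flag_test_and_set_explicit(&rq->lock,
						 memory_order_acquire))
		;	/* spin until the holder clears the flag */
}

static void rq_unlock(struct rq *rq)
{
	atomic_flag_clear_explicit(&rq->lock, memory_order_release);
}

/*
 * Lock the runqueue that p->cpu currently names. If p->cpu changed
 * between reading it and taking the lock, drop the lock and retry.
 */
static struct rq *task_rq_lock_sketch(struct task *p)
{
	for (;;) {
		unsigned int cpu = atomic_load_explicit(&p->cpu,
							memory_order_acquire);
		struct rq *rq = &runqueues[cpu];

		rq_lock(rq);
		if (cpu == atomic_load_explicit(&p->cpu,
						memory_order_acquire))
			return rq;	/* p->cpu still names this rq */
		rq_unlock(rq);
	}
}
```

In this model, once ->cpu names the new CPU the loop succeeds on the new
runqueue's lock without ever touching the old one, which is why the old
rq->lock stops protecting the task at that point.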

Take move_queued_task() for instance; it does:

	dequeue_task(rq, p, 0);
	p->on_rq = TASK_ON_RQ_MIGRATING;
	set_task_cpu(p, new_cpu) {
		__set_task_cpu();

		^^^ here holding rq->lock is insufficient and the below:

		p->sched_class->migrate_task_rq()

		would no longer be serialized by rq->lock.

	}
	raw_spin_unlock(&rq->lock);
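
The ordering I want to keep can be sketched in user-space C as below. This
is only an illustration under assumptions: struct task, migrate_target and
the *_sketch functions are hypothetical, and a C11 release store stands in
for the write barrier followed by the ->cpu store in __set_task_cpu(). All
per-task updates happen first, and the ->cpu store that lets another CPU
pick the new rq->lock comes last.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the fields of struct task_struct used here. */
struct task {
	atomic_uint cpu;		/* read by other CPUs to pick an rq->lock */
	unsigned int nr_migrations;	/* per-task data updated on migration */
	unsigned int migrate_target;	/* hypothetical: written by the hook */
};

/* Stand-in for p->sched_class->migrate_task_rq(): touches per-task data. */
static void migrate_task_rq_sketch(struct task *p, unsigned int new_cpu)
{
	p->migrate_target = new_cpu;
}

static void set_task_cpu_sketch(struct task *p, unsigned int new_cpu)
{
	unsigned int cur = atomic_load_explicit(&p->cpu, memory_order_relaxed);

	/* ->cpu still names the old CPU here, so the old rq->lock covers us. */
	if (cur != new_cpu) {
		migrate_task_rq_sketch(p, new_cpu);
		p->nr_migrations++;
	}

	/*
	 * Publish last. The release ordering makes the updates above
	 * visible to anyone who observes the new ->cpu value.
	 */
	atomic_store_explicit(&p->cpu, new_cpu, memory_order_release);
}
```

Moving the store above the "if" block, as the patch does, would let a CPU
that observes the new ->cpu run concurrently with the migrate hook.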

