Re: [PATCH RT] sched: migrate_enable: Busy loop until the migration request is completed

From: Scott Wood
Date: Wed Jan 22 2020 - 16:13:44 EST


On Fri, 2019-12-13 at 09:14 +0100, Sebastian Andrzej Siewior wrote:
> On 2019-12-13 00:44:22 [-0600], Scott Wood wrote:
> > > @@ -8239,7 +8239,10 @@ void migrate_enable(void)
> > >  		stop_one_cpu_nowait(task_cpu(p), migration_cpu_stop,
> > >  				    &arg, &work);
> > >  		__schedule(true);
> > > -		WARN_ON_ONCE(!arg.done && !work.disabled);
> > > +		if (!work.disabled) {
> > > +			while (!arg.done)
> > > +				cpu_relax();
> > > +		}
> >
> > We should enable preemption while spinning -- besides the general badness
> > of spinning with it disabled, there could be deadlock scenarios if
> > multiple CPUs are spinning in such a loop. Long term maybe have a way to
> > dequeue the no-longer-needed work instead of waiting.
>
> Hmm. My plan was to use per-CPU memory and spin before the request is
> enqueued if the previous isn't done yet (which should not happen).

Either it can't happen (and thus no need to spin) or it can, and we need to
worry about deadlocks if we're spinning with preemption disabled. In fact a
deadlock is guaranteed if we're spinning with preemption disabled on the cpu
that's supposed to be running the stopper we're waiting on.

I think you're right that it can't happen though (as long as we queue it
before enabling preemption, the stopper will be runnable and nothing else
can run on the cpu before the queue gets drained), so we can just make it a
warning. I'm testing a patch now.
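
To make the argument concrete, here is a rough userspace toy model of a
single CPU. It is only a sketch, not kernel code and not the patch under
test: every name in it (toy_cpu, toy_work, toy_migrate_enable, and so on)
is hypothetical, and it assumes the stopper is the first thing to run at
the reschedule point once preemption is enabled again.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model of one CPU; none of these are kernel APIs. */
struct toy_work {
	bool done;
};

struct toy_cpu {
	int preempt_count;		/* > 0 means preemption is disabled */
	struct toy_work *queued;	/* at most one pending stopper request */
};

/* Assumption: the stopper outranks everything else, so it runs first
 * at a reschedule point and drains its queue. */
static void toy_run_stopper(struct toy_cpu *cpu)
{
	if (cpu->queued) {
		cpu->queued->done = true;
		cpu->queued = NULL;
	}
}

static void toy_preempt_disable(struct toy_cpu *cpu)
{
	cpu->preempt_count++;
}

static void toy_preempt_enable(struct toy_cpu *cpu)
{
	if (--cpu->preempt_count == 0)
		toy_run_stopper(cpu);	/* reschedule point */
}

/* Queue the request before enabling preemption. Returns true if the
 * request has completed by the time we get control back, i.e. a
 * warning-style check on !done would stay silent. */
static bool toy_migrate_enable(struct toy_cpu *cpu, struct toy_work *work)
{
	toy_preempt_disable(cpu);
	work->done = false;
	cpu->queued = work;
	toy_preempt_enable(cpu);	/* stopper runs here */
	return work->done;
}

/* Bounded stand-in for spinning with preemption disabled on the CPU
 * that must run the stopper: the stopper never gets a chance, so the
 * loop can only end by hitting max_iters. Returns whether the request
 * completed while we were spinning. */
static bool toy_spin_preempt_disabled(struct toy_cpu *cpu,
				      struct toy_work *work, int max_iters)
{
	bool done;
	int i;

	toy_preempt_disable(cpu);
	work->done = false;
	cpu->queued = work;
	for (i = 0; i < max_iters && !work->done; i++)
		;			/* stand-in for cpu_relax() */
	done = work->done;
	toy_preempt_enable(cpu);	/* only now can the stopper run */
	return done;
}
```

In this model toy_migrate_enable() always sees the request completed, so
there is nothing to spin on, while toy_spin_preempt_disabled() never does
until it gives up and enables preemption. The model covers only the
single-CPU ordering argument, not the real scheduler.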

> Then we could remove __schedule() here and rely on preempt_enable()
> doing that.

We could do that regardless.

-Scott