Re: [ANNOUNCE] 3.0.1-rt11

From: Thomas Gleixner
Date: Wed Sep 07 2011 - 12:32:51 EST


On Wed, 7 Sep 2011, Russell King - ARM Linux wrote:

> On Wed, Sep 07, 2011 at 12:57:44PM +0200, Thomas Gleixner wrote:
> > The problem is that if you enable interrupts on the CPU _BEFORE_ it is
> > set online AND active, then you can end up waking up kernel threads
> > which are bound to that CPU and the scheduler will happily schedule
> > them on an online CPU. That makes them lose the cpu affinity to the
> > CPU as well and hell breaks loose.
>
> How can that happen?
>
> 1. The only interrupts we're likely to receive are the local timer
> interrupts - we have not routed any other interrupts to this CPU.

Fair enough, on x86 this can happen when we enable interrupts.

> 2. We will not schedule on this CPU except at explicit scheduling
> points (such as contended mutexes or explicit calls to schedule)
> as we have a call to preempt_disable().

Right, you don't schedule. But a wakeup of a thread which has its
affinity set to the new online CPU runs (as Frank pointed out)
through:

    wake_up_process()
      try_to_wake_up()
        select_task_rq()
          if (... || !cpu_online(cpu))
            select_fallback_rq(task_cpu(p), p)
              ...
              /* No more Mr. Nice Guy. */
              dest_cpu = cpuset_cpus_allowed_fallback(p)
                do_set_cpus_allowed(p, cpu_possible_mask)
                # Thus ksoftirqd can now run on any cpu...

So the problem is not scheduling, it's the wakeup code. Sorry for
being imprecise.
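
To make the path above concrete, here is a rough user-space model of
it. This is only a sketch, not kernel code: every toy_* name, the
4-CPU limit and the bitmask representation are assumptions made up
for illustration, and the real select_fallback_rq() does more than
this. It only shows how a wakeup that targets a not-yet-online CPU
ends up widening the task's affinity mask.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_CPUS	4
#define TOY_ALL_CPUS	((1u << TOY_NR_CPUS) - 1)

/* Toy stand-in for cpu_online_mask: bit n set means CPU n is online. */
static unsigned int toy_online_mask = 0x1;	/* only CPU 0 is online */

struct toy_task {
	unsigned int cpus_allowed;	/* affinity mask, bit n = CPU n */
	int cpu;			/* CPU the task is currently assigned to */
};

static bool toy_cpu_online(int cpu)
{
	return (toy_online_mask >> cpu) & 1u;
}

/* First CPU that is both in @mask and online, or -1 if there is none. */
static int toy_first_online_in(unsigned int mask)
{
	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		if (((mask >> cpu) & 1u) && toy_cpu_online(cpu))
			return cpu;
	return -1;
}

/* Simplified model of the fallback: widen the affinity if nothing fits. */
static int toy_select_fallback_rq(struct toy_task *p)
{
	int cpu = toy_first_online_in(p->cpus_allowed);

	if (cpu >= 0)
		return cpu;

	/* No allowed CPU is online, so the task may now run anywhere. */
	p->cpus_allowed = TOY_ALL_CPUS;
	return toy_first_online_in(p->cpus_allowed);
}

static int toy_select_task_rq(struct toy_task *p)
{
	int cpu = p->cpu;

	assert(cpu >= 0 && cpu < TOY_NR_CPUS);
	if (!((p->cpus_allowed >> cpu) & 1u) || !toy_cpu_online(cpu))
		cpu = toy_select_fallback_rq(p);
	return cpu;
}

/* Models a wakeup: pick a CPU and assign the task to it. */
static int toy_wake_up(struct toy_task *p)
{
	p->cpu = toy_select_task_rq(p);
	return p->cpu;
}
```

In this model, waking a task bound to CPU 2 while only CPU 0 is
online returns 0 and leaves cpus_allowed set to all CPUs. Setting bit
2 in toy_online_mask before the wakeup keeps the task on CPU 2 with
its mask untouched.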

We can't do anything about it in the scheduler code, so we have to
make sure that the cpu startup code enables interrupts after the
online AND active bits have been set.
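
The ordering requirement can be sketched the same way. Again this is
a made-up user-space model under stated assumptions, not the actual
startup code of any architecture: the toy_* names are hypothetical,
and it assumes a timer interrupt is already pending and fires the
moment interrupts are enabled, with its handler waking a CPU bound
thread.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_cpu {
	bool online;
	bool active;
	bool irqs_enabled;
};

struct toy_thread {
	bool affinity_intact;	/* false once the wakeup took the fallback */
};

/* Timer interrupt handler: wakes the thread bound to this CPU. */
static void toy_timer_irq(const struct toy_cpu *cpu, struct toy_thread *t)
{
	/* An interrupt can only be delivered with interrupts enabled. */
	assert(cpu->irqs_enabled);

	/* Wakeup on a CPU that is not online AND active hits the fallback. */
	if (!(cpu->online && cpu->active))
		t->affinity_intact = false;
}

/* Assumption: a pending timer interrupt fires right after enabling. */
static void toy_local_irq_enable(struct toy_cpu *cpu, struct toy_thread *t)
{
	cpu->irqs_enabled = true;
	toy_timer_irq(cpu, t);
}

enum toy_order { TOY_IRQS_FIRST, TOY_ONLINE_FIRST };

/* Returns true if the bound thread kept its affinity during bringup. */
static bool toy_bringup(enum toy_order order)
{
	struct toy_cpu cpu = { false, false, false };
	struct toy_thread ksoftirqd = { .affinity_intact = true };

	if (order == TOY_IRQS_FIRST) {
		toy_local_irq_enable(&cpu, &ksoftirqd);
		cpu.online = cpu.active = true;
	} else {
		cpu.online = cpu.active = true;
		toy_local_irq_enable(&cpu, &ksoftirqd);
	}
	return ksoftirqd.affinity_intact;
}
```

Here toy_bringup(TOY_ONLINE_FIRST) returns true and
toy_bringup(TOY_IRQS_FIRST) returns false, which is the whole point:
the only difference between the two paths is the order of the two
steps.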

> > Frank has observed this with softirq threads, but the same thing is
> > true for any other CPU bound thread like the worker stuff.
>
> So who is scheduling a workqueue from the local timer?

The problem is timer callbacks, which might be executed in the softirq
code on return from interrupt. We had one case observed on x86 where
an expired timer was queued on the about-to-go-online CPU and the
callback scheduled work on that CPU, which then caused the CPU-affine
worker thread to move away :(

> > So moving the online, active thing BEFORE enabling interrupts is the
> > only sensible solution.
>
> Yes, that'll be why even x86 enables interrupts before setting the CPU
> online for the delay calibration.

Correct.

Thanks,

tglx