Re: [BUG] "sched: Remove rq->lock from the first half of ttwu()" locks up on ARM

From: Ingo Molnar
Date: Fri May 27 2011 - 08:06:53 EST

In reply to: Catalin Marinas: "Re: [BUG] "sched: Remove rq->lock from the first half of ttwu()" locks up on ARM"
Next in thread: Russell King - ARM Linux: "Re: [BUG] "sched: Remove rq->lock from the first half of ttwu()" locks up on ARM"

* Catalin Marinas <catalin.marinas@xxxxxxx> wrote:

> > How much time does that take on contemporary ARM hardware,
> > typically (and worst-case)?
>
> On newer ARMv6 and ARMv7 hardware, we no longer flush the caches at
> context switch as we got VIPT (or PIPT-like) caches.
>
> But modern ARM processors use something called ASID to tag the TLB
> entries and we are limited to 256. The switch_mm() code checks for
> whether we ran out of them to restart the counting. This ASID
> roll-over event needs to be broadcast to the other CPUs and issuing
> IPIs with the IRQs disabled isn't always safe. Of course, we could
> briefly re-enable them at the ASID roll-over time but I'm not sure
> what the expectations of the code calling switch_mm() are.
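
As a rough illustration of the roll-over described above, here is a
small user-space model. It is only a sketch under assumptions, not the
actual arch/arm code: the names mm_model, last_ctx, ctx_is_stale() and
assign_ctx() are made up for the example, and it assumes a context
value that keeps an 8-bit ASID in the low bits and a generation count
above it, with ASID 0 never handed out.

```c
#include <stdbool.h>
#include <stdint.h>

#define ASID_BITS 8                        /* 256 ASIDs, per the quoted text */
#define ASID_MASK ((1u << ASID_BITS) - 1)

/* Last context value handed out: generation 1, ASID 0 (hypothetical layout). */
static uint32_t last_ctx = 1u << ASID_BITS;

struct mm_model {
	uint32_t ctx;                      /* 0 means "never assigned" */
};

/* An mm is stale if its generation differs from the current one. */
static bool ctx_is_stale(const struct mm_model *mm)
{
	return ((mm->ctx ^ last_ctx) >> ASID_BITS) != 0;
}

/*
 * Hand out the next context value. Returns true when the ASID space
 * wrapped, which is the point where a real kernel would have to tell
 * the other CPUs to flush their TLBs.
 */
static bool assign_ctx(struct mm_model *mm)
{
	bool rolled = false;
	uint32_t next = last_ctx + 1;

	if ((next & ASID_MASK) == 0) {     /* carried into the generation bits */
		next++;                    /* skip ASID 0 in the new generation */
		rolled = true;
	}
	last_ctx = next;
	mm->ctx = next;
	return rolled;
}
```

In this model every mm that got its ASID before the wrap shows up as
stale afterwards, so it has to be given a new one before it runs again.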

The expectations are to have irqs off (we are holding the runqueue
lock if !__ARCH_WANT_INTERRUPTS_ON_CTXSW), so that's not workable, I
suspect.

But in theory we could drop the rq lock and restart the scheduler
task-pick and balancing sequence when the ARM TLB tag rolls over. So
instead of this fragile and asymmetric method we'd have a
straightforward retry-in-rare-cases method.

That means some modifications to switch_mm() but should be solvable.
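
Roughly, the retry flow could look like the user-space model below.
This is a sketch under assumptions, not a patch: rq_model,
switch_mm_model(), asid_rollover_model() and schedule_model() are
hypothetical names, the 'locked' flag stands in for "rq lock held,
irqs off", and the model reserves ASID 0 so 255 values are usable.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_ASIDS 256                 /* 8-bit ASID space */

struct rq_model {
	bool locked;                 /* stands in for rq lock held + irqs off */
	int  retries;                /* how often the pick had to be redone */
};

static unsigned int next_asid = 1;   /* hypothetical global allocator */
static unsigned int generation;      /* bumped on every roll-over */

/*
 * Fast path, called with the lock held. Returns false instead of
 * rolling over, because the roll-over needs IPIs and those are not
 * safe with irqs off.
 */
static bool switch_mm_model(struct rq_model *rq, unsigned int *asid)
{
	assert(rq->locked);
	if (next_asid == NR_ASIDS)
		return false;        /* signal 'retry task pickup' */
	*asid = next_asid++;
	return true;
}

/* Slow path, called with the lock dropped (irqs would be on here). */
static void asid_rollover_model(struct rq_model *rq)
{
	assert(!rq->locked);
	generation++;                /* real code would broadcast a TLB flush */
	next_asid = 1;
}

static unsigned int schedule_model(struct rq_model *rq)
{
	unsigned int asid;

	for (;;) {
		rq->locked = true;           /* take rq lock */
		/* ... pick next task, balance ... */
		if (switch_mm_model(rq, &asid))
			break;
		rq->locked = false;          /* drop rq lock */
		asid_rollover_model(rq);
		rq->retries++;               /* rare: redo the whole pick */
	}
	rq->locked = false;
	return asid;
}
```

The retry only triggers once per pass through the ASID space, so the
common case stays on the normal locked path.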

That would make ARM special only insofar as it's one of the few
architectures that signal 'retry task pickup' via switch_mm() - it
would use the stock scheduler otherwise and we could remove
__ARCH_WANT_INTERRUPTS_ON_CTXSW and perhaps even
__ARCH_WANT_UNLOCKED_CTXSW altogether.

I'd suggest doing this once modern ARM chips get so widespread that
you can realistically induce a ~700 usec irqs-off delay on old,
virtual-cache ARM chips. Old chips would likely use old kernels
anyway, right?

Thanks,

Ingo
