Re: start_kernel(): bug: interrupts were enabled early

From: Andrew Morton
Date: Wed Mar 31 2010 - 17:29:33 EST


On Wed, 31 Mar 2010 14:12:54 -0700
"H. Peter Anvin" <hpa@xxxxxxxxx> wrote:

> On 03/31/2010 01:52 PM, Andrew Morton wrote:
> > On Wed, 31 Mar 2010 13:47:23 -0700
> > Yinghai Lu <yinghai@xxxxxxxxxx> wrote:
> >
> >> spin_unlock_irq from arm is different from other archs?
> >
> > No, spin_unlock_irq() unconditionally enables interrupts on all
> > architectures.
>
> So I found checkin 60ba96e546da45d9e22bb04b84971a25684e4d46 in the
> bk-historic git tree:
>
> [PATCH] rwsem: Make rwsems use interrupt disabling spinlocks
>
> The attached patch makes read/write semaphores use interrupt disabling
> spinlocks in the slow path, thus rendering the up functions and trylock
> functions available for use in interrupt context. This matches the
> regular semaphore behaviour.
>
> I've assumed that the normal down functions must be called with
> interrupts enabled (since they might schedule), and used the
> irq-disabling spinlock variants that don't save the flags.
>
> Signed-Off-By: David Howells <dhowells@xxxxxxxxxx>
> Tested-by: Badari Pulavarty <pbadari@xxxxxxxxxx>
> Signed-off-by: Linus Torvalds <torvalds@xxxxxxxx>
>
> What we have here is a case of this assumption being violated, because
> the lock is taken with interrupts disabled on a path where contention
> cannot happen (because the code is single-threaded at this point), but
> the lock is taken due to reuse of generic code.
>
> The obvious way to fix this would be to use
> spin_lock_irqsave..spin_unlock_irqrestore in __down_read as well as in the
> other locations; I don't have a good feel for what the cost of doing so
> would be, though. On x86 it's fairly expensive simply because the only
> way to save the state is to push it on the stack, which the compiler
> doesn't deal well with, but this code isn't used on x86.
>

Well, it's all a bit nasty. kmem_cache_create() does a lot of stuff,
including calling into the page allocator with GFP_KERNEL - expecting
kmem_cache_create() to preserve local_irq_disable() is a bit optimistic.

radix_tree_init() calls hotcpu_notifier() which also does
mutex_lock(&cpu_add_remove_lock);

The easiest fix is to reposition the interrupts-are-now-enabled point
in start_kernel(). But I have a feeling that some versions of
early_irq_init() won't like that.
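
For anyone following along, here is a rough userspace sketch of why the
unlock variant matters on this path. It is only a model, not kernel
code: a single bool stands in for the CPU interrupt-enable state, and
every model_* name and both irqs_after_* helpers are made up for
illustration. It assumes the uncontended, single-threaded case described
above, entered with interrupts disabled.

```c
#include <assert.h>
#include <stdbool.h>

/* Model only: one flag stands in for the CPU's interrupt-enable state. */
static bool irqs_enabled = false;

/* Hypothetical stand-in for a spinlock; it only tracks ownership. */
struct model_lock {
	bool held;
};

/* Models spin_lock_irq(): disable interrupts, do not remember the old state. */
static void model_lock_irq(struct model_lock *l)
{
	irqs_enabled = false;
	assert(!l->held);	/* single-threaded model: never contended */
	l->held = true;
}

/* Models spin_unlock_irq(): interrupts are enabled unconditionally. */
static void model_unlock_irq(struct model_lock *l)
{
	l->held = false;
	irqs_enabled = true;
}

/* Models spin_lock_irqsave(): remember the old state, then disable. */
static unsigned long model_lock_irqsave(struct model_lock *l)
{
	unsigned long flags = irqs_enabled ? 1UL : 0UL;

	irqs_enabled = false;
	assert(!l->held);
	l->held = true;
	return flags;
}

/* Models spin_unlock_irqrestore(): put back whatever state was saved. */
static void model_unlock_irqrestore(struct model_lock *l, unsigned long flags)
{
	l->held = false;
	irqs_enabled = (flags != 0);
}

/* Early-boot caller, interrupts off, slow path using the _irq pair. */
static bool irqs_after_plain_variant(void)
{
	struct model_lock l = { false };

	irqs_enabled = false;
	model_lock_irq(&l);
	model_unlock_irq(&l);
	return irqs_enabled;
}

/* Same caller, slow path using the irqsave/irqrestore pair. */
static bool irqs_after_irqsave_variant(void)
{
	struct model_lock l = { false };
	unsigned long flags;

	irqs_enabled = false;
	flags = model_lock_irqsave(&l);
	model_unlock_irqrestore(&l, flags);
	return irqs_enabled;
}
```

In this model the _irq pair hands control back with the flag set even
though the caller entered with it clear, which is the condition the
start_kernel() warning reports. The irqsave pair returns with the flag
still clear. The sketch says nothing about the cost of saving flags on
any real architecture.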

