Re: irq lock inversion

From: Tejun Heo
Date: Fri Nov 06 2009 - 02:47:23 EST

Ingo Molnar wrote:
>>> This warning is bogus -- sched_init() is being called very early with IRQs
>>> disabled, and the irqsave/restore code paths in pcpu_alloc() are only for early
>>> init. The path can never be called from irq context once the early init
>>> finishes. Rationale for this is explained in changelog of the commit mentioned
>>> above.
>>> This problem can be encountered generally in any other early code running
>>> with IRQs off and using irqsave/irqrestore.
>>> Reported-by: Yinghai Lu <yhlu.kernel@xxxxxxxxx>
>>> Signed-off-by: Jiri Kosina <jkosina@xxxxxxx>
>> Looks good to me. Ingo, what do you think?
> Ugh, this explanation is _BOGUS_. As i said, taking a lock with irqs
> disabled does _NOT_ mark a lock as 'irq safe' - if it did, we'd have
> false positives left and right.
> Read the lockdep message please, consider all the backtraces it prints,
> it says something different.

Ah... okay, the pcpu_free() path is correctly marking the lock
irqsafe. I assumed this was caused by the recent pcpu_alloc() change.
Sorry about that. The lock inversion problem has always been there;
it just never showed up because no one has used an allocation map that
large, I suppose.

So, the correct fix would be either 1. push irqsafeness down to the
vmalloc locks or 2. the rather ugly unlock-lock dance in
pcpu_extend_area_map() I posted earlier. For 2.6.32, I guess we'll
have to go with #2. In the longer term, we'll probably have to do #1, as
it's required to implement atomic percpu allocations too.
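
For readers unfamiliar with the pattern behind option #2, here is a
minimal userspace sketch of an unlock-lock dance. It is not the patch
referred to above and not kernel code: struct area_map,
area_map_init() and extend_area_map() are hypothetical names, a
pthread mutex stands in for the irq-safe percpu lock, and
calloc()/free() stand in for the allocator whose locks must not be
taken while that lock is held.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a chunk's allocation map (not the kernel struct). */
struct area_map {
	pthread_mutex_t lock;	/* stands in for the irq-safe lock */
	int *map;		/* map entries */
	size_t map_alloc;	/* entries allocated */
	size_t map_used;	/* entries in use */
};

static void area_map_init(struct area_map *m)
{
	pthread_mutex_init(&m->lock, NULL);
	m->map = NULL;
	m->map_alloc = 0;
	m->map_used = 0;
}

/*
 * Grow the map to hold at least @needed entries.
 * Called with m->lock held and returns with m->lock held, but the lock
 * is dropped around every call into the allocator.
 * Returns 0 on success, -1 if the allocation fails.
 */
static int extend_area_map(struct area_map *m, size_t needed)
{
	for (;;) {
		size_t new_alloc;
		int *new_map, *stale;

		/* Re-evaluated after every relock: the map may have changed. */
		if (m->map_alloc >= needed)
			return 0;

		new_alloc = m->map_alloc ? m->map_alloc : 16;
		while (new_alloc < needed)
			new_alloc *= 2;

		/* Drop the lock so the allocator never runs under it. */
		pthread_mutex_unlock(&m->lock);
		new_map = calloc(new_alloc, sizeof(*new_map));
		pthread_mutex_lock(&m->lock);

		if (!new_map)
			return -1;

		if (new_alloc > m->map_alloc) {
			/* Still the larger buffer: copy entries and install it. */
			if (m->map_used)
				memcpy(new_map, m->map,
				       m->map_used * sizeof(*new_map));
			stale = m->map;
			m->map = new_map;
			m->map_alloc = new_alloc;
		} else {
			/* Someone else grew the map while the lock was dropped. */
			stale = new_map;
		}

		/* Freeing also goes through the allocator, so unlock again. */
		pthread_mutex_unlock(&m->lock);
		free(stale);
		pthread_mutex_lock(&m->lock);
	}
}
```

The point of the sketch is the recheck after each relock: once the
lock has been dropped, nothing observed earlier can be trusted, so
the size test is repeated before the function returns. That extra
bookkeeping is what makes the approach ugly compared with making the
underlying allocator locks irq-safe.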

I'll try to reproduce the problem here and verify the previous locking
dance patch.

