Re: Debugging Thinkpad T430s occasional suspend failure.

From: Linus Torvalds
Date: Fri Feb 15 2013 - 13:34:39 EST

On Fri, Feb 15, 2013 at 9:44 AM, Paul E. McKenney
<paulmck@xxxxxxxxxxxxxxxxxx> wrote:
> This commit was designed to increase the probability of hitting the
> races described in These races result
> in deadlocks involving the runqueue lock (and perhaps also the priority
> inheritance locks). And yes, I most certainly should have described
> this in the commit message. :-(

Ugh. That particular race seems to be because the softirq handling is
just crazy, and does the "wakeup_softirqd()" form interrupt context,

Because it claims to do it from softirq context, which is pure
garbage. It's not actually in softirq context.

The whole hardirq -> softirq transition seems stupid. I'm sure I made
some serious mistake in cleaning it up, and there's probably some
missed tracepoint (or perhaps screwed-up lockdep annotation), but I
think the hardirq -> softirq preempt thing shoudl be done as an atomic
preempt downgrade, so that we never have a window of "uhhuh, another
interrupt can come in between and see us as being in neither). And the
wakeup_softirqd should be done without playing with preempt count at

Something like this ENTIRELY UNTESTED patch.

Note: I doubt this patch affects Dave's issue at all, I just started
looking at that do_softirq code when I read your bug explanation.

Adding random people for kernel/softirq.c to the participants list.
Comments about the patch? Do note that it's entirely untested, so
consider it more a RFD than a real patch.. It looks like it adds a lot
of lines, but most of it is for comments and simplification of the

To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at
Please read the FAQ at