Re: Soft lockup during suspend since ~2.6.36

From: Thilo-Alexander Ginkel
Date: Mon Apr 04 2011 - 11:03:29 EST


On Mon, Apr 4, 2011 at 16:40, Arnd Bergmann <arnd@xxxxxxxx> wrote:
> On Monday 04 April 2011, Thilo-Alexander Ginkel wrote:
>> The result is available in these pictures:
>>   https://secure.tgbyte.de/dropbox/IeZalo4t-1.jpg
>>   https://secure.tgbyte.de/dropbox/IeZalo4t-2.jpg
>>
>> For both traces, the printed error message reads: "BUG: soft lockup -
>> CPU#3 stuck for 67s! [kblockd:28]"
>>
>> (After a bit of Googling I understand that a soft lockup is probably
>> different from a deadlock - please correct me if that assumption is
>> wrong)
>
> My interpretation is that some process tries to use
> kblockd_schedule_work() after the CPU for that workqueue has been
> disabled. The workqueue function (worker_maybe_bind_and_lock)
> is waiting for the CPU to become available, which never happens.

Thanks for your help so far!

Is there a way to figure out which process that may be?

> You see different outputs every time the softlockup detection finds
> this because the loop is in different states here. The reason why
> the spin_unlock shows up here is because that is when the interrupts
> get enabled and the softlockup detection notices the timeout.

OK, that makes sense.
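
To check my understanding, here is a user-space toy model of the
detection as described above. It is only a sketch and not the kernel's
code: the names toy_watchdog, toy_touch and toy_check are made up, and
the 60 second threshold in the usage note is an assumption on my side.
The point it tries to capture is that a stall is only reported at a
moment when interrupts are enabled, which would explain why the trace
points at the spin_unlock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy model; these are not the kernel's data structures. */
struct toy_watchdog {
	uint64_t last_touch; /* time (s) the CPU was last seen making progress */
	uint64_t threshold;  /* seconds of silence tolerated before reporting */
};

/* Record that the CPU made progress at time 'now'. */
static void toy_touch(struct toy_watchdog *wd, uint64_t now)
{
	assert(wd);
	wd->last_touch = now;
}

/*
 * Return the number of seconds the CPU has been stuck, or 0 if nothing
 * is reported. The check is modelled as running from a timer interrupt,
 * so it cannot report anything while interrupts are disabled.
 */
static uint64_t toy_check(const struct toy_watchdog *wd, uint64_t now,
			  bool irqs_enabled)
{
	uint64_t stalled;

	assert(wd);
	if (!irqs_enabled)
		return 0; /* timer cannot fire; the stall goes unnoticed for now */

	stalled = now - wd->last_touch;
	return stalled > wd->threshold ? stalled : 0;
}
```

With a threshold of 60 and last_touch at 0, toy_check() at time 67
reports nothing while interrupts are off and reports 67 as soon as they
are enabled again. After a toy_touch() the stall is no longer reported.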

> I'm pretty sure that this has nothing to do with the bisected bug
> that you initially found, but maybe somebody else can try analysing
> this better.

ACK. I see two possibilities:
a) The bug was introduced after the bisected bug was fixed
b) The bug was already present earlier, but was masked by the bug from
the bisected change

I hope for a) as that would open the possibility to bisect this new bug.

Regards,
Thilo