Re: [Bugme-new] [Bug 11543] New: kernel panic: softlockup in tick_periodic() ???

From: j_kernel
Date: Thu Sep 11 2008 - 22:55:20 EST


On Thu, Sep 11, 2008 at 05:02:58PM -0700, Andrew Morton wrote:
> Is this a regression? Was 2.6.26 OK, for example?

It might be a regression. ;) The last build we were running on this
hardware was 2.6.24.2, and NMI watchdog support was not enabled. We were,
however, experiencing random deadlocks, which I had been attributing to
problems with forcedeth.c (which causes the NIC to totally crap out
but does not deadlock the machine). I am now of the mind that there are
multiple problems with distinct failure modes.

> I can't work out who called panic(), nor why.

One more data point. We booted this kernel on 14 machines this morning
and only one has had this panic thus far...

> The panic code called the kexec code which called mutex_trylock() which
> called spin_lock_mutex() which then stupidly went and blurted a load of
> debug stuff because of in_interrupt().
>
> Something like this:
>
> --- a/include/linux/debug_locks.h~a
> +++ a/include/linux/debug_locks.h
> @@ -17,7 +17,7 @@ extern int debug_locks_off(void);
>  ({ \
>  	int __ret = 0; \
>  	\
> -	if (unlikely(c)) { \
> +	if (!oops_in_progress && unlikely(c)) { \
>  		if (debug_locks_off() && !debug_locks_silent) \
>  			WARN_ON(1); \
>  		__ret = 1; \
> _
>
> might prevent the debugging code from preventing us from finding bugs :(

Do you want me to give that patch a try or sit tight for a bit?
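
If I'm reading the change right, the check gets skipped entirely once an
oops or panic is under way, so the lock debugging can no longer bury the
original report. Below is a rough user-space sketch of that behaviour, not
the kernel code: the variables and the simplified debug_locks_off() are
stand-ins I made up for illustration (the real one does more), and a
counter takes the place of WARN_ON(1).

```c
#include <assert.h>

/* Hypothetical stand-ins for kernel state, for illustration only. */
static int oops_in_progress;    /* nonzero while an oops/panic is being handled */
static int debug_locks = 1;     /* lock debugging currently enabled */
static int debug_locks_silent;  /* nonzero suppresses the warning output */
static int warnings_emitted;    /* counts what WARN_ON(1) would have printed */

/* Simplified: turn lock debugging off, report whether it was on before. */
static int debug_locks_off(void)
{
    int was_on = debug_locks;

    debug_locks = 0;
    return was_on;
}

/* Function form of the patched macro: do nothing while an oops is in progress. */
static int debug_locks_warn_on(int c)
{
    int ret = 0;

    if (!oops_in_progress && c) {
        if (debug_locks_off() && !debug_locks_silent)
            warnings_emitted++;     /* stand-in for WARN_ON(1) */
        ret = 1;
    }
    return ret;
}

/* Walks through the panic path and the normal path. */
static void demo(void)
{
    /* During a panic the condition is ignored and nothing is printed. */
    oops_in_progress = 1;
    assert(debug_locks_warn_on(1) == 0);
    assert(warnings_emitted == 0);

    /* Outside a panic the first hit warns and disables lock debugging. */
    oops_in_progress = 0;
    assert(debug_locks_warn_on(1) == 1);
    assert(warnings_emitted == 1);

    /* Later hits still report the condition but stay quiet. */
    assert(debug_locks_warn_on(1) == 1);
    assert(warnings_emitted == 1);
}
```

With the original macro, the first case would have warned as well, which
is the extra debug output mixed into the panic trace.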

-J
