Re: [2.6.27] early exception - lockdep related?

From: Peter Zijlstra
Date: Wed Sep 03 2008 - 02:52:19 EST


On Tue, 2008-09-02 at 23:06 +0200, Luca Tettamanti wrote:
> Hello,
> I'm seeing an early exception (0e) - which seems related to lockdep - at
> boot with many 2.6.27 kernels and I'm having trouble tracking it down.

Does it print one of these nice stack traces?

> The strange thing is that it comes and goes with different kernel
> versions

What kernel version did you generate the below with - and what config?

> , but a "bad" kernel consistently fails across reboots. It also
> seems to be sensitive to the configuration (attached)

You seem to have forgotten that attachment.

> , at least in one
> case the difference between a non-working kernel and a working one is
> CONFIG_DEBUG enabled in the latter.
>
> The address printed is inside the function __lock_acquire:
>
> in __lock_acquire (/home/kronos/src/linux-2.6.git/kernel/lockdep.c:727).
> 722
> 723         /*
> 724          * We can walk the hash lockfree, because the hash only
> 725          * grows, and we are careful when adding entries to the end:
> 726          */
> 727         list_for_each_entry(class, hash_head, hash_entry) {
> 728                 if (class->key == key) {
> 729                         WARN_ON_ONCE(class->name != lock->name);
> 730                         return class;
> 731                 }

Right - except this isn't in __lock_acquire, it's from
look_up_lock_class, which probably gets inlined.
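
For anyone following along, here is a minimal userspace sketch of the kind
of lookup the quoted loop performs: hash the key to a bucket, then walk the
chain comparing keys. This is not the kernel code. The names demo_class,
demo_hash, demo_hash_key, demo_register and demo_look_up are all
hypothetical, the hash function is an arbitrary placeholder, and a plain
singly linked chain stands in for the kernel's list_head and
list_for_each_entry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_HASH_BITS 4
#define DEMO_HASH_SIZE (1u << DEMO_HASH_BITS)

/* Hypothetical stand-in for a lock class entry; not the kernel struct. */
struct demo_class {
    const void *key;            /* identity of the class */
    const char *name;           /* human-readable name */
    struct demo_class *next;    /* next entry in the same hash chain */
};

/* One chain head per bucket; zero-initialised, so all chains start empty. */
static struct demo_class *demo_hash[DEMO_HASH_SIZE];

/* Placeholder hash (an assumption): a few address bits pick the bucket. */
static unsigned int demo_hash_key(const void *key)
{
    return (unsigned int)(((uintptr_t)key >> 4) & (DEMO_HASH_SIZE - 1));
}

/* Append a class to the end of its chain; chains only ever grow. */
static void demo_register(struct demo_class *cls)
{
    struct demo_class **pp;

    assert(cls != NULL && cls->key != NULL);
    pp = &demo_hash[demo_hash_key(cls->key)];
    while (*pp)
        pp = &(*pp)->next;
    cls->next = NULL;           /* terminate before linking into the chain */
    *pp = cls;
}

/* Walk the chain for this key's bucket and return the matching class. */
static struct demo_class *demo_look_up(const void *key)
{
    struct demo_class *cls;

    for (cls = demo_hash[demo_hash_key(key)]; cls; cls = cls->next) {
        if (cls->key == key)
            return cls;
    }
    return NULL;                /* no class registered for this key */
}
```

Every step of the walk dereferences a pointer taken from the bucket head or
from the previous entry, so the loop is only as sound as the chain it is
handed.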

> And the disassembly (faulting address is 0xffffffff80253b66)

<snip asm>

> I actually tried to bisect it using an earlier working kernel and the first
> bad one (read: the first pull from Linus' tree that gave me a broken kernel).
> The result is inconclusive; the commit pointed out was:
>
> e5f363e358cf16e4ad13a6826e15088c5495efe9 is first bad commit
> commit e5f363e358cf16e4ad13a6826e15088c5495efe9
> Author: Ingo Molnar <mingo@xxxxxxx>
> Date: Mon Aug 11 12:37:27 2008 +0200
>
> lockdep: increase MAX_LOCKDEP_KEYS
>
> certain configs produce:
>
> [ 70.076229] BUG: MAX_LOCKDEP_KEYS too low!
> [ 70.080230] turning off the locking correctness validator.
>
> tune them up.
>
> Signed-off-by: Ingo Molnar <mingo@xxxxxxx>
>
> but reverting it didn't make any difference.
>
> Is there any other action that I can take to debug this issue? Btw, do you
> think that kgdb can be useful? I can give it a try over the weekend...

It might - I've never tried it.
