Re: Crashes during boot on 2.6.30 / 2.6.31-rc, random programs

From: Joao Correia
Date: Sat Jun 27 2009 - 11:12:15 EST


Forgot config. Here it goes.

Thank you for your time,
Joao Correia
Centro de Informatica
Universidade da Beira Interior
Portugal

On Sat, Jun 27, 2009 at 4:07 PM, Joao
Correia<joaomiguelcorreia@xxxxxxxxx> wrote:
> Hello
>
> Since midway through the 2.6.30 series, and continuing on the rc1 for
> .31, im having this at every boot, during the boot process:
>
> (...)
> Jun 26 21:07:03 hightech kernel: NET: Registered protocol family 10
> Jun 26 21:07:03 hightech kernel: lo: Disabled Privacy Extensions
> Jun 26 21:07:03 hightech kernel: BUG: MAX_STACK_TRACE_ENTRIES too low!
> Jun 26 21:07:03 hightech kernel: turning off the locking correctness validator.
> Jun 26 21:07:03 hightech kernel: Pid: 672, comm: modprobe Not tainted
> 2.6.31-rc1 #1
> Jun 26 21:07:03 hightech kernel: Call Trace:
> Jun 26 21:07:03 hightech kernel: [<c080facf>] ? printk+0x22/0x3b
> Jun 26 21:07:03 hightech kernel: [<c046c940>] save_trace+0x89/0xa3
> Jun 26 21:07:03 hightech kernel: [<c046c9ca>] add_lock_to_list+0x70/0xb3
> Jun 26 21:07:03 hightech kernel: [<c046eb81>] __lock_acquire+0x996/0xb08
> Jun 26 21:07:03 hightech kernel: [<c066d7c2>] ? account+0x30/0xef
> Jun 26 21:07:03 hightech kernel: [<c066d7c2>] ? account+0x30/0xef
> Jun 26 21:07:03 hightech kernel: [<c046edaa>] lock_acquire+0xb7/0xeb
> Jun 26 21:07:03 hightech kernel: [<c066d7c2>] ? account+0x30/0xef
> Jun 26 21:07:03 hightech kernel: [<c066d7c2>] ? account+0x30/0xef
> Jun 26 21:07:03 hightech kernel: [<c081283f>] _spin_lock_irqsave+0x45/0x89
> Jun 26 21:07:03 hightech kernel: [<c066d7c2>] ? account+0x30/0xef
> Jun 26 21:07:03 hightech kernel: [<c066d7c2>] account+0x30/0xef
> Jun 26 21:07:03 hightech kernel: [<c066d995>] extract_entropy+0x3e/0xad
> Jun 26 21:07:03 hightech kernel: [<c046c51a>] ? register_lock_class+0x25/0x2b0
> Jun 26 21:07:03 hightech kernel: [<c066dab2>] xfer_secondary_pool+0xae/0xeb
> Jun 26 21:07:03 hightech kernel: [<c046c51a>] ? register_lock_class+0x25/0x2b0
> Jun 26 21:07:03 hightech kernel: [<c046d317>] ? mark_lock+0x29/0x1f3
> Jun 26 21:07:03 hightech kernel: [<c046e4ed>] ? __lock_acquire+0x302/0xb08
> Jun 26 21:07:03 hightech kernel: [<c066d988>] extract_entropy+0x31/0xad
> Jun 26 21:07:03 hightech kernel: [<faa90290>] ?
> ipv6_regen_rndid+0x2c/0xca [ipv6]
> Jun 26 21:07:03 hightech kernel: [<faa90290>] ?
> ipv6_regen_rndid+0x2c/0xca [ipv6]
> Jun 26 21:07:03 hightech kernel: [<c066db18>] get_random_bytes+0x29/0x3e
> Jun 26 21:07:03 hightech kernel: [<faa8e240>]
> __ipv6_regen_rndid+0x2c/0xdc [ipv6]
> Jun 26 21:07:03 hightech kernel: [<faa902a0>] ipv6_regen_rndid+0x3c/0xca [ipv6]
> Jun 26 21:07:03 hightech kernel: [<faa90efa>] ipv6_add_dev+0x236/0x2d8 [ipv6]
> Jun 26 21:07:03 hightech kernel: [<faa91e67>] addrconf_notify+0x5c/0x750 [ipv6]
> Jun 26 21:07:03 hightech kernel: [<c046d317>] ? mark_lock+0x29/0x1f3
> Jun 26 21:07:03 hightech kernel: [<c046d533>] ? mark_held_locks+0x52/0x7c
> Jun 26 21:07:03 hightech kernel: [<c08111ce>] ? __mutex_lock_common+0x2ca/0x32b
> Jun 26 21:07:03 hightech kernel: [<c046d800>] ?
> trace_hardirqs_on_caller+0x122/0x155
> Jun 26 21:07:03 hightech kernel: [<c08111df>] ? __mutex_lock_common+0x2db/0x32b
> Jun 26 21:07:03 hightech kernel: [<c046d800>] ?
> trace_hardirqs_on_caller+0x122/0x155
> Jun 26 21:07:03 hightech kernel: [<c077d86e>] ? rtnl_lock+0x22/0x35
> Jun 26 21:07:03 hightech kernel: [<faacb000>] ? inet6_init+0x0/0x2bf [ipv6]
> Jun 26 21:07:03 hightech kernel: [<c0811322>] ? mutex_lock_nested+0x41/0x5a
> Jun 26 21:07:03 hightech kernel: [<c077d86e>] ? rtnl_lock+0x22/0x35
> Jun 26 21:07:03 hightech kernel: [<c0774eb0>]
> register_netdevice_notifier+0x64/0x16c
> Jun 26 21:07:03 hightech kernel: [<faacb000>] ? inet6_init+0x0/0x2bf [ipv6]
> Jun 26 21:07:03 hightech kernel: [<faacb358>] addrconf_init+0x6b/0x13a [ipv6]
> Jun 26 21:07:03 hightech kernel: [<faacb194>] inet6_init+0x194/0x2bf [ipv6]
> Jun 26 21:07:03 hightech kernel: [<c0401186>] do_one_initcall+0x75/0x193
> Jun 26 21:07:03 hightech kernel: [<c046c51a>] ? register_lock_class+0x25/0x2b0
> Jun 26 21:07:03 hightech kernel: [<c046d7e7>] ?
> trace_hardirqs_on_caller+0x109/0x155
> Jun 26 21:07:03 hightech kernel: [<c046c51a>] ? register_lock_class+0x25/0x2b0
> Jun 26 21:07:03 hightech kernel: [<c046d317>] ? mark_lock+0x29/0x1f3
> Jun 26 21:07:03 hightech kernel: [<c046d317>] ? mark_lock+0x29/0x1f3
> Jun 26 21:07:03 hightech kernel: [<c046d317>] ? mark_lock+0x29/0x1f3
> Jun 26 21:07:03 hightech kernel: [<c046d317>] ? mark_lock+0x29/0x1f3
> Jun 26 21:07:03 hightech kernel: [<c046d533>] ? mark_held_locks+0x52/0x7c
> Jun 26 21:07:03 hightech kernel: [<c046d317>] ? mark_lock+0x29/0x1f3
> Jun 26 21:07:03 hightech kernel: [<c046d533>] ? mark_held_locks+0x52/0x7c
> Jun 26 21:07:03 hightech kernel: [<c0810d74>] ?
> __mutex_unlock_slowpath+0x105/0x124
> Jun 26 21:07:03 hightech kernel: [<c046d800>] ?
> trace_hardirqs_on_caller+0x122/0x155
> Jun 26 21:07:03 hightech kernel: [<c046d84c>] ? trace_hardirqs_on+0x19/0x2c
> Jun 26 21:07:03 hightech kernel: [<c046025a>] ?
> __blocking_notifier_call_chain+0x5b/0x78
> Jun 26 21:07:03 hightech kernel: [<c04091bd>] ? sched_clock+0x9/0xd
> Jun 26 21:07:03 hightech kernel: [<c046c091>] ? lock_release_holdtime+0x39/0x143
> Jun 26 21:07:03 hightech kernel: [<c046025a>] ?
> __blocking_notifier_call_chain+0x5b/0x78
> Jun 26 21:07:03 hightech kernel: [<c045f64e>] ? up_read+0x29/0x4f
> Jun 26 21:07:03 hightech kernel: [<c046025a>] ?
> __blocking_notifier_call_chain+0x5b/0x78
> Jun 26 21:07:03 hightech kernel: [<c0460298>] ?
> blocking_notifier_call_chain+0x21/0x37
> Jun 26 21:07:03 hightech kernel: [<c04793c6>] sys_init_module+0xc8/0x1ee
> Jun 26 21:07:03 hightech kernel: [<c0403793>] sysenter_do_call+0x12/0x38
> Jun 26 21:07:03 hightech kernel: modprobe used greatest stack depth:
> 5296 bytes left
> Jun 26 21:07:03 hightech kernel: ip6_tables: (C) 2000-2006 Netfilter Core Team
> (...)
>
> Now, the weird thing is that this doesnt always point at modprobe. I
> have seen it occur with swapper, udevd and ip6tables-restore -at
> least-.
> For example, from another boot:
>
> Jun 26 18:46:40 hightech kernel: Pid: 545, comm: udevd Not tainted
> 2.6.31-0.28.rc1.fc12.i586 #1
> Jun 26 18:46:40 hightech kernel: Call Trace:
> Jun 26 18:46:40 hightech kernel: [<c0815123>] ? printk+0x22/0x37
> Jun 26 18:46:40 hightech kernel: [<c046cfa0>] save_trace+0x89/0xa3
> Jun 26 18:46:40 hightech kernel: [<c046d02a>] add_lock_to_list+0x70/0xb3
> Jun 26 18:46:40 hightech kernel: [<c046f224>] __lock_acquire+0x9d9/0xb08
> Jun 26 18:46:40 hightech kernel: [<c05fdd3d>] ?
> debug_check_no_obj_freed+0x7c/0x175
> Jun 26 18:46:40 hightech kernel: [<c05fdd3d>] ?
> debug_check_no_obj_freed+0x7c/0x175
> Jun 26 18:46:40 hightech kernel: [<c046f40a>] lock_acquire+0xb7/0xeb
> Jun 26 18:46:40 hightech kernel: [<c05fdd3d>] ?
> debug_check_no_obj_freed+0x7c/0x175
> Jun 26 18:46:40 hightech kernel: [<c05fdd3d>] ?
> debug_check_no_obj_freed+0x7c/0x175
> Jun 26 18:46:40 hightech kernel: [<c0817e8f>] _spin_lock_irqsave+0x45/0x89
> Jun 26 18:46:40 hightech kernel: [<c05fdd3d>] ?
> debug_check_no_obj_freed+0x7c/0x175
> Jun 26 18:46:40 hightech kernel: [<c05fdd3d>]
> debug_check_no_obj_freed+0x7c/0x175
> Jun 26 18:46:40 hightech kernel: [<c046d188>] ? trace_hardirqs_off+0x19/0x2c
> Jun 26 18:46:40 hightech kernel: [<c04ed7dd>] kmem_cache_free+0x92/0x125
> Jun 26 18:46:40 hightech kernel: [<c05f19ad>] ? idr_remove+0x125/0x144
> Jun 26 18:46:40 hightech kernel: [<c05f19ad>] idr_remove+0x125/0x144
> Jun 26 18:46:40 hightech kernel: [<c05250a4>]
> inotify_ignored_and_remove_idr+0x73/0x9b
> Jun 26 18:46:40 hightech kernel: [<c0524526>] inotify_freeing_mark+0x1c/0x2f
> Jun 26 18:46:40 hightech kernel: [<c0522da7>]
> fsnotify_destroy_mark_by_entry+0x129/0x15e
> Jun 26 18:46:40 hightech kernel: [<c0524b48>] sys_inotify_rm_watch+0x8e/0xbd
> Jun 26 18:46:40 hightech kernel: [<c0403a5c>] syscall_call+0x7/0xb
> Jun 26 18:46:40 hightech kernel: EXT3 FS on sde2, internal journal
> Jun 26 18:46:40 hightech kernel: kjournald starting.  Commit interval 5 seconds
>
> Its not the exact same kernel (first one is vanilla, the second is
> from rawhide fedora, but the results are the exact same. The origin of
> the call trace keeps jumping around from program to program with every
> boot).
> The box will hang after some minutes too (as in totally hang, not even
> screen refreshes). At start it will work for a couple of minutes tho.
>
> The system is a quad-core phenom 9600 on an asus m3a-h/hdmi board,
> with 4gb ddr2 800 ram, and 3 1TB sata hard disks.
> The system is a fedora 11 with some rawhide updates thrown in, but
> this doesnt happen on a 2.6.29 kernel, on the same system, so im
> assuming its a kernel bug and not a distro problem (ofcourse, i could
> be wrong).
>
> Im attaching .config, and the full dmesg, as well as some other
> informations (lspci -vv, cat /proc/cpuinfo). I can apply patches and
> retest at will, as this will trigger at every boot, so its easily
> reproducible on my end.
>
> Thank you for your time,
> Joao Correia
> Centro de Informatica
> Universidade da Beira Interior
> Portugal
>

Attachment: config
Description: Binary data