Re: Kernel panic: Route cache, RCU, possibly FIB trie.

From: Dipankar Sarma
Date: Mon Mar 20 2006 - 17:09:23 EST


On Mon, Mar 20, 2006 at 10:44:21PM +0100, Jesper Dangaard Brouer wrote:
>
> Kernel panic report.
>
> I have experienced some kernel panics on a production Linux box acting
> as a router for a large number of customers.
>
> I have tried to track down the problem, and I think I have narrowed it
> down a bit. My theory is that it is related to the route cache
> (ip_dst_cache) or FIB, which cannot deallocate route cache slab
> elements (maybe RCU related). (I have seen my route cache increase to
> around 520k entries using rtstat, before dying).
>
> I'm using the FIB trie system/algorithm (CONFIG_IP_FIB_TRIE). I think
> that the error might be caused by the "fib_trie" code. See the syslog
> output below.
>
> Below are some kernel panic outputs from the console and some
> interesting errors found in syslog.
>
> Kernel panic#1
> --------------
> EIP is at _stext+0x3feffd68/0x49
> c03f7380
> Call Trace:
> [<c0103cc7>] show_stack+0x80/0x96
> [<c0103e60>] show_registers+0x161/0x1c5
> [<c0104057>] die+0x107/0x186
> [<c0116c5f>] do_page_fault+0x2c6/0x57d
> [<c0103997>] error_code+0x4f/0x54
> [<c012fe7b>] __rcu_process_callbacks+0xaa/0xd3
> [<c012feff>] rcu_process_callbacks+0x5b/0x65
> [<c0124578>] tasklet_action+0x77/0xc9
> [<c01241f1>] __do_softirq+0xc1/0xd6
> [<c0124251>] do_softirq+0x4b/0x4d
> [<c012433b>] irq_exit+0x47/0x49
> [<c010533b>] do_IRQ+0x2b/0x3b
> [<c010383e>] common_interrupt+0x1a/0x20
> Code: Bad EIP value.
> <0>Kernel panic - not syncing: Fatal exception in interrupt

A bad EIP while processing an RCU callback often indicates that the
object that embeds the rcu_head has already been freed. Can you enable
slab debugging and see whether this can be detected there in a
different path?
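
To illustrate the failure mode described above, here is a minimal
user-space sketch, not kernel code. Every name in it (rcu_head_sim,
route_entry_sim, call_rcu_sim, premature_free_sim and so on) is
hypothetical and only mimics the shape of the kernel's rcu_head and
call_rcu(). It assumes the freed object gets filled with the 0x6b
poison byte, as the kernel's slab debugging does, and it poisons the
memory without really freeing it so that the demo itself stays well
defined. Once the object is poisoned, the callback pointer stored in
its embedded head is garbage, and jumping through it is what would
show up as a bad EIP.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define POISON_FREE 0x6b /* byte that slab debugging writes into freed objects */

/* Hypothetical stand-in for the kernel's struct rcu_head. */
struct rcu_head_sim {
    struct rcu_head_sim *next;
    void (*func)(struct rcu_head_sim *head);
};

/* Hypothetical object that embeds the head, like a route cache entry. */
struct route_entry_sim {
    int id;
    struct rcu_head_sim rcu;
};

static struct rcu_head_sim *pending; /* callbacks waiting to run */
static int reclaimed;                /* callbacks that ran normally */

/* Queue a callback at the front of the pending list. */
static void call_rcu_sim(struct rcu_head_sim *head,
                         void (*func)(struct rcu_head_sim *))
{
    head->func = func;
    head->next = pending;
    pending = head;
}

static void route_reclaim(struct rcu_head_sim *head)
{
    (void)head;
    reclaimed++;
}

/* Simulate a premature free: poison the contents, keep the storage valid. */
static void premature_free_sim(struct route_entry_sim *e)
{
    memset(e, POISON_FREE, sizeof(*e));
}

/* True if every byte of the stored callback pointer is the poison byte. */
static int head_is_poisoned(const struct rcu_head_sim *head)
{
    unsigned char bytes[sizeof(head->func)];
    size_t i;

    memcpy(bytes, &head->func, sizeof(bytes));
    for (i = 0; i < sizeof(bytes); i++)
        if (bytes[i] != POISON_FREE)
            return 0;
    return 1;
}

/* Run pending callbacks; return 1 if a poisoned head stopped the walk. */
static int process_callbacks_sim(void)
{
    struct rcu_head_sim *head = pending;

    pending = NULL;
    while (head) {
        struct rcu_head_sim *next;

        if (head_is_poisoned(head))
            return 1; /* the real kernel would jump to a bogus address here */
        next = head->next;
        head->func(head);
        head = next;
    }
    return 0;
}

/* Queue two entries, free one too early, then process the list. */
int run_demo(void)
{
    struct route_entry_sim stale = { .id = 1 };
    struct route_entry_sim live = { .id = 2 };

    pending = NULL;
    reclaimed = 0;
    call_rcu_sim(&stale.rcu, route_reclaim);
    call_rcu_sim(&live.rcu, route_reclaim);
    premature_free_sim(&stale); /* freed while its callback is still queued */
    return process_callbacks_sim();
}
```

Calling run_demo() runs the callback for the live entry and then
stops at the stale one. In the real kernel there is no such check in
the callback path, which is why catching the bad free earlier, in
whichever path touches the poisoned object first, is the useful part
of slab debugging.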

Thanks
Dipankar
