Re: rcu_bh stalls on 3.2.28

From: Ben Hutchings
Date: Sun Sep 09 2012 - 14:12:59 EST


Please note that I can only directly deal with regressions that are
specific to 3.2, caused by a bad backport. For anything else, you need
to identify an upstream fix to be applied - I'm not usually going to
have the time to do that.

On Fri, 2012-08-31 at 20:02 -0300, Henrique de Moraes Holschuh wrote:
> Just got one of these:
>
> kernel: INFO: rcu_bh detected stall on CPU 2 (t=0 jiffies)
> kernel: Pid: 0, comm: swapper/2 Not tainted 3.2.28+ #2
> kernel: Call Trace:
> kernel: <IRQ> [<ffffffff810d1609>] __rcu_pending+0x159/0x400
> kernel: [<ffffffff810d20bb>] rcu_check_callbacks+0x9b/0x120
> kernel: [<ffffffff81089673>] update_process_times+0x43/0x80
> kernel: [<ffffffff810a836f>] tick_sched_timer+0x5f/0xb0
> kernel: [<ffffffff8109c097>] __run_hrtimer.isra.30+0x57/0x100
> kernel: [<ffffffff8109c8f5>] hrtimer_interrupt+0xe5/0x220
> kernel: [<ffffffff8104ce14>] smp_apic_timer_interrupt+0x64/0xa0
> kernel: [<ffffffff8159b5cb>] apic_timer_interrupt+0x6b/0x70
> kernel: <EOI> [<ffffffff81315645>] ? intel_idle+0xe5/0x140
> kernel: [<ffffffff81315623>] ? intel_idle+0xc3/0x140
> kernel: [<ffffffff814420ee>] cpuidle_idle_call+0x8e/0xf0
> kernel: [<ffffffff81032425>] cpu_idle+0xa5/0x110
> kernel: [<ffffffff8158a9ac>] start_secondary+0x1e5/0x1ec
>
> There are previous reports of these weird rcu_bh stalls with t=0 in the 3.2
> and 3.3 branches as well:
>
> https://lkml.org/lkml/2012/2/18/34
> http://lkml.org/lkml/2012/3/28/175
>
> another data point:
> https://bugzilla.redhat.com/show_bug.cgi?id=806610

That report says it was fixed in (Fedora's) 3.3 - so perhaps there are
multiple bugs involved.

Ben.

--
Ben Hutchings
Time is nature's way of making sure that everything doesn't happen at once.
