Re: Need your help on rcu_bh_state stall on CPU 0 (t=0 jiffies)

From: Paul E. McKenney
Date: Mon Feb 20 2017 - 22:01:17 EST


On Tue, Feb 21, 2017 at 06:36:34AM +0530, Sumit Gemini wrote:
> Hi Pual,
>
> Thanks for your time.
>
> we are using HA pair with TIPC and opensf functionality. Now problem is on
> active machine TIPC link lost message was observed on 06:45:00 but standby
> machine detected this link lost at 06:45:06 because at 06:45:00 standby
> machine having rcu_bh_state (CPU stall). Once standby machine detected link
> lost, it fired reboot command for active machine with the help of opensf at
> 06:45:07 to make itself active and standby to previously active machine.
>
> I raised this matter before TIPC open source community, and they told
> rcu_bh_state was the reason for delaying to get link down event at standby
> end.
>
> In normal case TIPC link broken detects in < 2 second. But in our case took
> 7 second because of that stall. It is first time, we observed such
> situation.
>
> Now if we suppres this warning, will we observe this delay again?

That depends. If the delay was due to the NMIs and stack-trace printing,
suppressing the warning will make the delay go away. On the other hand,
if there really is a stall independent of the warning, then the delay
will still be there. Why not try it and find out?

> Can you enlighten on this stall. Why does this stall happen at ideal
> machine? And if possible to debug it, what we need to do?
>
> Any pointers will be highly appreciated.

I suggest reading Documentation/stallwarn.txt, especially the list
of possible causes near the end:

https://www.kernel.org/doc/Documentation/RCU/stallwarn.txt

If the delay persist when you suppress the warnings, then I suggest that
you use something like ftrace to gather information on the delay.

Thanx, Paul

> Thanks & Regards
> ~Sumit Gemini
>
>
> On Feb 21, 2017 2:28 AM, "Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx>
> wrote:
>
> > On Mon, Feb 20, 2017 at 03:26:32PM +0530, Sumit Gemini wrote:
> > > Hi Paul,
> > >
> > >
> > > I got you while reading 'http://www.serverphorums.com/read.php?12,468996";.
> > I'm
> > > facing same problem but not able to detect what exectly the cause of
> > > it.
> > >
> > >
> > > I'm observed weird problem, I got rcu_bh_state on cpu0 but my system
> > > was in ideal state when i got rcu_bh_state (Cpu stall) problem. After
> > > 7 second my system recovered itself from stall to active. But i did
> > > not get what happen at that time.
> >
> > One thing that you might look into below...
> >
> > > Jan 6 16:45:06 ffm-sbc-2a kernel: [3167216.520060] INFO: rcu_bh_state
> > detected stall on CPU 0 (t=0 jiffies)
> > > Jan 6 16:45:06 ffm-sbc-2a kernel: [3167216.524042] sending NMI to all
> > CPUs:
> > > Jan 6 16:45:06 ffm-sbc-2a kernel: [3167216.524042] NMI backtrace for cpu
> > 0
> > > Jan 6 16:45:06 ffm-sbc-2a kernel: [3167216.524042] CPU 0
> > > Jan 6 16:45:06 ffm-sbc-2a kernel: [3167216.524042] Modules linked in:
> > nf_conntrack_netlink af_packet xt_sharedlimit xt_hashlimit
> > ip_set_hash_ipport ip_set_hash_ipportip xt_NOTRACK ip_set_bitmap_port
> > xt_sctp nf_conntrack_ipv6 nf_defrag_ipv6 xt_CT arpt_mangle
> > ip_set_hash_ipnet xt_NFLOG nfnetlink_log ipt_ULOG xt_limit xt_hashcounter
> > ip_set_hash_ipip xt_set ip_set_hash_ip deflate zlib_deflate ctr
> > twofish_x86_64 twofish_common camellia serpent blowfish cast5 des_generic
> > cbc xcbc rmd160 sha512_generic sha256_generic sha1_generic md5 crypto_null
> > af_key iptable_mangle ip_set nfnetlink arptable_filter arp_tables
> > iptable_raw iptable_nat tipc xt_tcpudp xt_state xt_pkttype bonding
> > binfmt_misc iptable_filter ip6table_filter ip6_tables nf_nat_ftp nf_nat
> > nf_conntrack_ftp nf_conntrack_ipv4 nf_defrag_ipv4 ip_tables x_tables mperf
> > edd ipmi_devintf ipmi_si ipmi_msghandler nf_conntrack_proto_sctp
> > nf_conntrack sctp 8021q garp stp llc gb_sys usb_storage ioatdma ixgbe uas
> > sg igb iTCO_wdt wmi i2c_i801 pcspkr mdio iTCO_vendor_support button
> > container dca ipv6 autofs4 usbhid megasr(P) ehci_hcd usbcore processor
> > thermal_sys [last unloaded: ipt_PORTMAP]
> > > Jan 6 16:45:06 ffm-sbc-2a kernel: [3167216.524042]
> > > Jan 6 16:45:06 ffm-sbc-2a kernel: [3167216.524042] Pid: 0, comm: swapper
> > Tainted: P 3.1.10-gb17-default #1 Intel Corporation S2600CO/S2600CO
> > > Jan 6 16:45:06 ffm-sbc-2a kernel: [3167216.524042] RIP:
> > 0010:[<ffffffff81007f51>] [<ffffffff81007f51>] native_read_tsc+0x2/0xf
> > > Jan 6 16:45:06 ffm-sbc-2a kernel: [3167216.524042] RSP:
> > 0018:ffff88043ee03db0 EFLAGS: 00000803
> > > Jan 6 16:45:06 ffm-sbc-2a kernel: [3167216.524042] RAX: 0000000037185395
> > RBX: 00000000000003e9 RCX: 0000000000000001
> > > Jan 6 16:45:07 ffm-sbc-2a osafimmd[5035]: WA IMMND DOWN on active
> > controller f2 detected at standby immd!! f1. Possible failover
> > > Jan 6 16:45:07 ffm-sbc-2a osaffmd[5023]: NO Done Locking applications
> > node id:2020f ret val:0
> > > Jan 6 16:45:07 ffm-sbc-2a opensaf_recovery: Control interface status:0
> > Role:STANDBY
> > > Jan 6 16:45:07 ffm-sbc-2a osaffmd[5023]: NO Current role: STANDBY
> > > Jan 6 16:45:07 ffm-sbc-2a osaffmd[5023]: Rebooting OpenSAF NodeId =
> > 131599 EE Name = , Reason: Received Node Down for peer controller, >
> > OwnNodeId = 131343, SupervisionTime = 60
> >
> > Is this "WA IMMND DOWN" a normal occurrence? Perhaps it is pointing
> > at a root cause.
> >
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042] RDX: 0000000000bf0977
> > RSI: 0000000000000002 RDI: 0000000000032bd4
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042] RBP: 0000000000032bd4
> > R08: 0000000000000000 R09: ffffffff819232b0
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042] R10: 7fffffffffffffff
> > R11: 7fffffffffffffff R12: 0000000000000000
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042] R13: ffffffff819232b0
> > R14: 0000000000000001 R15: ffffffff81065c28
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042] FS:
> > 0000000000000000(0000) GS:ffff88043ee00000(0000) knlGS:0000000000000000
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042] CS: 0010 DS: 0000 ES:
> > 0000 CR0: 000000008005003b
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042] CR2: 000000000069e034
> > CR3: 0000000001805000 CR4: 00000000000406f0
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042] DR0: 0000000000000000
> > DR1: 0000000000000000 DR2: 0000000000000000
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042] DR3: 0000000000000000
> > DR6: 00000000ffff0ff0 DR7: 0000000000000400
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042] Process swapper (pid:
> > 0, threadinfo ffffffff81800000, task ffffffff8180d020)
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042] Stack:
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042] ffffffff81200eb5
> > ffffffff81200f44 00000000000003e9 0000000000001000
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042] 0000000000000002
> > ffffffff819232b0 ffffffff81017698 7fffffffffffffff
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042] 0000000000000002
> > 0000000000000002 ffffffff81017fdf 0000000000000001
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042] Call Trace:
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042] [<ffffffff81200eb5>]
> > paravirt_read_tsc+0x5/0x8
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042] [<ffffffff81200f44>]
> > delay_tsc+0x1d/0x68
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042] [<ffffffff81017698>]
> > native_safe_apic_wait_icr_idle+0x27/0x32
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042] [<ffffffff81017fdf>]
> > default_send_IPI_dest_field.constprop.0+0x19/0x4d
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042] [<ffffffff8101804b>]
> > default_send_IPI_mask_sequence_phys+0x38/0x6a
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042] [<ffffffff8101815e>]
> > arch_trigger_all_cpu_backtrace+0x4d/0x7b
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042] [<ffffffff8109801b>]
> > check_cpu_stall+0x66/0xdb
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042] [<ffffffff810980aa>]
> > rcu_pending+0x1a/0x10a
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042] [<ffffffff8109852c>]
> > rcu_check_callbacks+0x9d/0xae
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042] [<ffffffff8104c56c>]
> > update_process_times+0x31/0x63
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042] [<ffffffff81065c92>]
> > tick_sched_timer+0x6a/0x90
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042] [<ffffffff8105b872>]
> > __run_hrtimer+0xa4/0x148
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042] [<ffffffff8105c08e>]
> > hrtimer_interrupt+0xdb/0x19a
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042] [<ffffffff81017768>]
> > smp_apic_timer_interrupt+0x6e/0x80
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042] [<ffffffff813efcde>]
> > apic_timer_interrupt+0x6e/0x80
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042] [<ffffffff81239655>]
> > intel_idle+0xdd/0x104
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042] [<ffffffff81304773>]
> > cpuidle_idle_call+0xda/0x169
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042] [<ffffffff81001200>]
> > cpu_idle+0x51/0x95
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042] [<ffffffff8193db0f>]
> > start_kernel+0x388/0x393
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042] [<ffffffff8193d3af>]
> > x86_64_start_kernel+0xcf/0xdc
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.524042] Code: 74 03 e6 80 c3
> > e6 ed c3 bf 8e 21 00 00 e9 ba 8f 1f 00 c3 90 90 90 40 88 f8 e6 70 e4 71 c3
> > 40 88 f0 e6 70 40 88 f8 e6 71 c3 0f 31
> > > Jan 6 16:45:07 ffm-sbc-2a kernel[3167216.524042]: c1 48 89 d0 48 c1 e0
> > 20 48 09 c8 c3 41 56 41 55 41 54 55 53
> > >
> > > .
> > > .
> > > .
> > > .
> > > .
> > > .
> > >
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.830734] NMI backtrace for cpu
> > 31
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.830734] CPU 31
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.830734] Modules linked in:
> > nf_conntrack_netlink af_packet xt_sharedlimit xt_hashlimit
> > ip_set_hash_ipport ip_set_hash_ipportip xt_NOTRACK ip_set_bitmap_port
> > xt_sctp nf_conntrack_ipv6 nf_defrag_ipv6 xt_CT arpt_mangle
> > ip_set_hash_ipnet xt_NFLOG nfnetlink_log ipt_ULOG xt_limit xt_hashcounter
> > ip_set_hash_ipip xt_set ip_set_hash_ip deflate ctr twofish_x86_64
> > twofish_common camellia serpent > blowfish cast5 des_generic cbc xcbc
> > rmd160 sha512_generic sha256_generic sha1_generic md5 crypto_null af_key
> > iptable_mangle ip_set nfnetlink arptable_filter arp_tables iptable_raw
> > iptable_nat tipc xt_tcpudp xt_state xt_pkttype bonding binfmt_misc
> > iptable_filter ip6table_filter ip6_tables nf_nat_ftp nf_nat
> > nf_conntrack_ftp nf_conntrack_ipv4 nf_defrag_ipv4 ip_tables x_tables mperf
> > edd ipmi_devintf ipmi_si ipmi_msghandler nf_conntrack_proto_sctp
> > nf_conntrack sctp 8021q garp stp llc gb_sys usb_storage ioatdma ixgbe uas
> > sg igb iTCO_wdt wmi i2c_i801 pcspkr mdio iTCO_vendor_support button
> > container dca ipv6 autofs4 usbhid megasr(P) ehci_hcd usbcore processor
> > thermal_sys [last unloaded: ipt_PORTMAP]
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.830734]
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.830734] Pid: 0, comm:
> > kworker/0:1 Tainted: P 3.1.10-gb17-default #1 Intel Corporation
> > S2600CO/S2600CO
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.830734] RIP:
> > 0010:[<ffffffff81239624>] [<ffffffff81239624>] intel_idle+0xac/0x104
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.830734] RSP:
> > 0018:ffff880425e33eb8 EFLAGS: 00000046
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.830734] RAX: 0000000000000030
> > RBX: 0000000000000010 RCX: 0000000000000001
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.830734] RDX: 0000000000000000
> > RSI: ffff880425e33fd8 RDI: ffffffff81810500
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.830734] RBP: 0000000000000030
> > R08: 000000000000006d R09: 0000000000034b10
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.830734] R10: ffff88083eded830
> > R11: ffff88083eded830 R12: 149739342cb2ca49
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.830734] R13: 0000000000000004
> > R14: 000000000000001f R15: 0000000000000000
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.830734] FS:
> > 0000000000000000(0000) GS:ffff88083ede0000(0000) knlGS:0000000000000000
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.830734] CS: 0010 DS: 0000 ES:
> > 0000 CR0: 000000008005003b
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.830734] CR2: 00007fe89df67120
> > CR3: 0000000001805000 CR4: 00000000000406e0
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.830734] DR0: 0000000000000000
> > DR1: 0000000000000000 DR2: 0000000000000000
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.830734] DR3: 0000000000000000
> > DR6: 00000000ffff0ff0 DR7: 0000000000000400
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.830734] Process kworker/0:1
> > (pid: 0, threadinfo ffff880425e32000, task ffff880425e30580)
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167216.830734] Stack:
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167225.685484] TIPC: Resetting link
> > <1.1.1:bond0-1.1.2:bond0>, requested by peer
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167225.685487] TIPC: Lost link
> > <1.1.1:bond0-1.1.2:bond0> on network plane A
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167225.685491] TIPC: Lost contact
> > with <1.1.2>
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167225.687214] 0000000000000000
> > 000000000cdd3a47 0000000000000000 000000000cdd3a47
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167225.687214] ffff880425e33fd8
> > 0000001f3edf8970 ffff88083edf8970 ffff88083edf8b00
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167225.687214] 0000000000000000
> > ffffffff81304773 ffffffff819232b0 ffff880425e33fd8
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167225.687214] Call Trace:
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167225.687214] [<ffffffff81304773>]
> > cpuidle_idle_call+0xda/0x169
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167225.687214] [<ffffffff81001200>]
> > cpu_idle+0x51/0x95
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167225.687214] Code: 28 e0 ff ff 80
> > e2 08 75 22 31 d2 48 83 c0 10 48 89 d1 0f 01 c8 0f ae f0 48 8b 86 38 e0 ff
> > ff a8 08 75 08 b1 01 48 89 e8 0f 01 c9 <e8> 3f 6e e2 ff 4c 29 e0 48 89 c7
> > e8 10 ae e0 ff 48 69 e8 40 42
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167225.687214] Call Trace:
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167225.687214] [<ffffffff81304773>]
> > cpuidle_idle_call+0xda/0x169
> > > Jan 6 16:45:07 ffm-sbc-2a kernel: [3167225.687214] [<ffffffff81001200>]
> > cpu_idle+0x51/0x95
> >
> > Since NMIs were sent to all CPUs, it is not surprising that one was found
> > to be idle. Or is there something that I am missing in CPU 31's
> > backtrace?
> >
> > > I'm using suse kernel 3.1.10
> >
> > That is going back a ways!
> >
> > There are quite a few patches to RCU CPU stall warnings that could
> > be applied. It might be easier to move to a newer kernel.
> >
> > Alternatively, if you are convinced that these stall warnings are
> > false positives, you could boot with the rcu_cpu_stall_suppress
> > kernel boot parameter set, or rebuild the kernel with a much larger
> > value of the RCU_CPU_STALL_TIMEOUT Kconfig option.
> >
> > Thanx, Paul
> >
> >