Re: [PATCH tip/core/rcu 3/5] sched: Make wake_up_nohz_cpu() handle CPUs going offline

From: Wanpeng Li
Date: Mon Aug 22 2016 - 18:57:34 EST


2016-08-22 23:30 GMT+08:00 Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>:
> Both timers and hrtimers are maintained on the outgoing CPU until
> CPU_DEAD time, at which point they are migrated to a surviving CPU. If a
> mod_timer() executes between CPU_DYING and CPU_DEAD time, x86 systems
> will splat in native_smp_send_reschedule() when attempting to wake up
> the just-now-offlined CPU, as shown below from a NO_HZ_FULL kernel:
>
> [ 7976.741556] WARNING: CPU: 0 PID: 661 at /home/paulmck/public_git/linux-rcu/arch/x86/kernel/smp.c:125 native_smp_send_reschedule+0x39/0x40
> [ 7976.741595] Modules linked in:
> [ 7976.741595] CPU: 0 PID: 661 Comm: rcu_torture_rea Not tainted 4.7.0-rc2+ #1
> [ 7976.741595] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
> [ 7976.741595] 0000000000000000 ffff88000002fcc8 ffffffff8138ab2e 0000000000000000
> [ 7976.741595] 0000000000000000 ffff88000002fd08 ffffffff8105cabc 0000007d1fd0ee18
> [ 7976.741595] 0000000000000001 ffff88001fd16d40 ffff88001fd0ee00 ffff88001fd0ee00
> [ 7976.741595] Call Trace:
> [ 7976.741595] [<ffffffff8138ab2e>] dump_stack+0x67/0x99
> [ 7976.741595] [<ffffffff8105cabc>] __warn+0xcc/0xf0
> [ 7976.741595] [<ffffffff8105cb98>] warn_slowpath_null+0x18/0x20
> [ 7976.741595] [<ffffffff8103cba9>] native_smp_send_reschedule+0x39/0x40
> [ 7976.741595] [<ffffffff81089bc2>] wake_up_nohz_cpu+0x82/0x190
> [ 7976.741595] [<ffffffff810d275a>] internal_add_timer+0x7a/0x80
> [ 7976.741595] [<ffffffff810d3ee7>] mod_timer+0x187/0x2b0
> [ 7976.741595] [<ffffffff810c89dd>] rcu_torture_reader+0x33d/0x380
> [ 7976.741595] [<ffffffff810c66f0>] ? sched_torture_read_unlock+0x30/0x30
> [ 7976.741595] [<ffffffff810c86a0>] ? rcu_bh_torture_read_lock+0x80/0x80
> [ 7976.741595] [<ffffffff8108068f>] kthread+0xdf/0x100
> [ 7976.741595] [<ffffffff819dd83f>] ret_from_fork+0x1f/0x40
> [ 7976.741595] [<ffffffff810805b0>] ? kthread_create_on_node+0x200/0x200
>
> However, in this case, the wakeup is redundant, because the timer
> migration will reprogram timer hardware as needed. Note that the fact
> that preemption is disabled does not avoid the splat, as the offline
> operation has already passed both the synchronize_sched() and the
> stop_machine() that would be blocked by disabled preemption.
>
> This commit therefore modifies wake_up_nohz_cpu() to avoid attempting
> to wake up offline CPUs. It also adds a comment stating that the
> caller must tolerate lost wakeups when the target CPU is going offline,
> and suggesting the CPU_DEAD notifier as a recovery mechanism.

Interesting, I have a patch which posted several weeks ago fix another
similar issue, https://lkml.org/lkml/2016/8/4/143 Anyway, if my patch
also fixes your bug?

Regards,
Wanpeng Li