RCU used on incoming CPU before rcu_cpu_starting() called

From: Paul E. McKenney
Date: Thu Mar 09 2017 - 01:08:11 EST


Hello!

I am seeing the following splat in rcutorture testing of v4.11-rc1:

[ 30.694013] =============================
[ 30.694013] WARNING: suspicious RCU usage
[ 30.694013] 4.11.0-rc1+ #1 Not tainted
[ 30.694013] -----------------------------
[ 30.694013] /home/git/linux-2.6-tip/kernel/workqueue.c:712 sched RCU or wq_pool_mutex should be held!
[ 30.694013]
[ 30.694013] other info that might help us debug this:
[ 30.694013]
[ 30.694013]
[ 30.694013] RCU used illegally from offline CPU!
[ 30.694013] rcu_scheduler_active = 2, debug_locks = 0
[ 30.694013] no locks held by swapper/1/0.
[ 30.694013]
[ 30.694013] stack backtrace:
[ 30.694013] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.11.0-rc1+ #1
[ 30.694013] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
[ 30.694013] Call Trace:
[ 30.694013] dump_stack+0x67/0x99
[ 30.694013] lockdep_rcu_suspicious+0xe7/0x120
[ 30.694013] get_work_pool+0x82/0x90
[ 30.694013] __queue_work+0x70/0x5f0
[ 30.694013] queue_work_on+0x33/0x70
[ 30.694013] clear_sched_clock_stable+0x33/0x40
[ 30.694013] early_init_intel+0xe7/0x2f0
[ 30.694013] init_intel+0x11/0x350
[ 30.694013] identify_cpu+0x344/0x5a0
[ 30.694013] identify_secondary_cpu+0x18/0x80
[ 30.694013] smp_store_cpu_info+0x39/0x40
[ 30.694013] start_secondary+0x4e/0x100
[ 30.694013] start_cpu+0x14/0x14

Here is the relevant code from x86's smp_callin():

/*
* Save our processor parameters. Note: this information
* is needed for clock calibration.
*/
smp_store_cpu_info(cpuid);

/*
* Get our bogomips.
* Update loops_per_jiffy in cpu_data. Previous call to
* smp_store_cpu_info() stored a value that is close but not as
* accurate as the value just calculated.
*/
calibrate_delay();
cpu_data(cpuid).loops_per_jiffy = loops_per_jiffy;
pr_debug("Stack at about %p\n", &cpuid);

/*
* This must be done before setting cpu_online_mask
* or calling notify_cpu_starting.
*/
set_cpu_sibling_map(raw_smp_processor_id());
wmb();

notify_cpu_starting(cpuid);

The problem is that smp_store_cpu_info() indirectly invokes
schedule_work(), which wants to use RCU. But RCU isn't informed
of the incoming CPU until the call to notify_cpu_starting(), which
causes lockdep to complain bitterly about the use of RCU by the
premature call to schedule_work().

I considered just moving the notify_cpu_starting() earlier in the
sequence, but the comments make it seem like this would not be
a wise choice.

Any suggestions?

Thanx, Paul