On Wed, 9 May 2012, Igor Mammedov wrote:What's fixed:
When bringing up cpuX1, it could stall in start_secondary
before setting cpu_callin_mask for more than 5 sec. That forces
do_boot_cpu() to give up on waiting and go to error return path
printing messages:
pr_err("CPU%d: Stuck ??\n", cpuX1);
or
pr_err("CPU%d: Not responding.\n", cpuX1);
and native_cpu_up returns early with -EIO. However AP may continue
its boot process till it reaches check_tsc_sync_target(), where
it will wait for boot cpu to run cpu_up...=>check_tsc_sync_source.
That will never happen since cpu_up have returned with error before.
Now we need to note that cpuX1 is marked as active in smp_callin
before it stuck in check_tsc_sync_target. And when another cpuX2
is being onlined, start_secondary on it will call
smp_callin
-> smp_store_cpu_info
-> identify_secondary_cpu
-> mtrr_ap_init
-> set_mtrr_from_inactive_cpu
-> stop_machine_from_inactive_cpu
where it's going to schedule stop_machine work on all ACTIVE cpus
smdata.num_threads = num_active_cpus() + 1;
and wait till they all complete it before continuing. As was noted
before cpuX1 was marked as active but can't execute any work since
it's not completed initialization and stuck in check_tsc_sync_target.
As result system soft lockups in stop_machine_cpu_stop.
backtrace from reproducer:
PID: 3324 TASK: ffff88007c00ae20 CPU: other cpus COMMAND: "migration/1"
[exception RIP: stop_machine_cpu_stop+131]
...
#0 [ffff88007b4d7de8] cpu_stopper_thread at ffffffff810c66bd
#1 [ffff88007b4d7ee8] kthread at ffffffff8107871e
#2 [ffff88007b4d7f48] kernel_thread_helper at ffffffff8154af24
PID: 0 TASK: ffff88007c029710 CPU: 2 COMMAND: "swapper/2"
[exception RIP: check_tsc_sync_target+33]
...
#0 [ffff88007c025f30] start_secondary at ffffffff81539876
PID: 0 TASK: ffff88007c041710 CPU: 3 COMMAND: "swapper/3"
[exception RIP: stop_machine_cpu_stop+131]
...
#0 [ffff88007c04be50] stop_machine_from_inactive_cpu at ffffffff810c6b2f
#1 [ffff88007c04bee0] mtrr_ap_init at ffffffff8102e963
#2 [ffff88007c04bf10] identify_secondary_cpu at ffffffff81536799
#3 [ffff88007c04bf20] smp_store_cpu_info at ffffffff815396d5
#4 [ffff88007c04bf30] start_secondary at ffffffff81539800
Could be fixed by not marking being onlined cpu as active too early.
This explanation is completely useless. What's fixed by what. And is
it fixed or could it be fixed?
This also want's an explanation why moving the cpu_starting notifierI've checked in kernel users [sched, kvm, pmu] before moving it here.
does not hurt any assumptions of the code which has notifiers
registered for CPU_STARTING.
In fact your change can result inThat's certainly is not correct, it asks for a barrier after
CPU_ONLINE notifier being called _BEFORE_ CPU_STARTING. Do you really
think that's correct?
Aside of that your whole patch series tackles the wrong aspect.patch series tries to prevent a failed to boot cpu wreck havoc on
Why the heck do you need extra magic in check_tsc_sync_target() ?Because it's plainly racy. patch 2/5 describes/fixes race condition in
If the booting CPU fails to set the callin map within 5 seconds thenWhy it shouldn't reach check_tsc_sync_target () at all. There is nothing
it should not even reach check_tsc_sync_target() at all.
And just for the record, the new CPU can run into the very sameYes, it can.
timeout problem, when the boot CPU fails to set the callout mask.
This whole stuff is a complete trainwreck already and I don't want toFixing slow to respond cpu might be not option, so we need to gracefully
see anything like your "fixing the symptoms" hackery near it, really.
Rewrite will need to deal with failed to boot in time cpu as well.
This whole stuff needs a proper rewrite and not some more braindamaged
bandaids. And if we apply bandaids for the time being, then certainly
not bandaids like the mess you created.
Thanks,
tglx