[patch 8/9] x86: fix cpu hotplug crash

From: Greg KH
Date: Tue Jul 01 2008 - 11:24:45 EST


2.6.25-stable review patch. If anyone has any objections, please let us
know.

------------------

From: Yanmin Zhang <yanmin_zhang@xxxxxxxxxxxxxxx>

Commit fcb43042ef55d2f46b0efa5d7746967cef38f056 upstream

x86: fix cpu hotplug crash

Vegard Nossum reported crashes during cpu hotplug tests:

http://marc.info/?l=linux-kernel&m=121413950227884&w=4

In function _cpu_up, the panic happens when calling
__raw_notifier_call_chain at the second time. Kernel doesn't panic when
calling it at the first time. If just say because of nr_cpu_ids, that's
not right.

By checking the source code, I found that function do_boot_cpu is the culprit.
Consider below call chain:
_cpu_up=>__cpu_up=>smp_ops.cpu_up=>native_cpu_up=>do_boot_cpu.

So do_boot_cpu is called in the end. In do_boot_cpu, if
boot_error==true, cpu_clear(cpu, cpu_possible_map) is executed. So later
on, when _cpu_up calls __raw_notifier_call_chain at the second time to
report CPU_UP_CANCELED, because this cpu is already cleared from
cpu_possible_map, get_cpu_sysdev returns NULL.

Many resources are related to cpu_possible_map, so it's better not to
change it.

Below patch against 2.6.26-rc7 fixes it by removing the bit clearing in
cpu_possible_map.

Signed-off-by: Zhang Yanmin <yanmin_zhang@xxxxxxxxxxxxxxx>
Tested-by: Vegard Nossum <vegard.nossum@xxxxxxxxx>
Acked-by: Rusty Russell <rusty@xxxxxxxxxxxxxxx>
Signed-off-by: Ingo Molnar <mingo@xxxxxxx>
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxx>


---
arch/x86/kernel/smpboot_64.c | 1 -
1 file changed, 1 deletion(-)

--- a/arch/x86/kernel/smpboot_64.c
+++ b/arch/x86/kernel/smpboot_64.c
@@ -704,7 +704,6 @@ do_rest:
clear_bit(cpu, (unsigned long *)&cpu_initialized); /* was set by cpu_init() */
clear_node_cpumask(cpu); /* was set by numa_add_cpu */
cpu_clear(cpu, cpu_present_map);
- cpu_clear(cpu, cpu_possible_map);
per_cpu(x86_cpu_to_apicid, cpu) = BAD_APICID;
return -EIO;
}

--
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/