Re: [Help Test] kdump, x86, acpi: Reproduce CPU0 SMI corruption issue after unsetting BSP flag
From: Eric W. Biederman
Date: Wed Aug 14 2013 - 15:46:28 EST
Jingbai Ma <jingbai.ma@xxxxxx> writes:
> I found a side effect of unsetting the BSP flag.
> It affects system reboot: once the BSP flag has been removed, issuing the
> reboot command hangs the system after the message:
> Restarting system.
> A hardware reset is then needed to recover.
>
> I have reproduced this problem on the following systems:
> HP EliteBook 6930p
> HP Compaq DC7700
> HP ProLiant DL980 (4 sockets, 40 cores)
>
> I have an idea: to avoid this kind of issue, we can unset the BSP flag in
> the first kernel during crash processing, and restore it in the second
> kernel during AP initialization.

The premise was that clearing the BSP flag would not be an issue. If we
could reliably count on unsetting the BSP flag during crash processing, we
could just switch to the BSP and be done, totally avoiding this problem.

Given that there are real-world issues with clearing the BSP flag,
I believe the alternate suggestion was to simply never attempt to start
the bootstrap processor during processor bring-up.

If, as normal, we are running on the bootstrap processor, everything will
work the same, but in the kdump scenario we will be short one core.
Being short one core seems like a reasonable tradeoff between
reliability and performance.
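
A rough sketch of that bring-up policy, as a userspace model rather than
actual kernel code. The struct cpu_desc type, the bring_up_cpus() helper
and the idea that the BSP's APIC ID is already known to the second kernel
are all assumptions made for illustration; the only hardware fact relied
on is that bit 8 of the IA32_APIC_BASE MSR (0x1B) is the BSP flag.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* IA32_APIC_BASE (MSR 0x1B): bit 8 is set on the bootstrap processor. */
#define APIC_BASE_BSP_FLAG (1u << 8)

/* Hypothetical per-CPU record for this sketch; not a kernel structure. */
struct cpu_desc {
	uint32_t apic_id;
	bool online;
};

/* True if the given IA32_APIC_BASE value belongs to the BSP. */
static bool running_on_bsp(uint64_t apic_base_msr)
{
	return (apic_base_msr & APIC_BASE_BSP_FLAG) != 0;
}

/*
 * Mark the current CPU online and start every other CPU except the BSP.
 * Assumes bsp_apic_id is known (hypothetical). Returns the number of
 * CPUs that were started, not counting the one we are running on.
 */
static size_t bring_up_cpus(struct cpu_desc *cpus, size_t n,
			    uint32_t self_apic_id, uint32_t bsp_apic_id)
{
	size_t started = 0;

	for (size_t i = 0; i < n; i++) {
		if (cpus[i].apic_id == self_apic_id) {
			cpus[i].online = true;	/* already running here */
			continue;
		}
		if (cpus[i].apic_id == bsp_apic_id)
			continue;		/* never try to start the BSP */
		cpus[i].online = true;
		started++;
	}
	return started;
}
```

On a normal boot self_apic_id equals bsp_apic_id, so the BSP check never
skips anything and every other CPU is started. In the kdump case the
crash kernel runs on an AP, the BSP is left alone, and the system comes
up one core short.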
Eric