Mark Lord <lkml@xxxxxx> writes:
Eric W. Biederman wrote:
Mark Lord <lkml@xxxxxx> writes:
You could search kernel archives from last week,
Ok.
- It could be that we changed the initialization order.
Right. No fancy command line overrides here.
- It could be that we changed the set of sysdev devices.
- It could be that cpu_down is doing something extra that we are not
doing.
My guess is that Mark Lord, and Thomas Gleixner are sharp enough to
have checked their command line parameters before submitting a kernel
bugfix patch. Although it would not hurt to have confirmation of
that.
Simple questions.
- 32bit or 64bit x86?
- Is this new failure mode or a regression?
- By hang on power off I assume you mean that the power does
not go off?
- Is there anything else you can tell me about this failure mode?
when lots more info was posted in the original trouble reports.
Sure. I will. I was starting off a bit lazy.
My system is Core2Duo DualCore 1.86GHz, pure 32bit kernel/user.
System prints "Power Down." message, but power stays on.
Broken in 2.6.17, and in 2.6.23-rc*. No other versions attempted.
Ok. So apparently not a regression.
Mark, Thomas given the difficulty in reproducing the failure if
someone has time I would love to see what happens if for testing:
set_cpus_allowed(current, cpumask_of_cpu(1)); (instead of
disable_nonboot_cpus)
System behaved fine on six consecutive poweroffs with that.
This could be due to subtle timing changes induced by the
context switches this produces, though. I also tried it
as the very first line in the same function, and saw no change.
Thanks. This confirms something. This is not an issue of which
cpu you are running on (or else by forcing you onto the second
core it would always fail). At most it may be something
happening on one cpu and then getting switched to another cpu.
So this looks like something extra that cpu_down is doing.
I then removed that one line (and left out the disable_nonboot_cpus as well),
and it failed to power off on the very next attempt (got lucky).
Ok.
Changed it to set_cpus_allowed(current, cpumask_of_cpu(0)),
and it survived six consecutive poweroffs.
Changed it back to disable_nonboot_cpus(..),
and it also survived another four consecutive poweroffs.
I also verified that both CPUs were enabled on entry to machine_poweroff().
That both CPUs are enabled on entry to machine_poweroff is expected,
and normal.
The code path on i386 should be:
machine_power_off
  native_machine_power_off
    machine_shutdown(); (which disables the other cpus)
      smp_call_function
        stop_this_cpu (on each cpu to be stopped.)
    pm_power_off(); (which turns off the power)
This does sound like a race of some sort.
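
For anyone following along, the call order listed above can be sketched as a small userspace model. This is only an illustration of the ordering, not kernel code: every model_* name is a hypothetical stand-in, the two-CPU count is an assumption taken from the Core2Duo report, and the model runs stop_this_cpu sequentially, whereas the real smp_call_function runs the function on the other cpus concurrently. So it cannot reproduce a race; it only makes explicit that pm_power_off is reached after the other cpu has been asked to stop.

```c
/* Userspace model of the i386 power-off call order described in this thread.
 * All model_* names are hypothetical stand-ins, not kernel functions. */
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MODEL_MAX_STEPS 16
#define MODEL_NR_CPUS   2   /* assumption: dual-core machine, as in the report */

const char *model_trace[MODEL_MAX_STEPS];
int model_trace_len;
int model_cpu_online[MODEL_NR_CPUS] = { 1, 1 };

/* Append one step name to the recorded trace. */
void model_record(const char *step)
{
    assert(model_trace_len < MODEL_MAX_STEPS);
    model_trace[model_trace_len++] = step;
}

/* Stand-in for stop_this_cpu: marks one cpu as stopped. */
void model_stop_this_cpu(int cpu)
{
    model_record("stop_this_cpu");
    model_cpu_online[cpu] = 0;
}

/* Stand-in for smp_call_function: runs fn for every online cpu
 * except the caller. Sequential here; concurrent in the kernel. */
void model_smp_call_function(void (*fn)(int), int this_cpu)
{
    model_record("smp_call_function");
    for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
        if (cpu != this_cpu && model_cpu_online[cpu])
            fn(cpu);
}

void model_machine_shutdown(int this_cpu)
{
    model_record("machine_shutdown");
    model_smp_call_function(model_stop_this_cpu, this_cpu);
}

void model_pm_power_off(void)
{
    model_record("pm_power_off");
}

void model_native_machine_power_off(int this_cpu)
{
    model_record("native_machine_power_off");
    model_machine_shutdown(this_cpu);   /* stop the other cpus first */
    model_pm_power_off();               /* then cut the power */
}

void model_machine_power_off(int this_cpu)
{
    model_record("machine_power_off");
    model_native_machine_power_off(this_cpu);
}

/* Count cpus still marked online in the model. */
int model_online_cpus(void)
{
    int n = 0;
    for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
        n += model_cpu_online[cpu];
    return n;
}

/* Return 1 if trace entry i exists and equals name. */
int model_trace_is(int i, const char *name)
{
    return i >= 0 && i < model_trace_len && strcmp(model_trace[i], name) == 0;
}

/* Print the recorded steps in the order they happened. */
void model_print_trace(void)
{
    for (int i = 0; i < model_trace_len; i++)
        printf("%d: %s\n", i, model_trace[i]);
}
```

Calling model_machine_power_off(0) followed by model_print_trace() lists the steps in the order they ran. In the model only the calling cpu is still online when pm_power_off is recorded, which is the state the real path is supposed to reach.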