Re: [Bisected] Regression: Hang on boot in schedule_timeout_interruptible during ACPI init on SMP

From: Glauber Costa
Date: Wed Jul 23 2008 - 20:31:25 EST


On Wed, Jul 23, 2008 at 7:46 PM, Andrew Drake <zappacky.lists@xxxxxxxxx> wrote:
> Here's a puzzler for you all,
>
> On my laptop (an ACPI-enabled SMP system), the system hangs during the
> "acpi_init" function. I traced it to the schedule_timeout_interruptible
> function, which is called if Sleep() is encountered in the DSDT code in one
> of the _STA or _INI functions. In this case, I have one in each, and it
> hangs twice.
>
> The value being passed to schedule_timeout_interruptible is sane (i.e. <=
> 25), but the function never returns. Triggering an interrupt (i.e. jiggling
> the power button) causes the boot to continue.
>
> Passing nosmp causes the problem to disappear (but at what an expense!).
>
> I noticed this in the latest kernel, and in some 2.6.25-ish kernels, and
> decided to hunt it down. On Linus's tree, the latest good commit was:
>
> commit 1161705bd66df0c80fa45e87190e456c02e6f145
> Author: Ingo Molnar <mingo@xxxxxxx>
> Date: Wed Mar 19 20:26:15 2008 +0100
>
> x86: fill cpu to apicid and present map in mpparse, fix
> Signed-off-by: Ingo Molnar <mingo@xxxxxxx>
>
>
> and the earliest bad commit was:
>
> commit 802b8133b4f78c30a2668d142d78861e27c0c6a7
> Author: Glauber de Oliveira Costa <gcosta@xxxxxxxxxx>
> Date: Wed Mar 19 14:25:41 2008 -0300
>
> x86: schedule work only if keventd is already running
> Only call schedule_work if keventd is already running.
> This is already the way x86_64 does
> Signed-off-by: Glauber Costa <gcosta@xxxxxxxxxx>
> Signed-off-by: Ingo Molnar <mingo@xxxxxxx>
>
> There are about 14 commits in-between these two; I was unable to bisect any
> further because all 14 of the in-between commits either oops, panic, or
> hang setting up the timer (it appears that the commit immediately following
> the known-good one introduces the timer failure, which lasts up until the
> known-bad one).
>
> The change "x86: schedule work only if keventd is already running" modifies
> smp_boot, which puts it, in my mind, as the most likely culprit.

On the other hand, the commit right after the good one modifies the
mptable, and may have messed up the APIC IDs for your box. Could you please
attach a dmesg with APIC debug messages turned on? Any commit you
can boot is fine, provided you tell me which one you used.

>
> Anybody have any ideas? I'm willing to write a patch if somebody can help me
> track down the root cause.

Any other information about your box may be helpful. The output of
cat /proc/cpuinfo and the dmesg I mentioned are good starting points. If
that's too much information for an email, opening a Bugzilla ticket is
fine with me too.
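
A minimal sketch of gathering that information in one go, assuming the
kernel was booted with apic=debug on its command line (the x86 boot
parameter that raises APIC message verbosity). The function name
collect_boot_info and the output file names are made up for illustration;
adjust them as needed.

```shell
#!/bin/sh
# Sketch only: collect the boot information asked for above into one
# directory. Assumes a Linux system booted with "apic=debug".
# collect_boot_info and the file names below are hypothetical.
collect_boot_info() {
    outdir="${1:-boot-report}"
    mkdir -p "$outdir" || return 1

    # Record which kernel was actually booted for this report.
    uname -r > "$outdir/kernel-version.txt"

    # Kernel command line, to confirm apic=debug was in effect.
    cat /proc/cmdline > "$outdir/cmdline.txt" 2>/dev/null || true

    # CPU details.
    cat /proc/cpuinfo > "$outdir/cpuinfo.txt" 2>/dev/null || true

    # Kernel log; reading it may need root where dmesg access is restricted.
    dmesg > "$outdir/dmesg.txt" 2>&1 || echo "dmesg failed; try as root" >&2

    return 0
}
```

Running collect_boot_info ./report right after boot should leave four text
files in ./report that can be attached to a reply or to a bug ticket.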

>
> Thanks,
>
> Andrew
>
> P.S. I'm willing to provide any information that you'd like to see, like
> my .config or my DSDT (disassembled or otherwise). I didn't include it
> in this email because I wasn't sure what would be helpful.
Yeah, sure. .config helps too.

> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>



--
Glauber Costa.
"Free as in Freedom"
http://glommer.net

"The less confident you are, the more serious you have to act."