Re: "APIC error on CPU1: 00(40)" during resume

From: Ingo Molnar
Date: Sun Dec 21 2008 - 03:30:42 EST



* Frans Pop <elendil@xxxxxxxxx> wrote:

> On Wednesday 10 December 2008, Ingo Molnar wrote:
> > regarding those APIC error messages:
> > > ACPI: Waking up from system sleep state S3
> > > APIC error on CPU1: 00(40)
> > > ACPI: EC: non-query interrupt received, switching to interrupt
> >
> > that does suggest that the APIC was re-enabled (we dont get any APIC
> > error exceptions otherwise!), and its LVT was programmed as well, but
> > somehow we got an erroneous APIC message from an illegal vector.
>
> I wonder if this may help tracing the cause. Today I got a KERN_ERR in the
> middle of those messages:
>
> ACPI: Waking up from system sleep state S3
> BUG: sleeping function called from invalid context at kernel/sched.c:5571
> in_atomic(): 0, irqs_disabled(): 1, pid: 70, name: kacpid
> Pid: 70, comm: kacpid Not tainted 2.6.28-rc7-rjw #77
> Call Trace:
> [<ffffffff80345013>] ? acpi_os_release_object+0x9/0xd
> [<ffffffff8022d588>] __might_sleep+0xcf/0xd1
> [<ffffffff80234e32>] __cond_resched+0x15/0x4b
> [<ffffffff8043b1d7>] _cond_resched+0x2d/0x38
> [<ffffffff80357bc9>] acpi_ps_complete_op+0x235/0x24b
> [<ffffffff803582de>] acpi_ps_parse_loop+0x6ff/0x859
> [<ffffffff803574db>] acpi_ps_parse_aml+0x7c/0x2bb
> [<ffffffff80358a35>] acpi_ps_execute_method+0x144/0x213
> [<ffffffff80354e9e>] acpi_ns_evaluate+0x152/0x230
> [<ffffffff803452d0>] ? acpi_os_execute_deferred+0x0/0x39
> [<ffffffff8034c6a6>] acpi_ev_asynch_execute_gpe_method+0xc1/0x119
> [<ffffffff803452fc>] acpi_os_execute_deferred+0x2c/0x39
> [<ffffffff802480c7>] run_workqueue+0x95/0x12a
> [<ffffffff80248251>] worker_thread+0xf5/0x109
> [<ffffffff8024b9cb>] ? autoremove_wake_function+0x0/0x38
> [<ffffffff8024815c>] ? worker_thread+0x0/0x109
> [<ffffffff8024b678>] kthread+0x49/0x76
> [<ffffffff8020d009>] child_rip+0xa/0x11
> [<ffffffff8022da41>] ? pick_next_task_fair+0x8b/0x93
> [<ffffffff8024b62f>] ? kthread+0x0/0x76
> [<ffffffff8020cfff>] ? child_rip+0x0/0x11
> APIC error on CPU1: 00(40)
> ACPI: EC: non-query interrupt received, switching to interrupt mode
>
> This is the first time I've seen this error. Kernel is based on commit
> f6f7b52e2f61 (just after -rc7) and includes the final versions of the
> patches Rafael posted in this thread [1].
>
> More complete log available on request.

hm, that warning seems to show an ACPI bug (Len Cc:-ed): we preempt in an
atomic section - right during executing an AML scriptlet. Executing ACPI
AMLs is a rather fragile moment of the kernel: they are used by the BIOS
to indirectly instruct the kernel to tweak lowlevel chipset registers and
other platform details.

The kernel executes AMLs 'blindly' - they tweak details that Linux
typically has no knowledge about via any driver - so these things must
absolutely run atomic, and scheduling away in the wrong moment (which
means implicitly re-enabling interrupts) can leave the system in an
inconsistent state.

This 'blindness' and opaqueness of AML execution is perhaps the nastiest
aspect of the whole ACPI engine (because their opacity makes them
undebuggable and unfixable in essence). Nevertheless, it still might be
some unrelated phenomenon to your APIC illegal vector errors.

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/