Re: [patch] x86: 2.6.31-rc7 crash due to buggy flat_phys_pkg_id

From: Ingo Molnar
Date: Tue Aug 25 2009 - 15:14:01 EST



* Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:

> On Tue, 25 Aug 2009, Yinghai Lu wrote:
> >
> > initial apic id and apic id could be different.
> >
> > and we should use initial apic id to get correct phys pkg id in
> > case BIOS set crazy apic id.
>
> Yinghai - I think you missed Cyrills' point. Let me repeat it:
>
> "cpu_has_apic bit turned off"
>
> there's no apic. No "initial apic id". No "phys pkg id". No
> nothing.
>
> Discussions about "correct phys pkg id" are pointless.

that's not the case here though:

[ 8.713916] Total of 32 processors activated (162314.96 BogoMIPS).

so APICs are active. The real difference is i think this aspect of
commit 2759c3287:

static int flat_phys_pkg_id(int initial_apic_id, int index_msb)
{
- return hard_smp_processor_id() >> index_msb;
+ return initial_apic_id >> index_msb;
}

We need to revert back to .30 behavior here. (In case of which
environment to trust we almost always trust whatever booted millions
of Linux boxes in the past already.)

Furthermore, commit 2759c3287 did not declare any side-effects and
clearly causes a side-effect on vSMP which apparently has an
overlapping set of initial APIC ids.

Ravikiran, your patch does not do a clear revert of this bit though.
If you do a plain revert of the line above alone, does that fix the
problem too?

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/