Re: WARNING: at arch/x86/kernel/smpboot.c:310 topology_sane.clone.1+0x6e/0x81()

From: Andreas Herrmann
Date: Tue May 29 2012 - 11:30:23 EST


On Tue, May 29, 2012 at 04:51:46PM +0200, Peter Zijlstra wrote:
> On Tue, 2012-05-29 at 15:54 +0200, Borislav Petkov wrote:
> > Dudes,
> >
> > I'm getting the warning below on current linus. AFAICT, it is caused by
> >
> > static bool __cpuinit match_mc(struct cpuinfo_x86 *c, struct cpuinfo_x86 *o)
> > {
> > 	if (c->phys_proc_id == o->phys_proc_id)
> > 		return topology_sane(c, o, "mc");
> >
> > 	return false;
> > }
> >
> > and the reason is, IMHO, that this is an MCM box with two nodes in one
> > physical package: phys_proc_id is 0 on both CPU6 and CPU0, but CPUs 0-5
> > are on internal node 0 and CPUs 6-11 are on internal node 1, so the
> > warning fires.
> >
> > Maybe we could do something like this untested hunk:
> >
> > diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
> > index 433529e29be4..e52538cd48bb 100644
> > --- a/arch/x86/kernel/smpboot.c
> > +++ b/arch/x86/kernel/smpboot.c
> > @@ -348,7 +348,8 @@ static bool __cpuinit match_llc(struct cpuinfo_x86 *c, struct cpuinfo_x86 *o)
> >  static bool __cpuinit match_mc(struct cpuinfo_x86 *c, struct cpuinfo_x86 *o)
> >  {
> >  	if (c->phys_proc_id == o->phys_proc_id)
> > -		return topology_sane(c, o, "mc");
> > +		if (!cpu_has(c, X86_FEATURE_AMD_DCM))
> > +			return topology_sane(c, o, "mc");
> >
> >  	return false;
> >  }
> >
> > or you have a better idea...?
>
> Ah,.. uhm.. unfortunate this... we only seem to use cpu_core_mask for
> topology_core_cpumask() and its purpose is to enumerate cores in a
> package for some very limited generic functions.
>
> It's a bit sad we defined it thus, the multi-core concept only really
> makes sense if you share caches, otherwise it's just SMP.
>
> Also, our generic topology as defined doesn't match nodes. Which is
> weird to say the least.
>
> I'd almost be tempted to say you should fake phys_id, but I can only
> imagine what all would explode if we'd do that :-)
>
> Yeah, I guess we should do the thing you propose, unless someone else
> has a sane idea?

I've also looked at this. core_siblings mask is broken with this patch.
And there is this new irritating warning ...

I second Boris' suggestion for a fix. But I think the check for
X86_FEATURE_AMD_DCM should go into topology_sane(), which in theory
could check other things as well.
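
To illustrate moving the check, here is a minimal userspace sketch, not
kernel code. struct cpu_sketch and its node and amd_dcm fields are
hypothetical stand-ins for struct cpuinfo_x86, cpu_to_node() and
cpu_has(c, X86_FEATURE_AMD_DCM). It also assumes that topology_sane()
should report "sane" for DCM parts, so that CPUs in the same package stay
siblings; that is one possible reading and it differs from the hunk
quoted above, where match_mc() falls through to "return false".

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel's struct cpuinfo_x86. */
struct cpu_sketch {
	int cpu_index;
	int phys_proc_id;
	int node;	/* stand-in for cpu_to_node(cpu_index) */
	bool amd_dcm;	/* stand-in for cpu_has(c, X86_FEATURE_AMD_DCM) */
};

/*
 * All sanity checks live here, so callers such as match_mc() stay free
 * of vendor-specific tests.
 */
static bool topology_sane(const struct cpu_sketch *c,
			  const struct cpu_sketch *o, const char *name)
{
	/*
	 * Assumption: a multi-node package legitimately spans nodes, so
	 * the node comparison is skipped for it.
	 */
	if (c->amd_dcm)
		return true;

	if (c->node != o->node) {
		/* Illustrative message only, not the kernel's WARN text. */
		fprintf(stderr,
			"CPU #%d's %s-sibling CPU #%d is not on the same node\n",
			c->cpu_index, name, o->cpu_index);
		return false;
	}

	return true;
}

/* Unchanged from the original: only the package id is compared here. */
static bool match_mc(const struct cpu_sketch *c, const struct cpu_sketch *o)
{
	if (c->phys_proc_id == o->phys_proc_id)
		return topology_sane(c, o, "mc");

	return false;
}
```

With CPU0 and CPU6 from the report (same phys_proc_id, nodes 0 and 1),
match_mc() in this sketch returns true when amd_dcm is set; without it,
the message is printed and the result is false.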


Andreas

