Re: percpu crash on NetBurst

From: Avi Kivity
Date: Wed Sep 21 2011 - 10:28:39 EST


On 08/08/2011 12:55 PM, Tejun Heo wrote:
> Hello, Avi.
>
> On Sun, Aug 07, 2011 at 06:32:35PM +0300, Avi Kivity wrote:
> > qemu, under some conditions (-cpu host or -cpu kvm64), erroneously
> > passes family=15 as the virtual cpuid. This causes a BUG() in
> > percpu code during late boot:
> >
> > ------------[ cut here ]------------
> > kernel BUG at mm/percpu.c:577!
>
> <snip>
>
> > All this applies to v3.0; current upstream (c2f340a69ca) fails even
> > worse, haven't yet determined exactly why.
> >
> > I'm surprised this hasn't been reported before; Ingo, don't you have
> > family=15 hosts in your test farm?
>
> Hmmm... I can't trigger the problem w/ kvm64 (I tried mounting and
> unmounting filesystems but it worked okay) and am quite skeptical this
> is a wide spread problem given that the percpu core code is used very
> widely and hasn't seen a lot of changes lately. Is there anything
> specific you need to do to trigger the condition? Can you try to
> print out the s_files addresses being allocated and freed?


Coming back to this, the trigger is cpuid family=6 and model>=13 (model 12 works). It looks like the code disables rep_good if some MSR doesn't have the expected value. While we should configure the MSR correctly, it also looks like the fallback code for !rep_good is broken. Will look further.
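
For reference, the trigger condition above can be sketched in C. This is an illustrative sketch only, not the kernel's code: the struct and the helper names (cpu_sig, decode_cpuid_sig, hits_reported_trigger, rep_good_cleared) are hypothetical, and the MSR check is reduced to a boolean because the MSR is not identified here. The family/model decoding follows the documented Intel rule for CPUID leaf 1 EAX.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the decoded CPUID leaf 1 signature. */
struct cpu_sig {
    unsigned family;
    unsigned model;
};

/* Decode family/model from CPUID.1:EAX using the documented Intel rule. */
static struct cpu_sig decode_cpuid_sig(uint32_t eax)
{
    struct cpu_sig sig;
    unsigned base_family = (eax >> 8) & 0xf;
    unsigned base_model  = (eax >> 4) & 0xf;
    unsigned ext_family  = (eax >> 20) & 0xff;
    unsigned ext_model   = (eax >> 16) & 0xf;

    /* The extended family is only added when the base family is 15. */
    sig.family = base_family;
    if (base_family == 0xf)
        sig.family += ext_family;

    /* The extended model supplies the high nibble for families 6 and 15. */
    sig.model = base_model;
    if (base_family == 0x6 || base_family == 0xf)
        sig.model |= ext_model << 4;

    return sig;
}

/* Hypothetical helper: the range reported above as triggering the crash. */
static bool hits_reported_trigger(struct cpu_sig sig)
{
    return sig.family == 6 && sig.model >= 13;
}

/*
 * Hypothetical helper: rep_good would be cleared only when the CPU is in
 * the reported range AND the (unnamed) MSR lacks the expected value.
 */
static bool rep_good_cleared(struct cpu_sig sig, bool msr_has_expected_value)
{
    return hits_reported_trigger(sig) && !msr_has_expected_value;
}
```

With this decoding, a guest reporting family 6 / model 12 stays outside the checked range, while family 6 / model 13 falls inside it, which matches the observation that model 12 works.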

--
error compiling committee.c: too many arguments to function
