Re: [-rc7 regression] Buggy commit: "mm: use aligned zone start for pfn_to_bitidx calculation"

From: Ingo Molnar
Date: Mon Feb 18 2013 - 03:49:47 EST

* Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:

> > Right, that's the commit causing the x86 regression:
> >
> > c060f943d0929f3e429c5d9522290584f6281d6e is the first bad commit
> > commit c060f943d0929f3e429c5d9522290584f6281d6e
> > Date: Fri Jan 11 14:31:51 2013 -0800
> >
> > mm: use aligned zone start for pfn_to_bitidx calculation
>
> Ok, looking more at this, I don't really want to revert it,
> and I have an idea of what is wrong.
>
> When we allocate the zone use bitmap, we do not take the
> zone_start_pfn into account. So I *think* that what happens is
> that "pfn_to_bitidx()" simply overruns the allocation for
> unaligned zones, and the spinlock just happens to be right
> after (or the overrun causes some other memory corruption that
> then indirectly causes the spinlock corruption).
>
> So I'm wondering if the fix is simply something like the
> attached patch. It takes the zone_start_pfn into account when
> allocating the zone bitmap.
>
> Laura? Mel?
>
> Ingo, can you test this? I was going to do the 3.8 today, but
> I guess I can just wait, and if you can test this we could get
> it in..

Yes, your patch fixes the bug: with the patch applied to
f741656d646f plus the failing .config the system booted up
just fine.

I also double checked that vanilla upstream f741656d646f still
locks up - so it's your patch that made the difference.
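
For anyone following along, the overrun described above can be
sketched in user space. This is only an illustration of the
arithmetic, not the attached patch: the constants (pageblock order 9,
4 bits per pageblock, 64-bit longs) and the helper names
usemap_size_old() and usemap_size_fixed() are assumptions chosen for
the example.

```c
#include <stddef.h>

/* Assumed values for illustration; the real ones depend on the .config. */
#define PAGEBLOCK_ORDER     9UL
#define PAGEBLOCK_NR_PAGES  (1UL << PAGEBLOCK_ORDER)
#define NR_PAGEBLOCK_BITS   4UL
#define BITS_PER_LONG_SIM   64UL   /* stands in for 8 * sizeof(unsigned long) */

/* Round x up or down to a multiple of m. */
static unsigned long round_up_to(unsigned long x, unsigned long m)
{
	return ((x + m - 1) / m) * m;
}

static unsigned long round_down_to(unsigned long x, unsigned long m)
{
	return (x / m) * m;
}

/* Bitmap size in bytes when only the zone size is considered. */
static unsigned long usemap_size_old(unsigned long zonesize)
{
	unsigned long bits;

	bits = round_up_to(zonesize, PAGEBLOCK_NR_PAGES) >> PAGEBLOCK_ORDER;
	bits *= NR_PAGEBLOCK_BITS;
	bits = round_up_to(bits, BITS_PER_LONG_SIM);
	return bits / 8;
}

/* Same, but padded by the offset of the zone start inside its pageblock. */
static unsigned long usemap_size_fixed(unsigned long zone_start_pfn,
				       unsigned long zonesize)
{
	zonesize += zone_start_pfn & (PAGEBLOCK_NR_PAGES - 1);
	return usemap_size_old(zonesize);
}

/* Bit index computed relative to the pageblock-aligned zone start. */
static unsigned long pfn_to_bitidx(unsigned long zone_start_pfn,
				   unsigned long pfn)
{
	pfn -= round_down_to(zone_start_pfn, PAGEBLOCK_NR_PAGES);
	return (pfn >> PAGEBLOCK_ORDER) * NR_PAGEBLOCK_BITS;
}
```

With a zone starting at pfn 256 and spanning 8192 pages, the old
sizing yields 8 bytes (64 bits), but the last pfn (8447) maps to bit
index 64, i.e. just past the end of the allocation. With the start
offset included in the size, the bitmap grows to 16 bytes and the
index fits.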

Tested-by: Ingo Molnar <mingo@xxxxxxxxxx>


