Re: [-rc7 regression] Buggy commit: "mm: use aligned zone start for pfn_to_bitidx calculation"

From: Linus Torvalds
Date: Sat Feb 16 2013 - 13:26:57 EST


On Fri, Feb 15, 2013 at 3:44 AM, Ingo Molnar <mingo@xxxxxxxxxx> wrote:
>>
>> c060f943d092 may be related as your config does not have
>> CONFIG_SPARSEMEM defined.
>
> Right, that's the commit causing the x86 regression:
>
> c060f943d0929f3e429c5d9522290584f6281d6e is the first bad commit
> commit c060f943d0929f3e429c5d9522290584f6281d6e
> Date: Fri Jan 11 14:31:51 2013 -0800
>
> mm: use aligned zone start for pfn_to_bitidx calculation

Ok, looking more at this, I don't really want to revert it, and I have
an idea of what is wrong.

When we allocate the zone use bitmap, we do not take the
zone_start_pfn into account. So I *think* that what happens is that
"pfn_to_bitidx()" simply overruns the allocation for unaligned zones,
and the spinlock just happens to be right after (or the overrun causes
some other memory corruption that then indirectly causes the spinlock
corruption).

So I'm wondering if the fix is simply something like the attached
patch. It takes the zone_start_pfn into account when allocating the
zone bitmap.
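
To make the suspected overrun concrete, here is a minimal userspace
sketch. It is not the attached patch, only an illustration of the
arithmetic. The constant values (a pageblock order of 9 and 4 bits per
pageblock) and all helper names are assumptions chosen for the example;
the real values depend on the kernel configuration.

```c
#include <stddef.h>

/* Assumed values, for illustration only. */
#define PAGEBLOCK_ORDER     9UL
#define PAGEBLOCK_NR_PAGES  (1UL << PAGEBLOCK_ORDER)
#define NR_PAGEBLOCK_BITS   4UL
#define LONG_BITS           (8UL * sizeof(unsigned long))

/* Round x up to the next multiple of m. */
static unsigned long round_up_to(unsigned long x, unsigned long m)
{
	return ((x + m - 1) / m) * m;
}

/*
 * Hypothetical model of the bit index once it is computed relative to
 * the pageblock-aligned zone start rather than the raw zone start.
 */
unsigned long bitidx_aligned(unsigned long zone_start_pfn, unsigned long pfn)
{
	unsigned long aligned_start = zone_start_pfn & ~(PAGEBLOCK_NR_PAGES - 1);

	return ((pfn - aligned_start) >> PAGEBLOCK_ORDER) * NR_PAGEBLOCK_BITS;
}

/* Bitmap size in bits when only the zone size is considered. */
unsigned long usemap_bits_old(unsigned long zone_pages)
{
	unsigned long blocks =
		round_up_to(zone_pages, PAGEBLOCK_NR_PAGES) >> PAGEBLOCK_ORDER;

	return round_up_to(blocks * NR_PAGEBLOCK_BITS, LONG_BITS);
}

/*
 * Bitmap size in bits when the offset of the zone start within its
 * pageblock is added to the zone size before rounding.
 */
unsigned long usemap_bits_new(unsigned long zone_start_pfn,
			      unsigned long zone_pages)
{
	zone_pages += zone_start_pfn & (PAGEBLOCK_NR_PAGES - 1);

	return usemap_bits_old(zone_pages);
}
```

With these assumed values, a zone that starts at pfn 256 and spans 8192
pages ends at pfn 8447. The size-only calculation gives 64 bits, but the
last pfn maps to bit index 64, so its 4 bits lie just past the end of the
bitmap. Adding the start offset before rounding makes the bitmap large
enough to cover that index.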

Laura? Mel?

Ingo, can you test this? I was going to do the 3.8 release today, but I guess
I can just wait, and if you can test this we could get it in..

Linus

Attachment: patch.diff
Description: Binary data