Re: 2.6.32.5 regression: page allocation failure. order:1,

From: Mel Gorman
Date: Fri Jan 29 2010 - 12:34:48 EST


On Fri, Jan 29, 2010 at 05:27:56PM +0000, Hugh Dickins wrote:
> On Fri, 29 Jan 2010, Mel Gorman wrote:
> > On Fri, Jan 29, 2010 at 08:56:20AM -0500, Mark Lord wrote:
> >
> > > I'll leave it running for another day or so, and then perhaps revert
> > > the one patch to see which of the two things (new kernel, or patch)
> > > is responsible for the difference.
> > >
> >
> > Thanks, I'd appreciate it. While I'm reasonably confident the problem is
> > with MIGRATE_RESERVE not being free as intended and that the patch fixes
> > it, I'd like more proof.
>
> You're more confident about that than I am! It will be very satisfying
> if my patch turns out to make the difference, but still surprising to me.
>

My confidence/delusion is in part due to the reason MIGRATE_RESERVE
exists in the first place. Specifically, certain wireless network drivers
were doing a lot of GFP_ATOMIC order-2 allocations and failing miserably when
anti-fragmentation was first introduced. The problem came down to a property
of the buddy allocator implementation: it kept min_free_kbytes worth of pages
free at the lower addresses of the zone. That is where the order-2 allocations
were being made from and quickly freed, meaning the area was generally
available most of the time. MIGRATE_RESERVE was introduced to preserve that
property of the buddy allocator, and the allocation failure problems went away.
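
As a rough illustration of the sizing involved, the sketch below estimates
how many pageblocks would have to be set aside to cover a min watermark.
It is a userspace toy, not the kernel code: the 4 KiB page size, the
pageblock order of 9, the single-zone view and the helper names
(roundup_ul, reserve_pageblocks, kbytes_to_pages) are all assumptions
made for the example.

```c
#include <assert.h>

/* Assumed values for illustration; real ones depend on arch and config. */
#define PAGE_SIZE_KB        4UL                      /* assumed 4 KiB pages */
#define PAGEBLOCK_ORDER     9                        /* assumed pageblock order */
#define PAGEBLOCK_NR_PAGES  (1UL << PAGEBLOCK_ORDER) /* 512 pages per block */

/* Round x up to the next multiple of y (y must be non-zero). */
static unsigned long roundup_ul(unsigned long x, unsigned long y)
{
    assert(y != 0);
    return ((x + y - 1) / y) * y;
}

/* Convert a min_free_kbytes-style value to pages, rounding down. */
static unsigned long kbytes_to_pages(unsigned long kbytes)
{
    return kbytes / PAGE_SIZE_KB;
}

/*
 * Hypothetical helper: number of whole pageblocks needed to hold
 * min_wmark_pages pages, treating the watermark as belonging to one zone.
 */
static unsigned long reserve_pageblocks(unsigned long min_wmark_pages)
{
    return roundup_ul(min_wmark_pages, PAGEBLOCK_NR_PAGES) >> PAGEBLOCK_ORDER;
}
```

Under those assumptions, a watermark of 4096 kbytes is 1024 pages, which
reserve_pageblocks() rounds to 2 pageblocks. Any watermark between 1 and
512 pages still costs a full pageblock.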

Mark's problem feels very similar to the wireless network driver problem.

Maybe I'm deluding myself.

> Thank you for taking the time on this, Mark: I too would appreciate it
> if you could later determine whether it's new kernel or patch solving it.
>
> >
> > Hugh, can I get a signed-off-by on that patch please? I can improve the
> > changelog if you like and send it to Andrew for merging if you like.
>
> I was adjusting the changelog and about to send direct to Linus Cc stable
> in a few minutes, since I'm guessing there might be a 33-rc6 today, which
> would be a pity to miss.
>

Great.

> Whatever my reluctance to assume it's the fix to Mark's problem (which I'm
> not mentioning in the changelog), we are both sure it's a valid bugfix.
>

Indeed. Thanks.

--
Mel Gorman
Part-time PhD Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab