Re: CMA broken in next-20120926

From: Thierry Reding
Date: Fri Sep 28 2012 - 06:51:19 EST


On Fri, Sep 28, 2012 at 12:38:15PM +0200, Thierry Reding wrote:
> On Fri, Sep 28, 2012 at 12:32:07PM +0200, Thierry Reding wrote:
> > On Fri, Sep 28, 2012 at 11:27:28AM +0100, Mel Gorman wrote:
> > > On Fri, Sep 28, 2012 at 11:48:25AM +0300, Peter Ujfalusi wrote:
> > > > Hi,
> > > >
> > > > On 09/28/2012 11:37 AM, Mel Gorman wrote:
> > > > > >> I hope this patch fixes the bug. If it fixes the problem but
> > > > > >> the description needs work, or someone has a better idea,
> > > > > >> feel free to modify it and resend to akpm, please.
> > > > >>
> > > > >
> > > > > A full revert is overkill. Can the following patch be tested as a
> > > > > potential replacement please?
> > > > >
> > > > > ---8<---
> > > > > mm: compaction: Iron out isolate_freepages_block() and isolate_freepages_range() -fix1
> > > > >
> > > > > CMA is reported to be broken in next-20120926. Minchan Kim pointed out
> > > > > that this was due to nr_scanned != total_isolated in the case of CMA
> > > > > because PageBuddy pages are one scan but many isolations in CMA. This
> > > > > patch should address the problem.
> > > > >
> > > > > This patch is a fix for
> > > > > mm-compaction-acquire-the-zone-lock-as-late-as-possible-fix-2.patch
> > > > >
> > > > > Signed-off-by: Mel Gorman <mgorman@xxxxxxx>
> > > >
> > > > linux-next + this patch alone also works for me.
> > > >
> > > > Tested-by: Peter Ujfalusi <peter.ujfalusi@xxxxxx>
> > >
> > > Thanks Peter. I expect it also works for Thierry as I expect you were
> > > suffering the same problem but obviously confirmation of that would be nice.
> >
> > I've been running a few tests and indeed this solves the obvious problem
> > that the coherent pool cannot be created at boot (which in turn caused
> > the ethernet adapter to fail on Tegra).
> >
> > However I've been working on the Tegra DRM driver, which uses CMA to
> > allocate large chunks of framebuffer memory and these are now failing.
> > I'll need to check if Minchan's patch solves that problem as well.
>
> Indeed, with Minchan's patch the DRM can allocate the framebuffer
> without a problem. Something else must be wrong then.

However, depending on the size of the allocation, the failure also occurs
with Minchan's patch applied. This is what I see:

[ 60.736729] alloc_contig_range test_pages_isolated(1e900, 1f0e9) failed
[ 60.743572] alloc_contig_range test_pages_isolated(1ea00, 1f1e9) failed
[ 60.750424] alloc_contig_range test_pages_isolated(1ea00, 1f2e9) failed
[ 60.757239] alloc_contig_range test_pages_isolated(1ec00, 1f3e9) failed
[ 60.764066] alloc_contig_range test_pages_isolated(1ec00, 1f4e9) failed
[ 60.770893] alloc_contig_range test_pages_isolated(1ec00, 1f5e9) failed
[ 60.777698] alloc_contig_range test_pages_isolated(1ec00, 1f6e9) failed
[ 60.784526] alloc_contig_range test_pages_isolated(1f000, 1f7e9) failed
[ 60.791148] drm tegra: Failed to alloc buffer: 8294400

I'm pretty sure this did work before next-20120926.

Thierry
