Re: [problem] raid performance loss with 2.6.26-rc8 on 32-bit x86 (bisected)

From: Mel Gorman
Date: Thu Jul 03 2008 - 12:38:50 EST


On (02/07/08 22:54), Dan Williams didst pronounce:
>
> On Wed, 2008-07-02 at 22:00 -0700, Mel Gorman wrote:
>
> > Subject: [PATCH] Do not overwrite nr_zones on !NUMA when initialising zlcache_ptr
> >
> > With the two-zonelist patches on !NUMA machines, there really is only one
> > zonelist as __GFP_THISNODE is meaningless. However, build_zonelist_cache()
> > assumes that two zonelists exist when initialising zlcache_ptr, so the write
> > to the non-existent second zonelist overwrites pgdat->nr_zones and leaves it
> > at 0. As kswapd uses this value to determine what reclaim work is necessary,
> > kswapd never reclaims. Processes then stall frequently in low-memory
> > situations because they must always enter direct reclaim. This patch stops
> > initialising the second zlcache_ptr on !NUMA.
> >
> > Signed-off-by: Mel Gorman <mel@xxxxxxxxx>
> > ---
> > page_alloc.c | 1 -
> > 1 file changed, 1 deletion(-)
> >
> > diff -rup -X /usr/src/patchset-0.6/bin//dontdiff linux-2.6.26-rc8-clean/mm/page_alloc.c linux-2.6.26-rc8-fix-kswapd-on-numa/mm/page_alloc.c
> > --- linux-2.6.26-rc8-clean/mm/page_alloc.c 2008-06-24 18:58:20.000000000 -0700
> > +++ linux-2.6.26-rc8-fix-kswapd-on-numa/mm/page_alloc.c 2008-07-02 21:49:09.000000000 -0700
> > @@ -2328,7 +2328,6 @@ static void build_zonelists(pg_data_t *p
> >  static void build_zonelist_cache(pg_data_t *pgdat)
> >  {
> >  	pgdat->node_zonelists[0].zlcache_ptr = NULL;
> > -	pgdat->node_zonelists[1].zlcache_ptr = NULL;
> >  }
> >
> >  #endif /* CONFIG_NUMA */
> >
>
> Bug squished.
>
> # for i in `seq 1 5`; do dd if=/dev/zero of=/dev/md0 bs=1024k count=2048; done
> 2048+0 records in
> 2048+0 records out
> 2147483648 bytes (2.1 GB) copied, 7.73352 s, 278 MB/s
> 2048+0 records in
> 2048+0 records out
> 2147483648 bytes (2.1 GB) copied, 7.6845 s, 279 MB/s
> 2048+0 records in
> 2048+0 records out
> 2147483648 bytes (2.1 GB) copied, 7.74428 s, 277 MB/s
> 2048+0 records in
> 2048+0 records out
> 2147483648 bytes (2.1 GB) copied, 7.65959 s, 280 MB/s
> 2048+0 records in
> 2048+0 records out
> 2147483648 bytes (2.1 GB) copied, 7.73107 s, 278 MB/s
>
> Tested-by: Dan Williams <dan.j.williams@xxxxxxxxx>
>

Great news. Dan, thanks a lot for reporting and persisting with the testing
of various bits and pieces to get this pinned down. It is greatly appreciated.

--
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab