Re: [PATCH] make try_to_free_pages walk zonelist

From: Andrew Morton
Date: Sat Dec 20 2003 - 20:00:57 EST


Rik van Riel <riel@xxxxxxxxxxx> wrote:
>
> In 2.6.0 both __alloc_pages() and the corresponding
> wakeup_kswapd()s walk all zones in the zone list,
> possibly spanning multiple nodes in a low numa factor
> system like AMD64.
>
> Also, if lower_zone_protection is set in /proc, then
> it may be possible that kswapd never cleans out data
> in zones further down the zonelist and try_to_free_pages
> needs to do that.
>
> However, in 2.6.0 try_to_free_pages() only frees pages
> in the pgdat the first zone in the zonelist belongs to.
>
> This is probably the wrong behaviour, since both the
> page allocator and the kswapd wakeup free things from
> all zones on the zonelist. The following patch makes
> try_to_free_pages() consistent with the allocator, by
> passing the zonelist as an argument and freeing pages
> from all zones in the list.

hm, OK, so this should be a no-op for non-NUMA setups.

Prior to merging such a change it would be nice to have confirmed
before-and-after testing on real NUMA hardware. Rather than saying "gee,
this should help", but never knowing how much it really did help.

Could you suggest a suitable worklaod for Andi and Martin to test sometime?

Meanwhile, I'll mmify it for a bit of soak testing, thanks.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/