Re: [PATCH 3/3][RESEND] Provide control over unmapped pages (v4)

From: Balbir Singh
Date: Wed Feb 09 2011 - 06:16:30 EST


On 02/09/2011 05:27 AM, Andrew Morton wrote:
> On Tue, 01 Feb 2011 22:25:45 +0530
> Balbir Singh <balbir@xxxxxxxxxxxxxxxxxx> wrote:
>
>> Changelog v4
>> 1. Add max_unmapped_ratio and use that as the upper limit
>> to check when to shrink the unmapped page cache (Christoph
>> Lameter)
>>
>> Changelog v2
>> 1. Use a config option to enable the code (Andrew Morton)
>> 2. Explain the magic tunables in the code or at-least attempt
>> to explain them (General comment)
>> 3. Hint uses of the boot parameter with unlikely (Andrew Morton)
>> 4. Use better names (balanced is not a good naming convention)
>>
>> Provide control using zone_reclaim() and a boot parameter. The
>> code reuses functionality from zone_reclaim() to isolate unmapped
>> pages and reclaim them as a priority, ahead of other mapped pages.
>> A new sysctl for max_unmapped_ratio is provided and set to 16,
>> indicating 16% of the total zone pages are unmapped, we start
>> shrinking unmapped page cache.
>
> We'll need some documentation for sysctl_max_unmapped_ratio, please.
> In Documentation/sysctl/vm.txt, I suppose.
>
> It will be interesting to find out what this ratio refers to. it
> apears to be a percentage. We've had problem in the past where 1% was
> way too much and we had to change the kernel to provide much
> finer-grained control.
>

Sure, I'll update the Documentation as a part of this patchset. Yes, the current
min_unmapped_ratio is a percentage and so is max_unmapped_ratio. min_unmapped_ratio
already exists, adding max_ should not affect granularity of control. It
will be worth relooking at the granularity based on user feedback and
experience. We won't break ABI if we add additional interfaces to help
granularity.


>>
>> ...
>>
>> --- a/include/linux/mmzone.h
>> +++ b/include/linux/mmzone.h
>> @@ -306,7 +306,10 @@ struct zone {
>> /*
>> * zone reclaim becomes active if more unmapped pages exist.
>> */
>> +#if defined(CONFIG_UNMAPPED_PAGE_CONTROL) || defined(CONFIG_NUMA)
>> unsigned long min_unmapped_pages;
>> + unsigned long max_unmapped_pages;
>> +#endif
>
> This change breaks the connection between min_unmapped_pages and its
> documentation, and fails to document max_unmapped_pages.
>

I'll fix that

> Also, afacit if CONFIG_NUMA=y and CONFIG_UNMAPPED_PAGE_CONTROL=n,
> max_unmapped_pages will be present in the kernel image and will appear
> in /proc but it won't actually do anything. Seems screwed up and
> misleading.
>


Good catch! In one of the emails Christoph mentioned that max_unmapped_ratio
might be helpful even in the general case (but we need to work on that later).
For now, I'll fix this and repose.


>> ...
>>
>> +#if defined(CONFIG_UNMAPPED_PAGECACHE_CONTROL)
>> +/*
>> + * Routine to reclaim unmapped pages, inspired from the code under
>> + * CONFIG_NUMA that does unmapped page and slab page control by keeping
>> + * min_unmapped_pages in the zone. We currently reclaim just unmapped
>> + * pages, slab control will come in soon, at which point this routine
>> + * should be called reclaim cached pages
>> + */
>> +unsigned long reclaim_unmapped_pages(int priority, struct zone *zone,
>> + struct scan_control *sc)
>> +{
>> + if (unlikely(unmapped_page_control) &&
>> + (zone_unmapped_file_pages(zone) > zone->min_unmapped_pages)) {
>> + struct scan_control nsc;
>> + unsigned long nr_pages;
>> +
>> + nsc = *sc;
>> +
>> + nsc.swappiness = 0;
>> + nsc.may_writepage = 0;
>> + nsc.may_unmap = 0;
>> + nsc.nr_reclaimed = 0;
>> +
>> + nr_pages = zone_unmapped_file_pages(zone) -
>> + zone->min_unmapped_pages;
>> + /*
>> + * We don't want to be too aggressive with our
>> + * reclaim, it is our best effort to control
>> + * unmapped pages
>> + */
>> + nr_pages >>= 3;
>> +
>> + zone_reclaim_pages(zone, &nsc, nr_pages);
>> + return nsc.nr_reclaimed;
>> + }
>> + return 0;
>> +}
>
> This returns an undocumented ulong which is never used by callers.
>

Good catch! I;ll remove the return value, I don't expect it to be used
to check how much we could reclaim.

Thanks for the review!

--
Three Cheers,
Balbir
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/