Re: [PATCH 1/4] mm: vmscan: Correct check for kswapd sleeping in sleeping_prematurely

From: Pádraig Brady
Date: Wed Jun 29 2011 - 06:58:33 EST


On 28/06/11 22:49, Andrew Morton wrote:
> On Fri, 24 Jun 2011 15:44:54 +0100
> Mel Gorman <mgorman@xxxxxxx> wrote:
>
>> During allocator-intensive workloads, kswapd will be woken frequently
>> causing free memory to oscillate between the high and min watermark.
>> This is expected behaviour.
>>
>> A problem occurs if the highest zone is small. balance_pgdat()
>> only considers unreclaimable zones when priority is DEF_PRIORITY
>> but sleeping_prematurely considers all zones. It's possible for this
>> sequence to occur
>>
>> 1. kswapd wakes up and enters balance_pgdat()
>> 2. At DEF_PRIORITY, marks highest zone unreclaimable
>> 3. At DEF_PRIORITY-1, ignores highest zone setting end_zone
>> 4. At DEF_PRIORITY-1, calls shrink_slab freeing memory from
>> highest zone, clearing all_unreclaimable. Highest zone
>> is still unbalanced
>> 5. kswapd returns and calls sleeping_prematurely
>> 6. sleeping_prematurely looks at *all* zones, not just the ones
>> being considered by balance_pgdat. The highest small zone
>> has all_unreclaimable cleared but the zone is not
>> balanced. all_zones_ok is false so kswapd stays awake
>>
>> This patch corrects the behaviour of sleeping_prematurely to check
>> the zones balance_pgdat() checked.
>
> But kswapd is making progress: it's reclaiming slab. Eventually that
> won't work any more and all_unreclaimable will not be cleared and the
> condition will fix itself up?
>
>
>
> btw,
>
> if (!sleeping_prematurely(...))
> sleep();
>
> hurts my brain. My brain would prefer
>
> if (kswapd_should_sleep(...))
> sleep();
>
> no?
>
>> Reported-and-tested-by: Pádraig Brady <P@xxxxxxxxxxxxxx>
>
> But what were the before-and-after observations? I don't understand
> how this can cause a permanent cpuchew by kswapd.

Context:
http://marc.info/?t=130865025500001&r=1&w=2
https://bugzilla.redhat.com/show_bug.cgi?id=712019

Summary:

The following command makes kswapd0 spin on my SNB (Sandy Bridge) laptop with 3GB RAM, which has a small Normal zone:

dd bs=1M count=3000 if=/dev/zero of=spin.test

Basically, once a certain amount of data is cached,
kswapd0 starts spinning and keeps spinning until the data
is removed from the cache (by `rm spin.test`, for example).
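
In case it helps to see why the small zone matters, here is a toy C model of the mismatch Mel describes in the quoted steps. It is only a sketch: the struct, the toy_* function names and the page counts are all made up for illustration and are not the kernel's code. It shows that checking every zone keeps kswapd awake, while checking only the zones balance_pgdat() worked on lets it sleep. It also uses the kswapd_should_sleep polarity Andrew suggested.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical names and fields, not the kernel's structs. */
struct toy_zone {
	unsigned long free_pages;   /* pages currently free in the zone */
	unsigned long high_wmark;   /* free-page target for "balanced" */
	bool all_unreclaimable;     /* reclaim has given up on this zone */
};

static bool toy_zone_balanced(const struct toy_zone *z)
{
	return z->free_pages >= z->high_wmark;
}

/*
 * Returns true if kswapd should stay awake.
 * Only zones[0 .. nr_checked-1] are examined, lowest zone first.
 */
static bool toy_sleeping_prematurely(const struct toy_zone *zones,
				     size_t nr_checked)
{
	for (size_t i = 0; i < nr_checked; i++) {
		if (zones[i].all_unreclaimable)
			continue;	/* given up on, so ignore it */
		if (!toy_zone_balanced(&zones[i]))
			return true;	/* unbalanced: not OK to sleep */
	}
	return false;
}

/* Same check with the polarity flipped, as suggested in the quoted mail. */
static bool toy_kswapd_should_sleep(const struct toy_zone *zones,
				    size_t nr_checked)
{
	return !toy_sleeping_prematurely(zones, nr_checked);
}

/* State after step 4: small highest zone unbalanced, flag cleared. */
static void toy_scenario_check(void)
{
	const struct toy_zone zones[2] = {
		{ .free_pages = 5000, .high_wmark = 4000, .all_unreclaimable = false },
		{ .free_pages = 10,   .high_wmark = 64,   .all_unreclaimable = false },
	};

	/* Looking at all zones: the small highest zone keeps kswapd awake. */
	assert(!toy_kswapd_should_sleep(zones, 2));
	/* Looking only at the zone balance_pgdat() worked on: sleeping is fine. */
	assert(toy_kswapd_should_sleep(zones, 1));
}
```

Calling toy_scenario_check() from a main() exercises both cases. The only difference between them is how many zones get examined, which is my reading of what the patch changes.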

cheers,
Pádraig.