Re: [RFC][PATCH 5/9] mm/migrate: demote pages during reclaim

From: Dave Hansen
Date: Thu Aug 20 2020 - 11:22:10 EST


On 8/20/20 1:06 AM, Huang, Ying wrote:
>> + /* Migrate pages selected for demotion */
>> + nr_reclaimed += demote_page_list(&ret_pages, &demote_pages, pgdat, sc);
>> +
>> pgactivate = stat->nr_activate[0] + stat->nr_activate[1];
>>
>> mem_cgroup_uncharge_list(&free_pages);
>> _
> Generally, it's good to batch the page migration. But one side effect
> is that, if the pages fail to migrate, they will be placed back on
> the LRU list instead of falling back to actually being reclaimed.
> This may cause issues in some situations. For example, if there isn't
> enough space on the PMEM (slow) node, the page migration fails and OOM
> may be triggered, because direct reclaim on the DRAM (fast) node
> may make no progress, whereas before it could actually reclaim some pages.

Yes, agreed.

There are a couple of ways we could fix this. Instead of splicing
'demote_pages' back into 'ret_pages', we could try to get them back on
'page_list' and go back to the beginning of shrink_page_list(). This
will probably yield the best behavior, but might be a bit ugly.
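
As a rough illustration of the retry idea, here is a userspace toy model,
not kernel code. Every name in it (model_page, model_demote(),
model_shrink(), the 'slow_free' counter) is made up for the sketch, and
ordinary reclaim is assumed to always succeed, which is not true in the
real shrink_page_list(). It only shows the control flow: pages whose
demotion fails stay on the list, and the loop is re-run once with
demotion turned off so they fall back to normal reclaim.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_page {
	bool demotable;	/* candidate for migration to the slow node */
	bool reclaimed;	/* gone from the fast node, by either path */
};

struct model_result {
	size_t nr_demoted;
	size_t nr_reclaimed_directly;
};

/* Stand-in for migration: succeeds while the slow node has free slots. */
static bool model_demote(size_t *slow_free)
{
	if (*slow_free == 0)
		return false;
	(*slow_free)--;
	return true;
}

static struct model_result model_shrink(struct model_page *pages, size_t n,
					size_t slow_free)
{
	struct model_result res = { 0, 0 };
	bool allow_demotion = true;
	size_t nr_failed;
	size_t i;

retry:
	nr_failed = 0;
	for (i = 0; i < n; i++) {
		struct model_page *p = &pages[i];

		if (p->reclaimed)
			continue;

		if (allow_demotion && p->demotable) {
			if (model_demote(&slow_free)) {
				p->reclaimed = true;
				res.nr_demoted++;
			} else {
				/* Leave it on the list for the retry pass. */
				nr_failed++;
			}
			continue;
		}

		/* Ordinary reclaim path, assumed to succeed in this model. */
		p->reclaimed = true;
		res.nr_reclaimed_directly++;
	}

	/* Demotion failed for some pages: rescan once without demotion. */
	if (nr_failed && allow_demotion) {
		allow_demotion = false;
		goto retry;
	}

	/* The final pass can never leave failed demotions behind. */
	assert(nr_failed == 0);
	return res;
}
```

With four demotable pages and room for only one on the slow node, the
first pass demotes one page and the retry pass reclaims the other three
instead of handing them back untouched.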

We could also add a field to 'struct scan_control' and just stop trying
to migrate after it has failed one or more times. The trick will be
picking a threshold that doesn't mess with either the normal reclaim
rate or the migration rate.
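
The second option might look something like the sketch below. Again this
is a standalone model and not a patch: 'model_scan_control',
'nr_demote_failures' and MODEL_DEMOTE_FAIL_LIMIT are all hypothetical,
and the limit of 2 is an arbitrary placeholder rather than a tuned value.

```c
#include <stdbool.h>

/* Hypothetical threshold; picking a good value is the open question. */
#define MODEL_DEMOTE_FAIL_LIMIT 2

/* Stand-in for the field that would be added to 'struct scan_control'. */
struct model_scan_control {
	unsigned int nr_demote_failures;
};

/* Keep trying demotion only while failures stay under the limit. */
static bool model_may_demote(const struct model_scan_control *sc)
{
	return sc->nr_demote_failures < MODEL_DEMOTE_FAIL_LIMIT;
}

/* Record the outcome of one demotion attempt. */
static void model_note_demote_result(struct model_scan_control *sc, bool ok)
{
	if (!ok)
		sc->nr_demote_failures++;
}
```

In this model a successful demotion does not reset the counter, so once
the limit is hit the rest of that reclaim pass uses normal reclaim only.
Whether the count should reset or decay is part of the same tuning
question as the threshold itself.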

This is on my list to fix up next.