[PATCH] mm/memcg: fix refcount error while moving and swapping

From: Hugh Dickins
Date: Tue Jul 07 2020 - 17:38:40 EST


It was hard to keep a test running, moving tasks between memcgs with
move_charge_at_immigrate, while swapping: mem_cgroup_id_get_many()'s
refcount is discovered to be 0 (supposedly impossible), so it is then
forced to REFCOUNT_SATURATED, and after thousands of warnings in quick
succession, the test is at last put out of misery by being OOM killed.

This is because of the way moved_swap accounting was saved up until the
task move gets completed in __mem_cgroup_clear_mc(), deferred from when
mem_cgroup_move_swap_account() actually exchanged old and new ids.
Concurrent activity can free up swap quicker than the task is scanned,
bringing the id refcount down to 0 (which should only be possible when
offlining).

Just skip that optimization: do that part of the accounting immediately.

Fixes: 615d66c37c75 ("mm: memcontrol: fix memcg id ref counter on swap charge move")
Cc: <stable@xxxxxxxxxxxxxxx>
Signed-off-by: Hugh Dickins <hughd@xxxxxxxxxx>
---
This was frustrating while testing Alex Shi's patches a few weeks
ago, and no fault of those. I may have misattributed the "Fixes",
which was itself fixing an earlier commit, both of them backported to v3.19;
or maybe it goes back way further than those, I didn't pursue it - not
top of the list of user complaints! Certainly goes back before the
refcount_add() in v4.20, which replaced a VM_BUG_ON(atomic_read <= 0).
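
For anyone who wants the race in miniature: below is a rough user-space
sketch, not kernel code. Everything prefixed toy_ is hypothetical and only
models the counting; the single starting reference stands in for the one
held while the memcg is online, and the "concurrent" swap free is simply
run inline after each moved entry.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a memcg id reference count; not the kernel's refcount_t. */
struct toy_memcg {
	long id_refs;
	bool hit_zero;	/* set if a put ever leaves id_refs at 0 or below */
};

static void toy_id_get_many(struct toy_memcg *cg, long n)
{
	assert(n >= 0);
	cg->id_refs += n;
}

static void toy_id_put(struct toy_memcg *cg)
{
	cg->id_refs--;
	if (cg->id_refs <= 0)
		cg->hit_zero = true;
}

/*
 * Move nentries swap entries to a destination memcg, with each entry
 * freed by "concurrent activity" right after its id is exchanged.
 * Returns true if the destination's id refcount reached 0 mid-move.
 */
static bool toy_move_swap(long nentries, bool immediate)
{
	struct toy_memcg to = { .id_refs = 1, .hit_zero = false };
	long moved_swap = 0;

	for (long i = 0; i < nentries; i++) {
		/* The entry now records the destination's id. */
		if (immediate)
			toy_id_get_many(&to, 1);	/* patched behaviour */
		moved_swap++;

		/* Swap is freed before the move completes: id ref is put. */
		toy_id_put(&to);
	}

	if (!immediate)
		toy_id_get_many(&to, moved_swap);	/* old deferred get */

	return to.hit_zero;
}
```

In this model toy_move_swap(1, false) reports the refcount hitting 0, while
toy_move_swap(1, true) does not: taking the reference at the moment the ids
are exchanged means a later put always has a matching get ahead of it.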

mm/memcontrol.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

--- 5.8-rc4/mm/memcontrol.c 2020-06-28 15:52:13.360672658 -0700
+++ linux/mm/memcontrol.c 2020-07-05 18:11:51.136542439 -0700
@@ -5669,7 +5669,6 @@ static void __mem_cgroup_clear_mc(void)
 		if (!mem_cgroup_is_root(mc.to))
 			page_counter_uncharge(&mc.to->memory, mc.moved_swap);

-		mem_cgroup_id_get_many(mc.to, mc.moved_swap);
 		css_put_many(&mc.to->css, mc.moved_swap);

 		mc.moved_swap = 0;
@@ -5860,7 +5859,8 @@ put: /* get_mctgt_type() gets the page
 			ent = target.ent;
 			if (!mem_cgroup_move_swap_account(ent, mc.from, mc.to)) {
 				mc.precharge--;
-				/* we fixup refcnts and charges later. */
+				mem_cgroup_id_get_many(mc.to, 1);
+				/* we fixup other refcnts and charges later. */
 				mc.moved_swap++;
 			}
 			break;