Re: [mm] [PATCH 3/4] Memory cgroup hierarchical reclaim

From: KAMEZAWA Hiroyuki
Date: Wed Nov 05 2008 - 11:20:47 EST


Balbir Singh said:
>>>>> +    list_for_each_entry_safe_from(cgroup, cg,
>>>>> +            &cg_current->parent->children, sibling) {
>>>>> +        mem_child = mem_cgroup_from_cont(cgroup);
>>>>> +
>>>>> +        /*
>>>>> +         * Move beyond last scanned child
>>>>> +         */
>>>>> +        if (mem_child == mem->last_scanned_child)
>>>>> +            continue;
>>>>> +
>>>>> +        ret = try_to_free_mem_cgroup_pages(mem_child, gfp_mask);
>>>>> +        mem->last_scanned_child = mem_child;
>>>>> +
>>>>> +        if (res_counter_check_under_limit(&mem->res)) {
>>>>> +            ret = 0;
>>>>> +            goto done;
>>>>> +        }
>>>>> +    }
>>>> Is this safe against cgroup create/remove? Is cgroup_mutex held?
>>> Yes, I thought about it, but with the setup, each parent will be busy
>>> since they have children and hence cannot be removed. The leaf child
>>> itself has tasks, so it cannot be removed. IOW, it should be safe
>>> against removal.
>>>
>> I'm sorry if I misunderstand something. Could you explain the following?
>>
>> In the following tree,
>>
>> level-1
>>   - level-2
>>     - level-3
>>       - level-4
>> level-1's usage = level-1 + level-2 + level-3 + level-4
>> level-2's usage = level-2 + level-3 + level-4
>> level-3's usage = level-3 + level-4
>> level-4's usage = level-4
>>
>> Assume that a task in level-2 hits its limit. It has to reclaim memory
>> from level-2, level-3, and level-4.
>>
>> How can we guarantee that level-4 has a task in this case?
>
> Good question. If there is no task, the LRUs will be empty and reclaim
> will return. We could also add other checks if needed.
>
If needed? Yes, they are needed.
The problem is that you are walking the list in the usual way, without
any lock or any guarantee that the list will not be modified during the
walk.

My quick idea is the following.
==
Before starting reclaim:
1. take lock_cgroup()
2. scan the tree and create a "private" list as a snapshot of the part of
   the tree to be scanned.
3. unlock_cgroup().
4. start reclaim.

Adding a refcnt to memcg is also necessary, to delay freeing the memcg
control area while it is still on such a private list.
(The mem+swap controller has a function to do this, and you may be able
to reuse it.)
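
To illustrate the steps above, here is a rough userspace model of the
idea. It is only a sketch under stated assumptions, not kernel code:
"struct group", tree_lock, snapshot_subtree(), group_put() and
reclaim_from_snapshot() are all hypothetical names, a pthread mutex
stands in for the cgroup lock, and "reclaim" just subtracts from a usage
counter.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for a memcg; not the kernel structure. */
struct group {
	struct group *first_child;
	struct group *next_sibling;
	int refcnt;	/* protected by tree_lock in this model */
	long usage;	/* pages charged to this group only */
};

/* Stands in for the cgroup lock (assumption for this sketch). */
static pthread_mutex_t tree_lock = PTHREAD_MUTEX_INITIALIZER;

/* Count the nodes in a subtree. Caller holds tree_lock. */
static size_t count_subtree(struct group *g)
{
	size_t n = 1;

	for (struct group *c = g->first_child; c; c = c->next_sibling)
		n += count_subtree(c);
	return n;
}

/* Pre-order fill, taking a reference on each node. Caller holds tree_lock. */
static size_t fill_snapshot(struct group *g, struct group **out, size_t pos)
{
	g->refcnt++;
	out[pos++] = g;
	for (struct group *c = g->first_child; c; c = c->next_sibling)
		pos = fill_snapshot(c, out, pos);
	return pos;
}

/* Steps 1-3: lock, build the private list, unlock. */
static struct group **snapshot_subtree(struct group *root, size_t *n)
{
	struct group **snap;

	pthread_mutex_lock(&tree_lock);
	*n = count_subtree(root);
	snap = malloc(*n * sizeof(*snap));
	if (snap)
		fill_snapshot(root, snap, 0);
	pthread_mutex_unlock(&tree_lock);
	return snap;
}

/* Drop a reference. A real implementation would free the group at zero. */
static void group_put(struct group *g)
{
	pthread_mutex_lock(&tree_lock);
	assert(g->refcnt > 0);
	g->refcnt--;
	pthread_mutex_unlock(&tree_lock);
}

/*
 * Step 4: walk the private list without holding tree_lock while
 * reclaiming. Stops taking pages once 'target' has been freed, but
 * still drops every reference. Returns the number of pages freed.
 */
static long reclaim_from_snapshot(struct group **snap, size_t n, long target)
{
	long freed = 0;

	for (size_t i = 0; i < n; i++) {
		if (freed < target) {
			long want = target - freed;
			long take = snap[i]->usage < want ? snap[i]->usage : want;

			snap[i]->usage -= take;
			freed += take;
		}
		group_put(snap[i]);
	}
	free(snap);
	return freed;
}
```

The point of the model is that the tree is only walked under the lock,
and the reference taken on each entry keeps it valid after the lock is
dropped, even if the group is removed from the tree in the meantime.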

Thanks,
-Kame

