Re: [PATCH 3.3] memcg: fix deadlock by inverting lrucare nesting

From: Hugh Dickins
Date: Wed Feb 29 2012 - 19:43:41 EST


On Wed, 29 Feb 2012, Andrew Morton wrote:
> On Tue, 28 Feb 2012 21:25:02 -0800 (PST)
> Hugh Dickins <hughd@xxxxxxxxxx> wrote:
>
> > We have forgotten the rules of lock nesting: the irq-safe ones must be
> > taken inside the non-irq-safe ones, otherwise we are open to deadlock:
>
> This patch makes rather a mess of "memcg: remove PCG_CACHE page_cgroup
> flag".

Sorry about that.

>
> I did it this way:

Exactly right, thank you. In my tree I end up with a blank line
in between the smp_wmb() and the SetPageCgroupUsed(pc), but I
prefer the way you have grouped it.

Hugh

>
> static void __mem_cgroup_commit_charge(struct mem_cgroup *memcg,
> struct page *page,
> unsigned int nr_pages,
> struct page_cgroup *pc,
> enum charge_type ctype,
> bool lrucare)
> {
> struct zone *uninitialized_var(zone);
> bool was_on_lru = false;
> bool anon;
>
> lock_page_cgroup(pc);
> if (unlikely(PageCgroupUsed(pc))) {
> unlock_page_cgroup(pc);
> __mem_cgroup_cancel_charge(memcg, nr_pages);
> return;
> }
> /*
> * we don't need page_cgroup_lock about tail pages, becase they are not
> * accessed by any other context at this point.
> */
>
> /*
> * In some cases, SwapCache and FUSE(splice_buf->radixtree), the page
> * may already be on some other mem_cgroup's LRU. Take care of it.
> */
> if (lrucare) {
> zone = page_zone(page);
> spin_lock_irq(&zone->lru_lock);
> if (PageLRU(page)) {
> ClearPageLRU(page);
> del_page_from_lru_list(zone, page, page_lru(page));
> was_on_lru = true;
> }
> }
>
> pc->mem_cgroup = memcg;
> /*
> * We access a page_cgroup asynchronously without lock_page_cgroup().
> * Especially when a page_cgroup is taken from a page, pc->mem_cgroup
> * is accessed after testing USED bit. To make pc->mem_cgroup visible
> * before USED bit, we need memory barrier here.
> * See mem_cgroup_add_lru_list(), etc.
> */
> smp_wmb();
> SetPageCgroupUsed(pc);
>
> if (lrucare) {
> if (was_on_lru) {
> VM_BUG_ON(PageLRU(page));
> SetPageLRU(page);
> add_page_to_lru_list(zone, page, page_lru(page));
> }
> spin_unlock_irq(&zone->lru_lock);
> }
>
> if (ctype == MEM_CGROUP_CHARGE_TYPE_MAPPED)
> anon = true;
> else
> anon = false;
>
> mem_cgroup_charge_statistics(memcg, anon, nr_pages);
> unlock_page_cgroup(pc);
>
> /*
> * "charge_statistics" updated event counter. Then, check it.
> * Insert ancestor (and ancestor's ancestors), to softlimit RB-tree.
> * if they exceeds softlimit.
> */
> memcg_check_events(memcg, page);
> }
>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/