Re: [PATCH RFC 00/15] mm: memory book keeping and lru_lock splitting

From: Konstantin Khlebnikov
Date: Sat Feb 18 2012 - 01:36:12 EST


Hugh Dickins wrote:
On Thu, 16 Feb 2012, Hugh Dickins wrote:

Yours are not the only patches I was testing in that tree, I tried to
gather several other series which I should be reviewing if I ever have
time: Kamezawa-san's page cgroup diet 6, Xiao Guangrong's 4 prio_tree
cleanups, your 3 radix_tree changes, your 6 shmem changes, your 4 memcg
miscellaneous, and then your 15 books.

The tree before your final 15 did well under pressure, until I tried to
rmdir one of the cgroups afterwards: then it crashed nastily, I'll have
to bisect into that, probably either Kamezawa's or your memcg changes.

So far I haven't succeeded in reproducing that at all: it was real,
but obviously harder to get than I assumed - indeed, no good reason
to associate it with any of those patches, might even be in 3.3-rc.

It did involve a NULL pointer dereference in mem_cgroup_page_lruvec(),
somewhere below compact_zone() - but repercussions were causing the
stacktrace to scroll offscreen, so I didn't get good details.

There are some stupid bugs in my v1 patchset; it shouldn't work at all.
I did not expect that someone would try to use it. I sent it just for discussion.

The most destructive bug is this PageCgroupUsed() check below:

+struct book *page_book(struct page *page)
+{
+	struct mem_cgroup_per_zone *mz;
+	struct page_cgroup *pc;
+
+	if (mem_cgroup_disabled())
+		return &page_zone(page)->book;
+
+	pc = lookup_page_cgroup(page);
+	if (!PageCgroupUsed(pc))
+		return &page_zone(page)->book;
+	/* Ensure pc->mem_cgroup is visible after reading PCG_USED. */
+	smp_rmb();
+	mz = mem_cgroup_zoneinfo(pc->mem_cgroup,
+				 page_to_nid(page), page_zonenum(page));
+	return &mz->book;
+}

Thus after a page uncharge I remove the page from the wrong book, under the wrong lock =)

[ as I wrote, updated patchset there: https://github.com/koct9i/linux ]


Hugh

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/