Re: [PATCH 6/11] ksm: remove old stable nodes more thoroughly

From: Hugh Dickins
Date: Sun Jan 27 2013 - 18:05:41 EST


On Sat, 26 Jan 2013, Simon Jeons wrote:
> On Fri, 2013-01-25 at 18:01 -0800, Hugh Dickins wrote:
> > Switching merge_across_nodes after running KSM is liable to oops on stale
> > nodes still left over from the previous stable tree. It's not something
> > that people will often want to do, but it would be lame to demand a reboot
> > when they're trying to determine which merge_across_nodes setting is best.
> >
> > How can this happen? We only permit switching merge_across_nodes when
> > pages_shared is 0, and usually set run 2 to force that beforehand, which
> > ought to unmerge everything: yet oopses still occur when you then run 1.
> >
> > Three causes:
> >
> > 1. The old stable tree (built according to the inverse merge_across_nodes)
> > has not been fully torn down. A stable node lingers until get_ksm_page()
> > notices that the page it references no longer references it: but the page
> > is not necessarily freed as soon as expected, particularly when swapcache.
> >
>
> When can this happen?

Whenever there's an additional reference to the page, beyond those for
its ptes in userspace - swapcache for example, or pinned by get_user_pages.
That delays its being freed (arriving at the "page->mapping = NULL;"
in free_pages_prepare()). Or it might simply be sitting in a pagevec,
waiting for that to be filled up, to be freed as part of a batch.
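
To make that concrete, here is a toy userspace model, not kernel code:
struct toy_page and the toy_*() helpers are names invented for this
sketch, and the real staleness check in get_ksm_page() is more involved.
It only illustrates that the back-pointer is cleared when the last
reference goes, so any extra reference keeps the stable node looking live.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; only what the sketch needs. */
struct toy_page {
	int refcount;		/* all references: ptes plus any extras */
	const void *mapping;	/* points back at the stable node while live */
};

/* Take an extra reference, as swapcache or get_user_pages might. */
static void toy_get_page(struct toy_page *page)
{
	assert(page->refcount > 0);
	page->refcount++;
}

/* Drop one reference; returns 1 only when the page is actually freed. */
static int toy_put_page(struct toy_page *page)
{
	assert(page->refcount > 0);
	if (--page->refcount > 0)
		return 0;
	/* Stands in for "page->mapping = NULL;" in free_pages_prepare(). */
	page->mapping = NULL;
	return 1;
}

/* Simplified staleness test: the page no longer points back at the node. */
static int toy_node_is_stale(const void *stable_node,
			     const struct toy_page *page)
{
	return page->mapping != stable_node;
}
```

With one pte reference plus one extra reference, the first toy_put_page()
frees nothing and the node still looks live; only the second put clears
the mapping and lets the node be seen as stale.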

>
> > Fix this with a pass through the old stable tree, applying get_ksm_page()
> > to each of the remaining nodes (most found stale and removed immediately),
> > with forced removal of any left over. Unless the page is still mapped:
> > I've not seen that case, it shouldn't occur, but better to WARN_ON_ONCE
> > and EBUSY than BUG.
> >
> > 2. __ksm_enter() has a nice little optimization, to insert the new mm
> > just behind ksmd's cursor, so there's a full pass for it to stabilize
> > (or be removed) before ksmd addresses it. Nice when ksmd is running,
> > but not so nice when we're trying to unmerge all mms: we were missing
> > those mms forked and inserted behind the unmerge cursor. Easily fixed
> > by inserting at the end when KSM_RUN_UNMERGE.
>
> Forked mms will be unmerged just after ksmd's cursor, since they're
> inserted behind it: why would they be missed?

unmerge_and_remove_all_rmap_items() makes just one pass through the list,
from start to finish, and never wraps around: an mm inserted behind the
cursor lands in the part already walked, so that pass never visits it.
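
A toy userspace sketch of that single pass, assuming a circular doubly
linked list like the kernel's: struct toy_slot and the toy_*() helpers
are invented for illustration and are not the ksm.c code. It contrasts
inserting a forked mm just behind the cursor with inserting it at the
end of the list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an mm slot on the list; not the kernel struct. */
struct toy_slot {
	struct toy_slot *prev, *next;
	int visited;
};

static void toy_list_init(struct toy_slot *head)
{
	head->prev = head->next = head;
	head->visited = 0;
}

/*
 * Link 'slot' immediately before 'pos'. Before the list head that means
 * "at the end"; before the cursor it means "just behind the cursor".
 */
static void toy_insert_before(struct toy_slot *slot, struct toy_slot *pos)
{
	assert(slot && pos);
	slot->visited = 0;
	slot->prev = pos->prev;
	slot->next = pos;
	pos->prev->next = slot;
	pos->prev = slot;
}

/*
 * One start-to-finish pass with no wraparound. When the cursor reaches
 * 'fork_at', a simulated fork inserts 'child' either behind the cursor
 * or at the end of the list. Returns the number of slots visited.
 */
static int toy_single_pass(struct toy_slot *head, struct toy_slot *fork_at,
			   struct toy_slot *child, int insert_at_end)
{
	struct toy_slot *cursor;
	int visited = 0;

	for (cursor = head->next; cursor != head; cursor = cursor->next) {
		if (cursor == fork_at)
			toy_insert_before(child, insert_at_end ? head : cursor);
		cursor->visited = 1;
		visited++;
	}
	return visited;
}
```

In this model a child inserted behind the cursor is still unvisited when
the pass ends, while a child inserted at the end is reached before the
loop terminates.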