Re: [PATCH 0/3] ksm: write protect pages from inside ksm

From: Hugh Dickins
Date: Sun Jun 14 2009 - 17:49:05 EST


On Sat, 13 Jun 2009, Izik Eidus wrote:
> Hugh, so until here we are in sync,

Yes, that fits with what I have here, thanks (or where it didn't
quite fit, e.g. ' versus `, I've adjusted to what you have!). And
thanks for fixing my *orig_pte = *ptep bug, you did point that out
before, but I misunderstood at first.

>
> Question is what you want me to do now?
> (Because we are skipping 2.6.31, it is OK for you to tell me something
> like: "Shut up and let me see what I can get with this madvise" -
> that from one side.
> From another side, if you want me to do anything please say.)

I had to get a bit further at my end before answering on that,
but now the answer is clear: please do some testing of your RFC
madvise() version (which is what I'm just tidying up a little),
and let me know any bugfixes you find. Try with SLAB or SLUB or
SLQB debugging on, e.g. CONFIG_SLUB=y, CONFIG_SLUB_DEBUG=y and the
boot option "slub_debug".

I'm finding, whether with your RFC or my tidyup, that kksmd
soon oopses in get_next_mmlist (or perhaps find_vma): presumably
accessing a vma or mm which already got freed (if you don't have
slab debugging on, it's liable to hang instead).

(I've also not seen it actually merging yet: if you register
or madvise a large anon area and memset it, the /dev/ksm version
would merge all its pages, but I've not seen the madvise version
do so yet - though maybe there's something stupidly wrong in my
testing, really I'm more worried about the oopses at present.)
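
For reference, the kind of test described above can be sketched in
userspace roughly as follows. This is a minimal sketch, not the test
actually used: the helper name ksm_fill_area() is hypothetical, and
the advice name MADV_MERGEABLE with fallback value 12 is an assumption
taken from later kernels (the RFC's constant may be named or numbered
differently).

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* for MAP_ANONYMOUS and madvise() */
#endif
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Assumption: the advice is called MADV_MERGEABLE (value 12 on most
 * architectures in later kernels); the RFC may differ.
 */
#ifndef MADV_MERGEABLE
#define MADV_MERGEABLE 12
#endif

/*
 * Hypothetical helper: map len bytes of anonymous memory, ask the
 * kernel to consider the range for merging, then memset it so that
 * every page has identical contents.
 *
 * Returns the mapping, or NULL if mmap() failed.  *advice_errno is
 * set to 0, or to the errno from madvise(): a kernel without the
 * madvise support rejects the advice, but the area is still filled.
 */
static unsigned char *ksm_fill_area(size_t len, int *advice_errno)
{
	unsigned char *area = mmap(NULL, len, PROT_READ | PROT_WRITE,
				   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (area == MAP_FAILED)
		return NULL;

	*advice_errno = madvise(area, len, MADV_MERGEABLE) ? errno : 0;

	/* Identical bytes in every page make all pages merge candidates. */
	memset(area, 0x5a, len);
	return area;
}
```

A caller would keep the mapping alive for a while after the call, so
that the scanning thread has a chance to visit it, and then check
whether the pages were merged.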

Note that mmotm includes a patch of Nick's which adds a function
madvise_behavior_valid() - you'll need to add your MADVs into its
list to get it to work at all there.
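
As a rough illustration of what adding the MADVs to that list means,
here is a userspace model of such a whitelist check. It is a sketch
under assumptions, not the mmotm code: the function name carries a
_sketch suffix to mark it as hypothetical, the set of existing cases
is illustrative, and MADV_MERGEABLE / MADV_UNMERGEABLE (12 and 13) are
the names and values from later kernels rather than from the RFC.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* for the Linux-specific MADV_* values */
#endif
#include <sys/mman.h>

/* Assumed names and values; the RFC's constants may differ. */
#ifndef MADV_MERGEABLE
#define MADV_MERGEABLE 12
#endif
#ifndef MADV_UNMERGEABLE
#define MADV_UNMERGEABLE 13
#endif

/*
 * Model of a whitelist check on the madvise() behavior argument:
 * returns 1 for advice values that are accepted, 0 for anything
 * else.  Advice missing from the switch is rejected before any
 * handler can see it, which is why new MADVs must be listed here.
 */
static int madvise_behavior_valid_sketch(int behavior)
{
	switch (behavior) {
	case MADV_NORMAL:
	case MADV_RANDOM:
	case MADV_SEQUENTIAL:
	case MADV_WILLNEED:
	case MADV_DONTNEED:
	case MADV_REMOVE:
	case MADV_DONTFORK:
	case MADV_DOFORK:
	/* The new KSM advice values would be added to the list here. */
	case MADV_MERGEABLE:
	case MADV_UNMERGEABLE:
		return 1;
	default:
		return 0;
	}
}
```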

Here's a patch I added a month or so ago, when trying to experiment
with KSM on all mms: shouldn't be necessary if your mm refcounting
is right, but might help to avoid extra weirdness when things go
wrong: exit_mmap() leaves stale vma pointers around, reckoning
that nobody can be interested by now; but KSM might peep,
so better to tidy them up, at least while debugging...

Thanks,
Hugh

--- old/mm/mmap.c 2009-05-01 13:47:45.000000000 +0100
+++ new/mm/mmap.c 2009-05-03 11:34:47.000000000 +0100
@@ -2112,6 +2112,14 @@ void exit_mmap(struct mm_struct *mm)
 	tlb_finish_mmu(tlb, 0, end);
 
 	/*
+	 * Make sure get_user_pages() and find_vma() etc. will find nothing:
+	 * this may be necessary for KSM.
+	 */
+	mm->mmap = NULL;
+	mm->mmap_cache = NULL;
+	mm->mm_rb = RB_ROOT;
+
+	/*
 	 * Walk the list again, actually closing and freeing it,
 	 * with preemption enabled, without holding any MM locks.
 	 */