Re: [RFC] mm: remove swapcache page early

From: Minchan Kim
Date: Mon Apr 01 2013 - 22:04:35 EST


Hi Hugh,

On Fri, Mar 29, 2013 at 01:01:14PM -0700, Hugh Dickins wrote:
> On Fri, 29 Mar 2013, Minchan Kim wrote:
> > On Thu, Mar 28, 2013 at 11:19:12AM -0700, Dan Magenheimer wrote:
> > > > From: Minchan Kim [mailto:minchan@xxxxxxxxxx]
> > > > On Wed, Mar 27, 2013 at 03:24:00PM -0700, Dan Magenheimer wrote:
> > > > > > From: Hugh Dickins [mailto:hughd@xxxxxxxxxx]
> > > > > > Subject: Re: [RFC] mm: remove swapcache page early
> > > > > >
> > > > > > I believe the answer is for frontswap/zmem to invalidate the frontswap
> > > > > > copy of the page (to free up the compressed memory when possible) and
> > > > > > SetPageDirty on the PageUptodate PageSwapCache page when swapping in
> > > > > > (setting page dirty so nothing will later go to read it from the
> > > > > > unfreed location on backing swap disk, which was never written).
> > > > >
> > > > > There are two duplication issues: (1) When can the page be removed
> > > > > from the swap cache after a call to frontswap_store; and (2) When
> > > > > can the page be removed from the frontswap storage after it
> > > > > has been brought back into memory via frontswap_load.
> > > > >
> > > > > This patch from Minchan addresses (1). The issue you are raising
> > > >
> > > > No. I am addressing (2).
> > > >
> > > > > here is (2). You may not know that (2) has recently been solved
> > > > > in frontswap, at least for zcache. See frontswap_exclusive_gets_enabled.
> > > > > If this is enabled (and it is for zcache but not yet for zswap),
> > > > > what you suggest (SetPageDirty) is what happens.
> > > >
> > > > I am blind on zcache so I didn't see it. Anyway, I'd like to address it
> > > > on zram and zswap.
> > >
> > > Zswap can enable it trivially by adding a function call in init_zswap.
> > > (Note that it is not enabled by default for all frontswap backends
> > > because it is another complicated tradeoff of cpu time vs memory space
> > > that needs more study on a broad set of workloads.)
> > >
> > > I wonder if something like this would have a similar result for zram?
> > > (Completely untested... snippet stolen from swap_entry_free with
> > > SetPageDirty added... doesn't compile yet, but should give you the idea.)
>
> Thanks for correcting me on zram (in earlier mail of this thread), yes,
> I was forgetting about the swap_slot_free_notify entry point which lets
> that memory be freed.
>
> >
> > Nice idea!
> >
> > After I see your patch, I realized it was Hugh's suggestion and
> > you implemented it in proper place.
> >
> > Will resend it after testing. Maybe next week.
> > Thanks!
>
> Be careful, although Dan is right that something like this can be
> done for zram, I believe you will find that it needs a little more:
> either a separate new entry point (not my preference) or a flags arg
> (or boolean) added to swap_slot_free_notify.
>
> Because this is a different operation: end_swap_bio_read() wants
> to free up zram's compressed copy of the page, but the swp_entry_t
> must remain valid until swap_entry_free() can clear up the rest.
> Precisely how much of the work each should do, you will discover.

First of all, thanks for pointing this out!

If I parse your concern correctly, you are worried about the different
semantics of the two callers
(end_swap_bio_read()'s swap_slot_free_notify vs. swap_entry_free()'s).

But the current implementation of zram_slot_free_notify() happens to cover
both cases properly.

The zram_free_page() triggered by end_swap_bio_read() frees the compressed
copy of the page, and the later zram_free_page() triggered by
swap_entry_free() won't find anything at that index in zram->table and will
just return.
So I think there is no problem.
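
To make it concrete, here is a simplified sketch of zram_free_page()
(stats and zero-page handling trimmed, not copied from any particular
tree), showing why the second notify is harmless:

static void zram_free_page(struct zram *zram, size_t index)
{
	unsigned long handle = zram->table[index].handle;

	if (unlikely(!handle)) {
		/*
		 * Nothing is stored at this index: either it was a
		 * zero-filled page, or the compressed copy was already
		 * freed by the earlier notify from end_swap_bio_read().
		 * So the later call from swap_entry_free() lands here
		 * and simply returns.
		 */
		return;
	}

	/* Drop the compressed copy and clear the slot. */
	zs_free(zram->mem_pool, handle);
	zram->table[index].handle = 0;
	zram->table[index].size = 0;
}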

The remaining issue is zram->stats.notify_free, which could be counted
twice, but I am not sure it is worth counting it exactly.

If I missed your point, please pinpoint your concern. :)
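
For reference, the end_swap_bio_read() change I plan to test, following
Dan's suggestion, would look roughly like this. It is a completely
untested sketch of the success path only, and it assumes page_swap_info()
can be used safely from this context:

	SetPageUptodate(page);

	/*
	 * Software suspend uses end_swap_bio_read() against
	 * non-swapcache pages, so only do this for real swapcache pages.
	 */
	if (likely(PageSwapCache(page))) {
		struct swap_info_struct *sis = page_swap_info(page);

		if (sis->flags & SWP_BLKDEV) {
			struct gendisk *disk = sis->bdev->bd_disk;

			if (disk->fops->swap_slot_free_notify) {
				swp_entry_t entry;

				entry.val = page_private(page);

				/*
				 * Dirty the page first so it can only be
				 * reclaimed by writing it out again, then
				 * let zram drop its compressed copy for
				 * this swap slot.
				 */
				SetPageDirty(page);
				disk->fops->swap_slot_free_notify(sis->bdev,
							swp_offset(entry));
			}
		}
	}

Setting the page dirty before dropping the compressed copy means the
decompressed page is now the only copy, so it can never be silently
reclaimed and later re-read from a backing slot that was never written,
which is exactly the trap you described earlier in the thread.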

Thanks!
--
Kind regards,
Minchan Kim