Re: [PATCH v2] mm: support anonymous stable page

From: Minchan Kim
Date: Mon Nov 21 2016 - 23:43:39 EST


Hi Hugh,

On Mon, Nov 21, 2016 at 07:46:28PM -0800, Hugh Dickins wrote:
> On Mon, 21 Nov 2016, Minchan Kim wrote:
> > From: Minchan Kim <minchan@xxxxxxxxxx>
> > Date: Fri, 11 Nov 2016 15:02:57 +0900
> > Subject: [PATCH v2] mm: support anonymous stable page
> >
> > For developemnt for zram-swap asynchronous writeback, I found
> > strange corruption of compressed page. With investigation, it
> > reveals currently stable page doesn't support anonymous page.
> > IOW, reuse_swap_page can reuse the page without waiting
> > writeback completion so that it can corrupt data during
> > zram compression. It can affect every swap device which supports
> > asynchronous writeback and CRC checking as well as zRAM.
> >
> > Unfortunately, reuse_swap_page should be atomic so that we
> > cannot wait on writeback in there so the approach in this patch
> > is simply return false if we found it needs stable page.
> > Although it increases memory footprint temporarily, it happens
> > rarely and it should be reclaimed easily althoug it happened.
> > Also, It would be better than waiting of IO completion, which
> > is critial path for application latency.
> >
> > Cc: Hugh Dickins <hughd@xxxxxxxxxx>
> > Cc: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
> > Signed-off-by: Minchan Kim <minchan@xxxxxxxxxx>
>
> Acked-by: Hugh Dickins <hughd@xxxxxxxxxx>

Thanks!

>
> Looks good, thanks: we can always optimize away that little overhead
> in the PageWriteback case, if it ever shows up in someone's testing.

Yeb.

>
> Andrew might ask if we should Cc stable (haha): I think we agree
> that it's a defect we've been aware of ever since stable pages were
> first proposed, but nobody has actually been troubled by it before
> your async zram development: so, you're right to be fixing it ahead
> of your zram changes, but we don't see a call for backporting.

I thought so until I see your comment. However, I checked again
and found it seems a ancient bug since zram birth.
swap_writepage unlock the page right before submitting bio while
it keeps the lock during rw_page operation during bdev_write_page.
So, if zram_rw_page fails(e.g, -ENOMEM) and then fallback to
submit_bio in __swap_writepage, the problem can occur.

Hmm, I will resend patchset with zram fix part with marking
the stable.

Thanks, Hugh!