[PATCH v2] mm: support anonymous stable page

From: Minchan Kim
Date: Fri Nov 11 2016 - 01:02:57 EST


For developemnt for zram-swap asynchronous writeback, I found
strange corruption of compressed page. With investigation, it
reveals currently stable page doesn't support anonymous page.
IOW, reuse_swap_page can reuse the page without waiting
writeback completion so that it can corrupt data during
zram compression. It can affect every swap device which supports
asynchronous writeback and CRC checking as well as zRAM.

Unfortunately, reuse_swap_page should be atomic so that we
cannot wait on writeback in there so the approach in this patch
is simply return false if we found it needs stable page.
Although it increases memory footprint temporarily, it happens
rarely and it should be reclaimed easily althoug it happened.
Also, It would be better than waiting of IO completion, which
is critial path for application latency.

Cc: Hugh Dickins <hughd@xxxxxxxxxx>
Cc: Darrick J. Wong <darrick.wong@xxxxxxxxxx>
Signed-off-by: Minchan Kim <minchan@xxxxxxxxxx>
---
* from v1
* use swap_info_struct instead of swapper_space->host inode - Hugh

include/linux/swap.h | 3 ++-
mm/swapfile.c | 20 +++++++++++++++++++-
2 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index a56523cefb9b..55ff5593c193 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -150,8 +150,9 @@ enum {
SWP_FILE = (1 << 7), /* set after swap_activate success */
SWP_AREA_DISCARD = (1 << 8), /* single-time swap area discards */
SWP_PAGE_DISCARD = (1 << 9), /* freed swap page-cluster discards */
+ SWP_STABLE_WRITES = (1 << 10), /* no overwrite PG_writeback pages */
/* add others here before... */
- SWP_SCANNING = (1 << 10), /* refcount in scan_swap_map */
+ SWP_SCANNING = (1 << 11), /* refcount in scan_swap_map */
};

#define SWAP_CLUSTER_MAX 32UL
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 2210de290b54..66bc330c0b65 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -943,11 +943,25 @@ bool reuse_swap_page(struct page *page, int *total_mapcount)
count = page_trans_huge_mapcount(page, total_mapcount);
if (count <= 1 && PageSwapCache(page)) {
count += page_swapcount(page);
- if (count == 1 && !PageWriteback(page)) {
+ if (count != 1)
+ goto out;
+ if (!PageWriteback(page)) {
delete_from_swap_cache(page);
SetPageDirty(page);
+ } else {
+ swp_entry_t entry;
+ struct swap_info_struct *p;
+
+ entry.val = page_private(page);
+ p = swap_info_get(entry);
+ if (p->flags & SWP_STABLE_WRITES) {
+ spin_unlock(&p->lock);
+ return false;
+ }
+ spin_unlock(&p->lock);
}
}
+out:
return count <= 1;
}

@@ -2447,6 +2461,10 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags)
error = -ENOMEM;
goto bad_swap;
}
+
+ if (bdi_cap_stable_pages_required(inode_to_bdi(inode)))
+ p->flags |= SWP_STABLE_WRITES;
+
if (p->bdev && blk_queue_nonrot(bdev_get_queue(p->bdev))) {
int cpu;

--
2.7.4