[PATCH 1/2] mm/khugepaged: set THP as uptodate earlier for shmem

From: David Stevens
Date: Tue Feb 14 2023 - 02:57:40 EST


From: David Stevens <stevensd@xxxxxxxxxxxx>

In collapse_file(), mark the THP as up-to-date before inserting it into
the page cache. This fixes a race where folio_seek_hole_data() would
mistake the THP for an fallocated but unwritten page. The race is
visible to userspace: data temporarily disappears from
SEEK_DATA/SEEK_HOLE results, which can cause data loss for applications
that use lseek() to efficiently snapshot sparse shmem files.

Fixes: f3f0e1d2150b ("khugepaged: add support of collapse for tmpfs/shmem pages")
Signed-off-by: David Stevens <stevensd@xxxxxxxxxxxx>
---
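For context, a minimal userspace sketch of the kind of lseek-based extent
walk the commit message refers to. This is an illustration only, not part
of the patch: count_data_bytes() is a hypothetical helper name, and a real
snapshot tool would copy each [data, hole) range rather than just print
and count it. If the collapse race makes a range transiently look like a
hole, a walker of this shape would skip it.

```c
#define _GNU_SOURCE	/* exposes SEEK_DATA and SEEK_HOLE on glibc */
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the patch): walk the data extents of fd,
 * print each one, and return the total number of bytes reported as data.
 * Returns -1 on error.
 */
static off_t count_data_bytes(int fd)
{
	off_t pos = 0, total = 0;

	assert(fd >= 0);

	for (;;) {
		/* Find the next offset >= pos that contains data. */
		off_t data = lseek(fd, pos, SEEK_DATA);
		if (data < 0) {
			/* ENXIO means there is no data at or after pos. */
			return errno == ENXIO ? total : -1;
		}

		/* Find the end of this data extent (EOF counts as a hole). */
		off_t hole = lseek(fd, data, SEEK_HOLE);
		if (hole < 0)
			return -1;
		if (hole <= data)
			return -1;	/* file changed under us; give up */

		printf("data: [%lld, %lld)\n", (long long)data, (long long)hole);
		total += hole - data;
		pos = hole;
	}
}
```

On filesystems without real SEEK_DATA/SEEK_HOLE support, lseek reports the
whole file as a single data extent, so the helper still terminates but
finds no holes.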
mm/khugepaged.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 79be13133322..b648f1053d95 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1779,10 +1779,13 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr,
 	hpage->mapping = mapping;
 
 	/*
-	 * At this point the hpage is locked and not up-to-date.
-	 * It's safe to insert it into the page cache, because nobody would
-	 * be able to map it or use it in another way until we unlock it.
+	 * Mark hpage as up-to-date before inserting it into the page cache to
+	 * prevent it from being mistaken for an fallocated but unwritten page.
+	 * Inserting the unfinished hpage into the page cache is safe because
+	 * it is locked, so nobody can map it or use it in another way until we
+	 * unlock it.
 	 */
+	SetPageUptodate(hpage);
 
 	xas_set(&xas, start);
 	for (index = start; index < end; index++) {
--
2.39.1.581.gbfd45094c4-goog