[PATCH 3/3] Fix fsync-vs-write misbehavior

From: Mikulas Patocka
Date: Sun Oct 05 2008 - 18:17:47 EST


Fix a violation of sync()/fsync() semantics. The previous code wrote at
most mapping->nrpages * 2 pages. Because pages can be created while
__filemap_fdatawrite_range is in progress, that limit can be used up
before the originally dirty pages are reached. Example: the address space
holds two pages with indices 4 and 5, both dirty. Someone calls
__filemap_fdatawrite_range, which sets .nr_to_write = 4. Meanwhile, another
process creates dirty pages 0, 1, 2, 3. __filemap_fdatawrite_range writes
pages 0, 1, 2, 3, finds that it has reached the limit, and exits.

Result: pages that were dirty before __filemap_fdatawrite_range was
invoked were not written.

With the starvation protection from the previous patch, the
mapping->nrpages * 2 limit is no longer needed for data integrity writes.
It is kept for WB_SYNC_NONE writeback.
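
The scenario above can be replayed in a small userspace sketch. This is
not kernel code: struct sim_mapping, the sim_* functions and the
SIM_SYNC_* constants are hypothetical stand-ins invented for illustration,
and the toy writeback loop simply walks page indices in order, which is
an assumption about the order in which pages are visited.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define SIM_NPAGES 8

/* Hypothetical stand-ins for WB_SYNC_NONE / WB_SYNC_ALL. */
enum sim_sync_mode { SIM_SYNC_NONE, SIM_SYNC_ALL };

/* Toy address space: dirty[i] is true when page index i is dirty. */
struct sim_mapping {
	bool dirty[SIM_NPAGES];
	long nrpages;
};

/*
 * Pick the write budget. With patched == false this mimics the old
 * behaviour (always nrpages * 2); with patched == true it mimics the
 * selection made by this patch.
 */
static long sim_nr_to_write(const struct sim_mapping *m,
			    enum sim_sync_mode mode, bool patched)
{
	if (patched && mode != SIM_SYNC_NONE)
		return LONG_MAX;
	return m->nrpages * 2;
}

/* Clean dirty pages in index order until the budget is used up. */
static long sim_writeback(struct sim_mapping *m, long nr_to_write)
{
	long written = 0;

	assert(nr_to_write >= 0);
	for (int i = 0; i < SIM_NPAGES && nr_to_write > 0; i++) {
		if (m->dirty[i]) {
			m->dirty[i] = false;
			nr_to_write--;
			written++;
		}
	}
	return written;
}

/*
 * Replay the example: pages 4 and 5 are dirty, the budget is computed,
 * then another process dirties pages 0..3 before writeback runs.
 * Returns true if pages 4 and 5 ended up written.
 */
static bool sim_run(bool patched)
{
	struct sim_mapping m = { .nrpages = 2 };
	long budget;

	m.dirty[4] = true;
	m.dirty[5] = true;

	/* The budget is fixed before the other process interferes. */
	budget = sim_nr_to_write(&m, SIM_SYNC_ALL, patched);

	/* Concurrent writer creates dirty pages 0, 1, 2, 3. */
	for (int i = 0; i < 4; i++)
		m.dirty[i] = true;
	m.nrpages += 4;

	sim_writeback(&m, budget);
	return !m.dirty[4] && !m.dirty[5];
}
```

In this sketch sim_run(false) returns false, because the budget of 4 is
spent on pages 0-3 and pages 4 and 5 stay dirty, while sim_run(true)
returns true.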

Signed-off-by: Mikulas Patocka <mpatocka@xxxxxxxxxx>

---
mm/filemap.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)

Index: linux-2.6.27-rc7-devel/mm/filemap.c
===================================================================
--- linux-2.6.27-rc7-devel.orig/mm/filemap.c 2008-09-24 14:47:01.000000000 +0200
+++ linux-2.6.27-rc7-devel/mm/filemap.c 2008-09-24 15:01:23.000000000 +0200
@@ -202,6 +202,11 @@ static int sync_page_killable(void *word
* opposed to a regular memory cleansing writeback. The difference between
* these two operations is that if a dirty page/buffer is encountered, it must
* be waited upon, and not just skipped over.
+ *
+ * Because new dirty pages can be created while this is executing, the
+ * mapping->nrpages * 2 limit is unsafe. If we are doing a data integrity
+ * write, we must write all the pages. The AS_STARVATION bit will eventually
+ * prevent creating more dirty pages to avoid starvation.
*/
int __filemap_fdatawrite_range(struct address_space *mapping, loff_t start,
loff_t end, int sync_mode)
@@ -209,7 +214,7 @@ int __filemap_fdatawrite_range(struct ad
int ret;
struct writeback_control wbc = {
.sync_mode = sync_mode,
- .nr_to_write = mapping->nrpages * 2,
+ .nr_to_write = sync_mode == WB_SYNC_NONE ? mapping->nrpages * 2 : LONG_MAX,
.range_start = start,
.range_end = end,
};