Re: BUG: Failure to send REQ_FLUSH on unmount on ext3, ext4, and FSin general

From: Alex Bligh
Date: Mon May 23 2011 - 14:50:49 EST


Christoph,

--On 23 May 2011 13:52:04 -0400 Christoph Hellwig <hch@xxxxxxxxxxxxx> wrote:

On Mon, May 23, 2011 at 06:39:23PM +0100, Alex Bligh wrote:
I'm presuming that if just umount() were altered to do a REQ_FLUSH,
the potential presence of 2 sync()s would not be too offensive, as
unmount isn't exactly time critical, and as Christoph pointed out in
the other thread, a REQ_FLUSH when the write cache has recently been
emptied isn't going to take long.

Umount actually is the only place where adding it generically makes
sense. It's not time-critical, and with kill_block_super we actually
have a block specific place to put it, instead of having to hack
it into the generic VFS, which is something we've been trying to avoid.

You mean like this (completely untested)?

diff --git a/fs/super.c b/fs/super.c
index 8a06881..a86201a 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -852,6 +852,7 @@ void kill_block_super(struct super_block *sb)
bdev->bd_super = NULL;
generic_shutdown_super(sb);
sync_blockdev(bdev);
+ blkdev_issue_flush(bdev, GFP_KERNEL, NULL);
WARN_ON_ONCE(!(mode & FMODE_EXCL));
blkdev_put(bdev, mode | FMODE_EXCL);
}


One thing I am puzzled by is that blkdev_fsync unconditionally
calls blkdev_issue_flush, but no amount of fsync(), sync() or
whatever generates any REQ_FLUSH traffic. The only explanation
I can guess at for that is that blkdev_issue_flush is a NOOP
if the driver doesn't have a make_request_function:

/*
* some block devices may not have their queue correctly set up here
* (e.g. loop device without a backing file) and so issuing a flush
* here will panic. Ensure there is a request function before issuing
* the flush.
*/
if (!q->make_request_fn)
return -ENXIO;

According to Documentation/block/writeback_cache_control.txt, drivers
with a request_fn are still meant to get REQ_FLUSH etc. provided
they have done:

blk_queue_flush(sdkp->disk->queue, REQ_FLUSH);

So should that read (again untested) as follows:

diff --git a/block/blk-flush.c b/block/blk-flush.c
index 6c9b5e1..3a6d4bd 100644
--- a/block/blk-flush.c
+++ b/block/blk-flush.c
@@ -408,7 +408,8 @@ int blkdev_issue_flush(struct block_device *bdev, gfp_t gfp_mask,
* here will panic. Ensure there is a request function before issuing
* the flush.
*/
- if (!q->make_request_fn)
+ if (!q->make_request_fn &&
+ !(q->request_fn && (q->flush_flags & REQ_FLUSH)))
return -ENXIO;

bio = bio_alloc(gfp_mask, 0);


Ah, fsdevel not here. OK. Partly I'd like to understand whether
sync() not flushing write caches on barrier-less file systems
is a good thing or a bad thing. I know barriers are better, but if
writing to (e.g.) FAT32, I'm betting there is little prospect of
barrier support.

"Barrier" support it's gone. It's really just the FUA and FLUSH
flags these days.

Sorry - slack terminology on my part.

--
Alex Bligh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/