Re: [GIT PULL] block/io bits for 2.6.35-rc

From: Mark Lord
Date: Sun Jun 27 2010 - 19:10:18 EST


On 10/06/10 12:44 PM, Brian Bloniarz wrote:
> On 06/10/2010 12:25 PM, Jens Axboe wrote:
>> On 2010-06-10 17:55, Linus Torvalds wrote:
>>> On Thu, Jun 10, 2010 at 6:44 AM, Jens Axboe <jaxboe@xxxxxxxxxxxx> wrote:
>>>>
>>>> - A set of patches fixing the WB_SYNC_NONE writeback from Christoph. So
>>>> we should finally have both functional and working WB_SYNC_NONE from
>>>> umount context.
>>>
>>> I _really_ think this is too late, considering how broken it has been.
>>> We already reverted the WB_SYNC_NONE things exactly because it didn't
>>> work, didn't we? I'm going to be off-line in two days, and this part
>>> of the pull request really makes me nervous, if only simply because of
>>> the history of it all (ie it's always been broken, why shouldn't it be
>>> broken now?).
>>>
>>> IOW, that's a lot of scary changes, that have historically not been
>>> safe or sufficiently tested, and have caused problems for various
>>> filesystems. Convince me why they should suddenly be ok to merge?
>>
>> I agree, it's late and it makes me nervous too. I had them cook for
>> a day, didn't see any problems. And Christoph would not send it in
>> unless it passes at least xfs qa, which is what found the problems
>> last time (the ones we reverted).
>>
>> It's fixing a regression where umount takes a LONG time if you have
>> a lot of dirty inodes, since it basically degenerates to a data
>> integrity writeback instead of a simple WB_SYNC_NONE. If it wasn't
>> fixing a nasty regression (the distros are all wanting a real fix
>> for this, it's a user problem), I would not be submitting this code
>> at this point in time.
>
> Reinforcing that last point: from what I could figure out, Fedora 13
> is shipping the buggy WB_SYNC_NONE patch currently. Ubuntu 10.04 is
> shipping an in-kernel workaround that has serious performance
> drawbacks.
>
> https://bugzilla.kernel.org/show_bug.cgi?id=15906 has links to the
> downstream bugs.
..

Jens, this bug has been biting my servers badly here for the past
few months -- umount after a backup (from ext4 to ext4) takes 3-4 minutes
instead of the expected 3-4 seconds.

Is there a patch file for this against 2.6.34 that I (and others) could use?

Thanks