Re: ext3_ordered_writepage() questions

From: Suparna Bhattacharya
Date: Fri Mar 17 2006 - 21:55:10 EST


On Fri, Mar 17, 2006 at 05:22:13PM -0500, Stephen C. Tweedie wrote:
> Hi,
>
> On Fri, 2006-03-17 at 13:32 -0800, Badari Pulavarty wrote:
>
> > I have a patch which eliminates adding buffers to the journal, if
> > we are doing just re-write of the disk block. ...
>
> >         2.6.16-rc6    2.6.16-rc6+patch
> > real    0m6.606s      0m3.705s
>
> OK, that's a really significant win! What exactly was the test case for
> this, and does that performance edge persist for a longer-running test?
>
> > In real world, does this ordering guarantee matter ?
>
> Not that I am aware of. Even with the ordering guarantee, there is
> still no guarantee of the order in which the writes hit disk within that
> transaction, which makes it hard to depend on it.
>
> I recall that some versions of fsync depended on ordered mode flushing
> dirty data on transaction commit, but I don't think the current
> ext3_sync_file() will have any problems there.
>
> Other than that, the only thing I can think of that had definite
> dependencies in this area was InterMezzo, and that's no longer in the
> tree. Even then, I'm not 100% certain that InterMezzo had a dependency
> for overwrites (it was certainly strongly dependent on the ordering
> semantics for allocates.)

Besides, we seem to have already broken the guarantee in async DIO
writes for the overwrite case.
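
To make the "overwrite case" concrete, here is a minimal userspace sketch,
not the test case from the patch discussion. It assumes a 4096-byte block
size, and the helper names classify_write() and rewrite_block() are
hypothetical. It treats a write as an overwrite when it lies entirely
within the current file size, and ignores holes in sparse files, where
blocks inside the file size may still need allocating.

```python
import os

BLOCK = 4096  # assumed filesystem block size; not taken from the thread


def classify_write(path, offset, length):
    """Return 'overwrite' if the write stays within the current file size,
    otherwise 'extend'. Holes in sparse files are not detected."""
    size = os.stat(path).st_size
    return "overwrite" if offset + length <= size else "extend"


def rewrite_block(path, block_no, payload):
    """Overwrite one existing block in place and flush it to disk."""
    fd = os.open(path, os.O_WRONLY)
    try:
        # pwrite() writes at an explicit offset without moving the file position.
        os.pwrite(fd, payload, block_no * BLOCK)
        os.fsync(fd)
    finally:
        os.close(fd)
```

A write classified as 'overwrite' here changes file data only, with no new
block allocation, which is the kind of write the patch under discussion
stops adding to the journal.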

Regards
Suparna

>
> It is theoretically possible to write applications that depend on that
> ordering, but they would be necessarily non-portable anyway. I think
> relaxing it is fine, especially for a 100% (wow) performance gain.
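
For reference, the two "real" timings quoted earlier can be turned into a
speedup figure as follows. This is only arithmetic on the numbers posted
in the thread; the variable names are illustrative.

```python
before = 6.606  # seconds, 2.6.16-rc6
after = 3.705   # seconds, 2.6.16-rc6+patch

speedup = before / after                # how many times faster the run is
time_saved = (before - after) / before  # fraction of wall time removed
throughput_gain = speedup - 1           # extra work per unit time

print(f"speedup: {speedup:.2f}x")
print(f"time saved: {time_saved:.0%}")
print(f"throughput gain: {throughput_gain:.0%}")
```

On these numbers the patched run takes a little over half the original
time, so the gain is close to, but somewhat under, a doubling of throughput.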
>
> There is one other perspective to be aware of, though: the current
> behaviour means that by default ext3 generally starts flushing pending
> writeback data within 5 seconds of a write. Without that, we may end up
> accumulating a lot more dirty data in memory, shifting the task of write
> throttling from the filesystem to the VM.
>
> That's not a problem per se, just a change of behaviour to keep in mind,
> as it could expose different corner cases in the performance of
> write-intensive workloads.
>
> --Stephen

--
Suparna Bhattacharya (suparna@xxxxxxxxxx)
Linux Technology Center
IBM Software Lab, India
