Re: Commit 31a12666d8f0c22235297e1c1575f82061480029 slows down Berkeley DB

From: Zhang, Yanmin
Date: Mon Feb 02 2009 - 20:54:50 EST


On Tue, 2009-02-03 at 12:24 +1100, Nick Piggin wrote:
> On Friday 30 January 2009 12:23:15 Jan Kara wrote:
> > Hi,
> >
> > today I found that commit 31a12666d8f0c22235297e1c1575f82061480029 (mm:
> > write_cache_pages cyclic fix) slows down operations over Berkeley DB.
> > Without this "fix", I can add 100k entries in about 5 minutes 30s, with
> > that change it takes about 20 minutes.
> > What is IMO happening is that previously we scanned to the end of file,
> > we left writeback_index at the end of file and went to write next file.
> > With the fix, we wrap around (seek) and after writing some more we go
> > to next file (seek again).
We also found that this commit causes a regression of about 40~50% with
iozone mmap-rand-write.
# iozone -B -r 4k -s 64k -s 512m -s 1200m

My machine has 8GB of memory.

> Hmm, but isn't that what pdflush has asked for? It is wanting to flush
> some of the dirty data out of this file, and hence it wants to start
> from where it last flushed out and then cycle back and flush more?
>
>
> > Anyway, I think the original semantics of "cyclic" makes more sense, just
> > the name was chosen poorly. What we should do is really scan to the end of
> > file, reset index to start from the beginning next time and go for the next
> > file.
>
> Well, if we think of a file as containing a set of dirty pages (as it
> appears to the high level mm), rather than a sequence, then behaviour
> of my patch is correct (ie. there should be no distinction between dirty
> pages at different offsets in the file).
>
> However, clearly there is some problem with that assumption if you're
> seeing a 4x slowdown :P I'd really like to know how it messes up the IO
> patterns. How many files in the BDB workload? Are filesystem blocks
> being allocated at the end of the file while writeout is happening?
> Delayed allocation?
>
>
> > I can write a patch to introduce these semantics but I'd like to hear
> > opinions of other people before I do so.
>
> I like dirty page cleaning to be offset agnostic as far as possible,
> but I can't argue with numbers like that. Though maybe it would be
> possible to solve it some other way.
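
To make the two scan orders under discussion concrete, here is a minimal
user-space sketch in C. It is not the kernel's write_cache_pages();
struct sim_file, NPAGES, scan_range(), writeback_to_eof() and
writeback_wrap() are hypothetical names invented for this illustration, and
the budget argument only loosely stands in for a per-call write limit.
writeback_to_eof() models the semantics Jan proposes (scan to the end of the
file, restart from the beginning next time), while writeback_wrap() models
wrapping around within the same call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NPAGES 16 /* hypothetical file size in pages (assumption) */

/* Toy stand-in for a file's page cache state; not a kernel structure. */
struct sim_file {
    bool dirty[NPAGES];     /* one dirty flag per page */
    size_t writeback_index; /* where the next scan starts */
};

/*
 * "Write" (clean) dirty pages in [start, end) until the budget runs out.
 * Returns the index at which the scan stopped.
 */
static size_t scan_range(struct sim_file *f, size_t start, size_t end,
                         int *budget, int *written)
{
    size_t i = start;

    while (i < end && *budget > 0) {
        if (f->dirty[i]) {
            f->dirty[i] = false;
            (*budget)--;
            (*written)++;
        }
        i++;
    }
    return i;
}

/*
 * Scan from writeback_index to end of file only. If the end is reached,
 * reset the index so the next call starts from page 0.
 */
static int writeback_to_eof(struct sim_file *f, int budget)
{
    int written = 0;
    size_t stop;

    assert(f->writeback_index <= NPAGES);
    stop = scan_range(f, f->writeback_index, NPAGES, &budget, &written);
    f->writeback_index = (stop == NPAGES) ? 0 : stop;
    return written;
}

/*
 * Scan from writeback_index to end of file, then wrap around in the same
 * call and continue from page 0 up to the original starting point.
 */
static int writeback_wrap(struct sim_file *f, int budget)
{
    int written = 0;
    size_t start = f->writeback_index;
    size_t stop;

    assert(start <= NPAGES);
    stop = scan_range(f, start, NPAGES, &budget, &written);
    if (stop == NPAGES && budget > 0 && start > 0)
        stop = scan_range(f, 0, start, &budget, &written); /* the wrap */
    f->writeback_index = (stop == NPAGES) ? 0 : stop;
    return written;
}
```

With dirty pages at 2, 3, 12 and 13, the index at 10 and a budget of 8, the
first variant writes only pages 12 and 13 before moving on to the next file,
while the second also goes back for pages 2 and 3 in the same pass, which is
the extra seek described above.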

