Re: Linux 2.6.29

From: Linus Torvalds
Date: Fri Apr 03 2009 - 11:11:47 EST

On Fri, 3 Apr 2009, Chris Mason wrote:

> On Thu, 2009-04-02 at 20:34 -0700, Linus Torvalds wrote:
> >
> > Well, one rather simple explanation is that if you hadn't been doing lots
> > of writes, then the background garbage collection on the Intel SSD gets
> > ahead of the game, and gives you lots of bursty nice write bandwidth due
> > to having nicely compacted and pre-erased blocks.
> >
> > Then, after lots of writing, all the pre-erased blocks are gone, and you
> > are down to a steady state where it needs to GC and erase blocks to make
> > room for new writes.
> >
> > So that part doesn't surprise me per se. The Intel SSDs definitely
> > fluctuate a bit timing-wise (but I love how they never degenerate to the
> > "ooh, that _really_ sucks" case that the other SSDs and the rotational
> > media I've seen do when you do random writes).
> >
>
> 23MB/s seems a bit low though, I'd try with O_DIRECT. ext3 doesn't do
> writepages, and the ssd may be very sensitive to smaller writes (what
> brand?)

I didn't realize that Jeff had a non-Intel SSD.

THAT sure explains the huge drop-off. I do see Intel SSDs fluctuating
too, but the Intel ones tend to be _fairly_ stable.

> > The fact that it also happens for the regular disk does imply that it's
> > not the _only_ thing going on, though.
>
> Jeff, if you blktrace it I can make up a seekwatcher graph. My bet is
> that pdflush is stuck writing the indirect blocks, and doing a ton of
> seeks.
>
> You could change the overwrite program to also do sync_file_range on the
> block device ;)

Actually, that won't help. 'sync_file_range()' works only on the virtually
indexed page cache, and I think ext3 uses "struct buffer_head *" for all
its metadata updates (due to how JBD works). So sync_file_range() will do
nothing at all to the metadata, regardless of what mapping you execute it
on.
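
To make that concrete, here is a minimal sketch of what the overwrite
program could do per write. The helper name overwrite_and_flush() is made
up for illustration, and this is not the actual test program from this
thread. It pushes out only the data pages of the given range with
sync_file_range(), then falls back to fsync() so that the filesystem gets
a chance to write its metadata through its own path.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* needed for sync_file_range() */
#endif
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: overwrite len bytes at offset off, then flush.
 * Returns 0 on success, -1 on error (errno is set by the failing call).
 */
int overwrite_and_flush(int fd, const void *buf, size_t len, off_t off)
{
    const char *p = buf;
    size_t done = 0;

    assert(buf != NULL);

    /* pwrite() may write less than asked for, so loop until done. */
    while (done < len) {
        ssize_t n = pwrite(fd, p + done, len - done, off + (off_t)done);
        if (n < 0)
            return -1;
        done += (size_t)n;
    }

    /*
     * Start writeback of the dirty page cache pages in [off, off + len)
     * and wait for it to finish. This covers file data only; it does
     * not write out the file's metadata.
     */
    if (sync_file_range(fd, off, (off_t)len,
                        SYNC_FILE_RANGE_WAIT_BEFORE |
                        SYNC_FILE_RANGE_WRITE |
                        SYNC_FILE_RANGE_WAIT_AFTER) != 0)
        return -1;

    /* fsync() is what asks the filesystem to flush metadata as well. */
    if (fsync(fd) != 0)
        return -1;

    return 0;
}
```

The fsync() at the end is the expensive part, of course, so whether this
is acceptable depends on what the benchmark is trying to measure.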

Linus