Re: [PATCH 02/11] writeback: switch to per-bdi threads for flushing data

From: Jens Axboe
Date: Tue May 19 2009 - 13:56:58 EST


On Tue, May 19 2009, Richard Kennedy wrote:
> On Tue, 2009-05-19 at 14:23 +0200, Jens Axboe wrote:
> > On Tue, May 19 2009, Richard Kennedy wrote:
> > > Jens Axboe wrote:
> > > > This gets rid of pdflush for bdi writeout and kupdated style cleaning.
> > > > <snip>
> > > > index 2296ff4..76269f8 100644
> > > > --- a/mm/page-writeback.c
> > > > +++ b/mm/page-writeback.c
> > > > @@ -541,7 +530,7 @@ static void balance_dirty_pages(struct address_space *mapping)
> > > > * been flushed to permanent storage.
> > > > */
> > > > if (bdi_nr_reclaimable) {
> > > > - writeback_inodes(&wbc);
> > > > + generic_sync_bdi_inodes(NULL, &wbc);
> > > > pages_written += write_chunk - wbc.nr_to_write;
> > > > get_dirty_limits(&background_thresh, &dirty_thresh,
> > > > &bdi_thresh, bdi);
> > > > @@ -592,7 +581,7 @@ static void balance_dirty_pages(struct address_space *mapping)
> > > > (!laptop_mode && (global_page_state(NR_FILE_DIRTY)
> > > > + global_page_state(NR_UNSTABLE_NFS)
> > > > > background_thresh)))
> > > > - pdflush_operation(background_writeout, 0);
> > > > + bdi_start_writeback(bdi, NULL, 0);
> > > > }
> > > >
> > > Hi Jens,
> > >
> > > I'm interested in this slight change of behaviour: when over the
> > > background dirty limit, background_writeout will write any dirty pages,
> > > while bdi_start_writeback writes only pages for the current bdi. Are
> > > there any benefits in making this change?
> > >
> > > Thinking about the case of 2 apps writing to different bdis. When app A
> > > stops writing, then next time app B goes over the background dirty
> > > threshold it will only be able to write its own pages, leaving any from
> > > app A dirty until they reach their age limit.
> >
> > The function in question balances dirty pages against a specific address
> > space, which has a specific mapping. The async part of the background
> > writeout could be global as you mention. The whole thing is a bit weird
> > in balance_dirty_pages(), for instance it checks for writeout against a
> > given queue then proceeds to do a global writeout if not busy. At least
> > it's consistent now.
> >
> > > So we may be keeping dirty pages for the app that's finished longer than
> > > necessary. Keeping pages for a finished app while flushing pages from a
> > > running app seems a bit strange. I guess this is an odd corner case and
> > > may not be worth worrying about, but I'd be interested to hear what you
> > > think.
> >
> > The kupdated() initiated background writeout will take care of that, if
> > nobody does a sync on that data first. If nobody is dirtying new data on
> > the given bdi, then it seems perfectly fine to let normal background
> > writeout handle it.
> >
> > > Do you think your new code will require any changes to the per bdi dirty
> > > limits? It may be informative & interesting to run some tests writing to
> > > fast & slow devices at the same time.
> >
> > Generally the code should behave fairly closely to the existing pdflush
> > based code, so I don't think bdi dirty limit tweaking will be necessary.
> > I'd definitely welcome some testing though, particularly slow vs fast as
> > you mention. I've mainly been doing benchmarking to make sure we don't
> > regress on performance, and that has been for fairly similar hardware.
> > Since testing does take a lot of time, it would be nice if someone else
> > would gather their own experiences, especially in areas that have been
> > problematic in the past (slow vs fast devices, for instance!).
>
> Thanks for the explanation.
> I'm definitely going to test this, although I don't have any interesting
> hardware, only a basic workstation. But I'll let you know if I turn up
> anything useful.

Any testing is useful, so go for it.

> balance_dirty_pages() contains Peter Zijlstra's per bdi write throttling
> code and I wonder if it will need tuning for best performance with your
> changes, just because some of its assumptions may have changed. I'll run
> some tests here and see what happens. Peter may have some insight and
> possibly useful test cases.

I'm assuming those are sitting in -mm? I'll take a look.
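
To make the behaviour change discussed earlier in this thread concrete, here
is a rough userspace sketch. It is a toy model only, not kernel code: the
struct toy_bdi type and all toy_* functions are hypothetical names invented
for illustration, and a "bdi" is reduced to a bare count of dirty pages. It
just contrasts a global background writeout, which may clean pages on any
device, with a per-bdi one, which touches only the device that went over the
threshold.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a backing device: only a dirty page count (assumption). */
struct toy_bdi {
	const char *name;
	long nr_dirty;
};

/* Write back up to nr_to_write pages from one device; return pages written. */
static long toy_writeback_bdi(struct toy_bdi *bdi, long nr_to_write)
{
	long n;

	assert(nr_to_write >= 0);
	n = bdi->nr_dirty < nr_to_write ? bdi->nr_dirty : nr_to_write;
	bdi->nr_dirty -= n;
	return n;
}

/*
 * Models the old pdflush-style background writeout: walk every device
 * until the page budget is used up, regardless of who triggered it.
 */
static long toy_background_writeout_global(struct toy_bdi *bdis,
					   size_t nr_bdis, long nr_to_write)
{
	long written = 0;
	size_t i;

	for (i = 0; i < nr_bdis && written < nr_to_write; i++)
		written += toy_writeback_bdi(&bdis[i], nr_to_write - written);
	return written;
}

/*
 * Models the per-bdi variant: only the device whose writer crossed the
 * background threshold is flushed; other devices keep their dirty pages
 * until periodic (kupdated-style) writeout or a sync gets to them.
 */
static long toy_background_writeout_per_bdi(struct toy_bdi *bdi,
					    long nr_to_write)
{
	return toy_writeback_bdi(bdi, nr_to_write);
}
```

With two devices where app A left 100 dirty pages on the first and app B has
50 on the second, the per-bdi call for B's device cleans only B's 50 pages and
leaves A's 100 untouched, while the global variant would go on to clean A's
pages as well. In the real kernel those leftover pages are picked up by the
periodic writeout once they age, as described above.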

--
Jens Axboe
