Re: [VM PATCH 2.6.6-rc3-bk5] Dirty balancing in the presence of mapped pages

From: Nikita Danilov
Date: Wed May 05 2004 - 11:57:55 EST


Andrew Morton writes:
> Nick Piggin <nickpiggin@xxxxxxxxxxxx> wrote:
> >
> > Andrew Morton wrote:
> > > Shantanu Goel <sgoel01@xxxxxxxxx> wrote:
> > >
> > >>Presently the kernel does not collect information
> > >>about the percentage of memory that processes have
> > >>dirtied via mmap until reclamation. Nothing analogous
> > >>to balance_dirty_pages() is being done for mmap'ed
> > >>pages. The attached patch adds collection of dirty
> > >>page information during kswapd() scans and initiation
> > >>of background writeback by waking up bdflush.
> > >
> > >
> > > And what were the effects of this patch?
> > >
> >
> > I have a modified patch from Nikita that does the
> > if (ptep_test_and_clear_dirty) set_page_dirty from
> > page_referenced, under the page_table_lock.
>
> Dude. I have lots of patches too. The question is: what use are they?

Learning patch-scripts? :)

>
> In this case, given that we have an actively mapped MAP_SHARED pagecache
> page, marking it dirty will cause it to be written by pdflush. Even though
> we're not about to reclaim it, and even though the process which is mapping
> the page may well modify it again. This patch will cause additional I/O.

The dirty bit is transferred to the struct page when the page is moved
to the inactive list, where pages supposedly are not referenced/dirtied
frequently. Besides, any additional IO will be done through
->writepages(), which is much more efficient than single-page pageout
from the tail of the inactive list.
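
For reference, the transfer amounts to something like the following in
the rmap scan (a sketch only: page_referenced_one() is simplified
relative to the real mm/rmap.c one, and find_pte() stands in for the
usual pgd/pmd/pte descent rather than being a real helper):

	static int page_referenced_one(struct page *page,
				       struct vm_area_struct *vma)
	{
		struct mm_struct *mm = vma->vm_mm;
		pte_t *pte;
		int referenced = 0;

		spin_lock(&mm->page_table_lock);
		pte = find_pte(mm, vma, page);	/* stand-in for pte walk */
		if (pte) {
			if (ptep_test_and_clear_young(pte))
				referenced++;
			/*
			 * The proposed addition: while page_table_lock
			 * is already held, move the pte dirty bit into
			 * the struct page, so pdflush sees the page
			 * early instead of only at try_to_unmap() time.
			 */
			if (ptep_test_and_clear_dirty(pte))
				set_page_dirty(page);
			pte_unmap(pte);
		}
		spin_unlock(&mm->page_table_lock);
		return referenced;
	}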

>
> So we need to understand why it was written, and what effects were
> observed, with what workload, and all that good stuff.

Another possible scenario where early transfer of the dirty bit could
be useful: a huge file consisting of a single hole is mmapped, and user
level starts dirtying all its pages. The current VM happily thinks that
all memory is clean, and ->writepages() is not invoked. VM scanning
starts, shrink_list() dirties one page at a time
(shrink_list()->try_to_unmap()->set_page_dirty()) and calls
->writepage(), which has to insert meta-data for the hole page (extent,
indirect pointer, whatever) and to submit IO. As the order of pages on
the inactive list corresponds to the order of page faults (i.e., is
random), this results in horrible fragmentation.
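
To make the scenario concrete, user level could look like this (a
userspace sketch; the file name, file size and page size are arbitrary
assumptions):

	#include <fcntl.h>
	#include <stdlib.h>
	#include <sys/mman.h>
	#include <unistd.h>

	int main(void)
	{
		size_t pagesize = 4096;
		size_t size = 1UL << 30;	/* 1GB file, single hole */
		size_t npages = size / pagesize;
		size_t i;
		char *p;
		int fd;

		fd = open("/tmp/hole", O_RDWR | O_CREAT | O_TRUNC, 0644);
		if (fd < 0)
			return 1;
		ftruncate(fd, size);		/* no blocks allocated yet */
		p = mmap(NULL, size, PROT_READ | PROT_WRITE,
			 MAP_SHARED, fd, 0);
		if (p == MAP_FAILED)
			return 1;
		srandom(getpid());
		/* dirty pages in random order, as page faults would */
		for (i = 0; i < npages; i++)
			p[(random() % npages) * pagesize] = 1;
		return 0;
	}

Nothing here is written back until reclaim kicks in, at which point
->writepage() has to allocate blocks for one randomly-chosen page at a
time.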

Nikita.