Re: [PATCH] mm/tracking dirty pages: update get_dirty_limits for mmap tracking

From: Nick Piggin
Date: Wed Jun 21 2006 - 14:07:53 EST


On Wed, Jun 21, 2006 at 10:01:17AM -0700, Nate Diller wrote:
> Update write throttling calculations now that we can track and
> throttle dirty mmap'd pages. A version of this patch has been tested
> with iozone:

Your changelog doesn't tell much about the "why" side of things,
and omits the fact that you have upped the dirty ratio to 80.

>
> http://namesys.com/intbenchmarks/iozone/06.06.19.tracking.dirty.page-noatime_-B/e3-2.6.16-tr.drt.pgs-rt.40_vs_rt.80.html
> http://namesys.com/intbenchmarks/iozone/06.06.19.tracking.dirty.page-noatime_-B/r4-2.6.16-tr.drt.pgs-rt.40_vs_rt.80.html

I'm guessing the reason you get all those red numbers when
iozone files are larger than RAM is because writeout and reclaim
tend to get worse when there are large amounts of dirty pages
floating around in memory?

>
> Signed-off-by: Nate Diller <nate.diller@xxxxxxxxx>
>
> --- linux-2.6.orig/mm/page-writeback.c 2005-10-27 17:02:08.000000000 -0700
> +++ linux-2.6/mm/page-writeback.c 2006-06-21 08:24:11.000000000 -0700
> @@ -69,7 +69,7 @@ int dirty_background_ratio = 10;
> /*
> * The generator of dirty data starts writeback at this percentage
> */
> -int vm_dirty_ratio = 40;
> +int vm_dirty_ratio = 80;
>
> /*
> * The interval between `kupdate'-style writebacks, in centiseconds
> @@ -119,15 +119,14 @@ static void get_writeback_state(struct w
> * Work out the current dirty-memory clamping and background writeout
> * thresholds.
> *
> - * The main aim here is to lower them aggressively if there is a lot of
> mapped
> - * memory around. To avoid stressing page reclaim with lots of
> unreclaimable
> - * pages. It is better to clamp down on writers than to start swapping,
> and
> - * performing lots of scanning.
> - *
> - * We only allow 1/2 of the currently-unmapped memory to be dirtied.
> - *
> - * We don't permit the clamping level to fall below 5% - that is getting
> rather
> - * excessive.
> + * We now have dirty memory accounting for mmap'd pages, so we calculate
> the
> + * ratios based on the available memory. We still have no way of tracking
> + * how many pages are pinned (eg BSD wired accounting), so we still need
> the
> + * hard clamping, but the default has been raised to 80.
> + *
> + * We now allow the ratios to be set to anything, because there is less
> risk
> + * of OOM, and because databases and such will need more flexible tuning,
> + * now that they are being throttled too.
> *
> * We make sure that the background writeout level is below the adjusted
> * clamping level.
> @@ -136,9 +135,6 @@ static void
> get_dirty_limits(struct writeback_state *wbs, long *pbackground, long
> *pdirty,
> struct address_space *mapping)
> {
> - int background_ratio; /* Percentages */
> - int dirty_ratio;
> - int unmapped_ratio;
> long background;
> long dirty;
> unsigned long available_memory = total_pages;
> @@ -155,27 +151,16 @@ get_dirty_limits(struct writeback_state
> available_memory -= totalhigh_pages;
> #endif
>
> -
> - unmapped_ratio = 100 - (wbs->nr_mapped * 100) / total_pages;
> -
> - dirty_ratio = vm_dirty_ratio;
> - if (dirty_ratio > unmapped_ratio / 2)
> - dirty_ratio = unmapped_ratio / 2;
> -
> - if (dirty_ratio < 5)
> - dirty_ratio = 5;
> -
> - background_ratio = dirty_background_ratio;
> - if (background_ratio >= dirty_ratio)
> - background_ratio = dirty_ratio / 2;
> -
> - background = (background_ratio * available_memory) / 100;
> - dirty = (dirty_ratio * available_memory) / 100;
> + background = (dirty_background_ratio * available_memory) / 100;
> + dirty = (vm_dirty_ratio * available_memory) / 100;
> tsk = current;
> if (tsk->flags & PF_LESS_THROTTLE || rt_task(tsk)) {
> background += background / 4;
> - dirty += dirty / 4;
> + dirty += dirty / 8;
> }
> + if (background > dirty)
> + background = dirty;
> +
> *pbackground = background;
> *pdirty = dirty;
> }
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/