Re: [RFC] [PATCH -mm 0/2] memcg: per cgroup dirty_ratio

From: Andrew Morton
Date: Fri Sep 12 2008 - 16:20:17 EST

On Fri, 12 Sep 2008 17:09:50 +0200
Andrea Righi <righi.andrea@xxxxxxxxx> wrote:

> The goal of the patch is to control how many dirty file pages a cgroup can have
> at any given time (see also [1]).
> Dirty file and writeback pages are accounted for each cgroup using the memory
> controller statistics. Moreover, the dirty_ratio parameter is added to the
> memory controller. It contains, as a percentage of the cgroup memory, the
> number of dirty pages at which the processes belonging to the cgroup which are
> generating disk writes will start writing out dirty data.
> So, the behaviour is actually the same as the global dirty_ratio, except that
> it works per cgroup.
> Interface:
> - two new entries "writeback" and "filedirty" are added to the file
> memory.stat, to export to userspace respectively the number of pages under
> writeback and the number of dirty file pages in the cgroup
> - the new file memory.dirty_ratio is added in the cgroup filesystem to show/set
> the memcg dirty_ratio

Seems like a desirable objective.
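
To make the proposed accounting concrete, here is a minimal Python sketch
(not taken from the patch) that parses the two new memory.stat entries and
compares the dirty count against a percentage threshold. The sample values,
the helper names, and the choice to compare only "filedirty" against the
threshold are assumptions made for illustration.

```python
def parse_memory_stat(text):
    """Parse 'name value' lines, the layout memory.stat uses, into a dict."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats


def dirty_threshold_pages(cgroup_memory_pages, dirty_ratio):
    """Dirty-page count at which writers would start writing out data.

    dirty_ratio is an integer percentage of the cgroup memory, as described
    in the patch summary.
    """
    return cgroup_memory_pages * dirty_ratio // 100


# Hypothetical sample contents; the numbers are invented.
SAMPLE_STAT = """\
cache 20000
rss 5000
writeback 120
filedirty 3000
"""

stats = parse_memory_stat(SAMPLE_STAT)
# Assumption: the cgroup is limited to 25600 pages and dirty_ratio is 10.
limit = dirty_threshold_pages(cgroup_memory_pages=25600, dirty_ratio=10)
over_limit = stats["filedirty"] > limit
print(stats["filedirty"], limit, over_limit)
```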

> [ This patch is still experimental and I only did a few quick tests. I'd like
> to run more detailed benchmarks and compare the results; I guess the overhead
> introduced by this patch shouldn't be so small... and BTW I would prefer a
> dirty limit in bytes, instead of using a percentage of memory. Bytes are hugely
> more flexible IMHO: they allow defining more fine-grained limits, and so this
> would work better on large memory machines. ]
> [1]

I tend to duck experimental and rfc patches ;)

One thing to think about please: Michael Rubin is hitting problems with
the existing /proc/sys/vm/dirty_ratio. Its present granularity of 1%
is just too coarse for really large machines, and as
memory-size/disk-speed ratios continue to increase, this will just get
worse.
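
A quick back-of-the-envelope sketch in Python shows how coarse a 1% step
becomes as memory grows. The memory sizes and the 100 MiB/s disk throughput
are assumed figures, and total RAM stands in for whatever amount of memory
the ratio is actually applied to.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def step_bytes(memory_bytes, steps=100):
    """Smallest change in the dirty limit when the knob has `steps` settings."""
    return memory_bytes // steps


def seconds_to_flush(nbytes, disk_bytes_per_sec):
    """Time to write `nbytes` at a steady (assumed) disk throughput."""
    return nbytes / disk_bytes_per_sec


# Assumed machine sizes and disk speed, for illustration only.
ASSUMED_DISK_SPEED = 100 * MIB

for mem_gib in (4, 64, 1024):
    step = step_bytes(mem_gib * GIB)
    secs = seconds_to_flush(step, ASSUMED_DISK_SPEED)
    print(f"{mem_gib:5d} GiB: 1% step = {step / MIB:.0f} MiB, "
          f"about {secs:.1f} s of writeback")
```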

So after thinking about it a bit I encouraged him to propose a patch
which adds a new /proc/sys/vm/hires-dirty-ratio (for some value of
"hires" ;)) which simply offers a higher-resolution interface to the
same internal kernel machinery.

How does this affect you? I don't think we should be adding new
interfaces which have the old 1%-resolution problem. Once we get this
higher-resolution interface sorted out, your new interface should do it
the same way.
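
One way to picture "a higher-resolution interface to the same internal
machinery" is a single stored value with two knobs in front of it. The
Python model below is purely hypothetical: the parts-per-million unit, the
class, and its method names are assumptions, since the resolution of the
proposed interface has not been decided.

```python
class DirtyLimit:
    """Hypothetical model: one internal value, two user-facing knobs."""

    PPM = 1_000_000  # assumed internal resolution (parts per million)

    def __init__(self):
        self._ppm = 0

    def set_percent(self, percent):
        """Old-style knob: whole percent, 0..100."""
        if not 0 <= percent <= 100:
            raise ValueError("percent must be in 0..100")
        self._ppm = percent * (self.PPM // 100)

    def set_hires(self, ppm):
        """Higher-resolution knob writing the same internal value."""
        if not 0 <= ppm <= self.PPM:
            raise ValueError("ppm must be in 0..1000000")
        self._ppm = ppm

    def get_percent(self):
        """Old-style read; rounds down when the hires knob was used."""
        return self._ppm // (self.PPM // 100)

    def threshold_pages(self, memory_pages):
        """Dirty-page threshold derived from the single internal value."""
        return memory_pages * self._ppm // self.PPM


limit = DirtyLimit()
limit.set_percent(10)
print(limit.threshold_pages(1000))
limit.set_hires(2500)  # 0.25%, which the percent knob cannot express
print(limit.threshold_pages(1_000_000), limit.get_percent())
```

A per-cgroup knob built the same way would only need to expose the
higher-resolution setter.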
