Re: [PATCH 1/2] Add mempressure cgroup

From: Andrew Morton
Date: Wed Jan 09 2013 - 15:28:08 EST


On Wed, 9 Jan 2013 18:10:02 +0400
Glauber Costa <glommer@xxxxxxxxxxxxx> wrote:

> On 01/09/2013 01:44 AM, Andrew Morton wrote:
> > On Fri, 4 Jan 2013 00:29:11 -0800
> > Anton Vorontsov <anton.vorontsov@xxxxxxxxxx> wrote:
> >
> >> This commit implements David Rientjes' idea of mempressure cgroup.
> >>
> >> The main characteristics are the same as what I tried to add to the vmevent
> >> API; internally, it uses Mel Gorman's idea of scanned/reclaimed ratio for
> >> pressure index calculation. But we don't expose the index to userland.
> >> Instead, there are three levels of the pressure:
> >>
> >> o low (just reclaiming, e.g. caches are draining);
> >> o medium (allocation cost becomes high, e.g. swapping);
> >> o oom (about to oom very soon).
> >>
> >> The rationale behind exposing levels and not the raw pressure index is
> >> described here: http://lkml.org/lkml/2012/11/16/675
> >>
> >> A task can be in cpuset, memcg and mempressure cgroups at the same
> >> time, so by rearranging the tasks it is possible to watch a specific
> >> pressure (i.e. caused by cpuset and/or memcg).
> >>
> >> Note that while this adds the cgroups support, the code is well separated
> >> and eventually we might add a lightweight, non-cgroups API, i.e. vmevent.
> >> But this is another story.
> >>
> >
> > I'd have thought that it's pretty important to offer this feature to
> > non-cgroups setups. Restricting it to cgroups-only seems a large
> > limitation.
> >
>
> Why is it so, Andrew?
>
> When we talk about "cgroups", we are not necessarily talking about the
> whole beast, with all controllers enabled. Much less are we talking
> about hierarchies being created and tasks being put in them.
>
> It's an interface only. And since all controllers will always have a
> special "root" cgroup, this applies to the tasks in the system all the
> same. At the end of the day, if we have something like
> CONFIG_MEMPRESSURE that selects CONFIG_CGROUPS, the user needs to do the
> same thing to actually turn on the functionality: switch a config
> option. It is not more expensive, and it doesn't bring in anything extra
> either.
>
> To actually use it, one needs to mount the filesystem, and write to a
> file. Nothing else.
>

Oh, OK, well if the feature can be used in a system-wide fashion in
this manner then I guess that is sufficient. For some reason I was
thinking it was tied to memcg, doh.
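
For reference, the level scheme from the quoted patch description can be
sketched in a few lines. This is only an illustration, not the patch's
code: the function name pressure_level(), the index formula (the share of
scanned pages that were not reclaimed, in percent) and the 60/95
thresholds are all assumptions chosen for the example.

```python
def pressure_level(scanned, reclaimed, medium_at=60, oom_at=95):
    """Map reclaim counters to a coarse pressure level name.

    Hypothetical helper: the thresholds and the index formula are
    assumptions for illustration, not values taken from the patch.
    """
    if scanned <= 0:
        return None  # nothing was scanned, so there is no signal

    # Clamp so that bad counters cannot push the index out of 0..100.
    reclaimed = min(max(reclaimed, 0), scanned)

    # Pressure index: percentage of scanned pages that were NOT reclaimed.
    index = 100 - (100 * reclaimed) // scanned

    if index >= oom_at:
        return "oom"
    if index >= medium_at:
        return "medium"
    return "low"
```

The point of returning a name rather than the index is the one made in
the patch description: userland only sees the level, so the kernel stays
free to change how the index is computed.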
