Re: [PATCH RFC cgroup/for-3.7] cgroup: mark subsystems with broken hierarchy support and whine if cgroups are nested for them

From: Tejun Heo
Date: Wed Sep 12 2012 - 13:04:14 EST


Hello, Glauber.

On Wed, Sep 12, 2012 at 01:29:09PM +0400, Glauber Costa wrote:
> Haven't gone through the whole patch yet, and not sure how you actually
> touch memcg in here. And I absolutely know we have discussed this
> before, but I still stand that for the memcg case, in which hierarchy
> can be enabled by a crazy boolean, we should be enabling it somehow. It
> is fine if we don't want to change the default without warning first,
> but a Kconfig option to make this default would really help. We should
> tell everybody with a well defined lifecycle to just enable it.

I'm not really sure how useful the Kconfig would be. I'm not gonna
nack it but am not sure it's useful either. Michal seems to be in the
same boat, so I suppose there's no strong opposition.

I don't think it would make life easier for distros. It may differ
between distros, but in my experience with SUSE, trivial patches
flipping the default aren't a big deal, especially if upstream has a
transition plan in place. The difficulty here is that somebody needs
to assess the situation and preferably make that decision consciously.
The mechanism to do so, be it a one-liner patch or a Kconfig option,
doesn't really matter, and across the transition period we would want
to keep the memcg behavior consistent regardless of which kernel is in
use.

The problem with Kconfig is that we shouldn't enable the new behavior
by default, as that would change the behavior silently, and if we can't
do that, it's just something buried under the sea of config
options. Kconfig or no, we need to coordinate with the distros.

From upstream, my current plan for .use_hierarchy is

* Warn about broken hierarchy usage in increasing verbosity.

* After a couple releases, warn about creating any mem cgroup if
.use_hierarchy == 0 at root.

* After a couple releases, switch .use_hierarchy to 1 by default and
loudly warn on any attempts to set it to zero.

* Rip out flat hierarchy support and fail any attempt to set
.use_hierarchy to 0.

It'll take some months but I don't think it's too crazy and I think
the whole process should take longer than eight months to ensure any
active distro notices it.

Distros should set .use_hierarchy to 1 on mounting memcg. This
probably should happen on a new release w/ accompanying release note.
I'll try to coordinate it at least for the popular ones.
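
As a rough illustration of what "set .use_hierarchy to 1 on mounting"
could look like from userland, here is a minimal sketch. It assumes
the memcg hierarchy is already mounted and that the caller passes its
mount point (for example /sys/fs/cgroup/memory, which is an assumption
and varies between distros). The helper name enable_use_hierarchy() is
made up for this example; only the memory.use_hierarchy file name
comes from memcg itself.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: write "1" to <memcg_root>/memory.use_hierarchy.
 * memcg_root is the mount point of the memory controller, supplied by
 * the caller.  Returns 0 on success, -1 on any failure.
 */
static int enable_use_hierarchy(const char *memcg_root)
{
	char path[4096];
	FILE *f;
	int n, ok;

	assert(memcg_root != NULL);

	n = snprintf(path, sizeof(path), "%s/memory.use_hierarchy", memcg_root);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;		/* path would have been truncated */

	f = fopen(path, "w");
	if (!f)
		return -1;		/* not mounted, or no permission */

	ok = fputs("1\n", f) >= 0;
	if (fclose(f) != 0)		/* write errors can surface at close */
		ok = 0;

	return ok ? 0 : -1;
}
```

This would have to run right after mounting and before any child
cgroups are created; an init script doing the equivalent with echo
would work just as well.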

> > + * It's now diallowed to create nested cgroups if the subsystem is
> typo, disallowed.

Ooh, will fix.

> > + * broken and cgroup core will emit a warning message on such
> > + * cases. Eventually, all subsystems will be made properly
> > + * hierarchical and this will go away.
> > + */
> > + bool broken_hierarchy;
> > + bool warned_broken_hierarchy;
> > +
>
> why do we need the extra bool? Isn't WARN_ON_ONCE() suitable here?

We want to warn once per subsys instead of once for the whole system.
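
WARN_ON_ONCE() keeps its "already warned" state per call site, so a
single call in the cgroup creation path would fire for whichever
subsystem trips it first and then stay quiet for all the others. A
rough userspace sketch of the per-subsys latch follows. The struct and
the check_nesting() helper are simplified stand-ins made up for this
example, not the actual patch code; only the two flag names are taken
from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Simplified stand-in for struct cgroup_subsys; only the two flags
 * from the patch are modelled here. */
struct subsys {
	const char *name;
	bool broken_hierarchy;
	bool warned_broken_hierarchy;
};

/*
 * Hypothetical check run when a cgroup is created.  Returns true if
 * this call emitted a warning, false otherwise.
 */
static bool check_nesting(struct subsys *ss, bool parent_is_root)
{
	assert(ss != NULL);

	if (parent_is_root)		/* direct child of root: not nested */
		return false;
	if (!ss->broken_hierarchy || ss->warned_broken_hierarchy)
		return false;

	fprintf(stderr, "cgroup: %s has broken hierarchy support\n", ss->name);
	ss->warned_broken_hierarchy = true;	/* latch is per subsystem */
	return true;
}
```

With this, nesting under net_cls and later under net_prio produces one
message for each, while repeated nesting under the same subsystem stays
silent.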

> > + /*
> > + * net_prio has artificial limit on the number of cgroups and
> > + * disallows nesting making it impossible to co-mount it with other
> > + * hierarchical subsystems. Remove the artificially low PRIOIDX_SZ
> > + * limit and properly nest configuration such that children follow
> > + * their parents' configurations by default and are allowed to
> > + * override and remove the following.
> > + */
> > + .broken_hierarchy = trye,
> > };
>
> "trye" doesn't seem to be a recognized word.

Yeah, fixed.

> > static int netprio_device_event(struct notifier_block *unused,
> > --- a/net/sched/cls_cgroup.c
> > +++ b/net/sched/cls_cgroup.c
> > @@ -82,6 +82,15 @@ struct cgroup_subsys net_cls_subsys = {
> > #endif
> > .base_cftypes = ss_files,
> > .module = THIS_MODULE,
> > +
> > + /*
> > + * While net_cls cgroup has the rudimentary hierarchy support of
> > + * inheriting the parent's classid on cgroup creation, it doesn't
> > + * properly propagates config changes in ancestors to their
> > + * descendents. A child should follow the parent's configuration
> > + * but be allowed to override it. Fix it and remove the following.
> > + */
> > + .broken_hierarchy = true,
> > };
> >
>
> Since all this cgroup provides is a marking, it is not terribly obvious
> to me what "proper hierarchy" would mean. Input from the authors would
> be strongly advisable here.

Setting a mark on a parent should be reflected on all its children
that have no explicit setting of their own.
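
One way to picture that, as a minimal sketch rather than a proposal
for the actual implementation: resolve the effective classid by
walking towards the root until a cgroup with an explicit setting is
found. The struct, the helper name, and the use of 0 as "not set here"
are all assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a net_cls cgroup.  A classid of 0 is assumed
 * to mean "no explicit setting on this cgroup". */
struct cls_node {
	struct cls_node *parent;
	uint32_t classid;
};

/* Return the nearest explicit classid on the path to the root, or 0
 * if no ancestor has one. */
static uint32_t effective_classid(const struct cls_node *n)
{
	assert(n != NULL);

	for (; n; n = n->parent)
		if (n->classid)
			return n->classid;
	return 0;
}
```

Because the lookup is done at use time, a later change on the parent
is seen by every child that hasn't overridden it, which is the part
that inheriting only at creation time misses.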

Thanks.

--
tejun