Re: [PATCH cgroup/for-3.19-fixes] cgroup: implement cgroup_subsys->unbind() callback

From: Vladimir Davydov
Date: Mon Jan 12 2015 - 08:00:14 EST


On Mon, Jan 12, 2015 at 06:28:45AM -0500, Tejun Heo wrote:
> On Mon, Jan 12, 2015 at 11:01:14AM +0300, Vladimir Davydov wrote:
> > Come to think of it, I wonder how many users actually want to mount
> > different controllers subset after unmount. Because we could allow
>
> It wouldn't be a common use case but, on the face of it, we still
> support it.  If we collectively decide that once a sub cgroup is
> created for any controller no further hierarchy configuration for that
> controller is allowed, that'd work too, but one way or the other, the
> behavior, I believe, should be well-defined. As it currently stands,
> the conditions and failure mode are opaque to userland, which is never
> a good thing.
>
> > mounting the same subset perfectly well, even if it includes memcg. BTW,
> > AFAIU in the unified hierarchy we won't have this problem at all,
> > because by definition it mounts all controllers IIRC, so do we need to
> > bother fixing this in such a complicated manner at all for the setup
> > that's going to be deprecated anyway?
>
> There will likely be a quite long transition period and if and when
> the old things can be removed, this added cleanup logic can go away
> with it. It depends on how complex the implementation would get but
> as long as it isn't too much and stays mostly isolated from the saner
> paths, I think it's probably the right thing to do.

We can't just move kmem objects from a per-memcg kmem_cache to the
global one fixing page counters, because in contrast to page cache and
swap we don't even track all kmem allocations. So we have to keep all
per-memcg kmem_cache's somewhere after unmount until they can finally be
destroyed, but the whole logic behind per-memcg kmem_cache's destruction
is currently tightly interwoven with that of css's (we destroy
kmem_cache's from css_free), and there won't be any css's after unmount.
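
To illustrate the coupling, here is a toy userspace model, not kernel
code. All toy_* names are made up for this sketch, and the assumption
that each live kmem object pins the css is a simplification. The point is
only that the cache is destroyed from the css's free path and from
nowhere else, so it cannot outlive the css:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; none of these are real kernel types. */
struct toy_cache {
	int nr_objects;		/* live objects allocated from this cache */
	bool destroyed;
};

struct toy_css {
	int refcnt;		/* base ref plus one per live kmem object */
	struct toy_cache cache;	/* models a per-memcg kmem_cache */
};

static void toy_css_init(struct toy_css *css)
{
	css->refcnt = 1;
	css->cache.nr_objects = 0;
	css->cache.destroyed = false;
}

/* Models css_free: the only place the per-memcg cache gets destroyed. */
static void toy_css_free(struct toy_css *css)
{
	assert(css->cache.nr_objects == 0);
	css->cache.destroyed = true;
}

static void toy_css_put(struct toy_css *css)
{
	if (--css->refcnt == 0)
		toy_css_free(css);
}

/* Each kmem allocation pins the css (a simplifying assumption). */
static void toy_kmem_alloc(struct toy_css *css)
{
	css->refcnt++;
	css->cache.nr_objects++;
}

static void toy_kmem_free(struct toy_css *css)
{
	css->cache.nr_objects--;
	toy_css_put(css);
}
```

In this model, dropping the base reference while an object is still live
leaves the cache around, and it only goes away once the last object is
freed. If the css's disappear at unmount, nothing is left to drive that
final step.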

That said, it isn't possible to just add a couple of isolated functions
that live their own lives and can be easily removed once we've
switched to the unified hierarchy. Quite the contrary, implementing
kmem reparenting would force me to rethink and complicate kmemcg code all
over the place. That's why I'm rather reluctant to do it.

I haven't dug deep into the cgroup core, but maybe we could detach the
old root in cgroup_kill_sb() and leave it dangling until the last
reference to it is gone?
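
Roughly what I have in mind, as a toy userspace model rather than a
patch. The toy_* names and fields are hypothetical and do not correspond
to the real cgroup_root layout or to what cgroup_kill_sb() does today:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a hierarchy root; not the kernel's cgroup_root. */
struct toy_root {
	int refcnt;	/* the mount's ref plus one per lingering css */
	bool detached;	/* no longer reachable through a superblock */
	bool released;	/* final teardown has run */
};

static void toy_root_init(struct toy_root *root)
{
	root->refcnt = 1;	/* reference held by the mount */
	root->detached = false;
	root->released = false;
}

static void toy_root_get(struct toy_root *root)
{
	root->refcnt++;
}

static void toy_root_put(struct toy_root *root)
{
	/* Teardown happens only when the last reference is dropped. */
	if (--root->refcnt == 0)
		root->released = true;
}

/* What a cgroup_kill_sb()-like path might do under this idea. */
static void toy_kill_sb(struct toy_root *root)
{
	root->detached = true;	/* unhook the root from the mount */
	toy_root_put(root);	/* drop only the mount's reference */
}
```

With a css still pinning the root, toy_kill_sb() leaves it detached but
not released, and the last toy_root_put() finishes the teardown.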

BTW, IIRC the problem has always existed for kmem-active memory cgroups,
because we never had kmem reparenting. Maybe we could therefore just
document somewhere that using kmem accounting in the legacy hierarchy is
highly discouraged, and merge these two patches as is to handle page
cache and swap charges? We won't break anything, because it was always
broken :-)

Thanks,
Vladimir