Re: [PATCH -tip 26/32] sched: Add a second-level tag for nested CGroup usecase

From: Peter Zijlstra
Date: Wed Nov 25 2020 - 08:43:44 EST


On Tue, Nov 17, 2020 at 06:19:56PM -0500, Joel Fernandes (Google) wrote:
> From: Josh Don <joshdon@xxxxxxxxxx>
>
> Google has a use case where the first-level tag on a CGroup is not
> sufficient. So a patch has been carried for years that adds a second tag,
> which is writable by unprivileged users.
>
> Google uses DAC controls so that the 'tag' can be set only by root, while
> the second-level 'color' can be changed by anyone. The actual names that
> Google uses are different, but the concept is the same.
>
> The hierarchy looks like:
>
>     Root group
>        / \
>       A   B    (These are created by the root daemon - borglet).
>      / \   \
>     C   D   E  (These are created by AppEngine within the container).
>
> The reason why Google has two parts is that AppEngine wants to allow a
> subset of subcgroups within a tagged parent cgroup to share execution.
> Think of these subcgroups as belonging to the same customer or project.
> Because these subcgroups are created by AppEngine, they are not tracked by
> borglet (the root daemon), therefore borglet won't have a chance to set a
> color for them. That's where the 'color' file comes from. Color can be set
> by AppEngine, and once set, the normal tasks within the subcgroup are not
> able to overwrite it. This is enforced by promoting the permission of the
> color file in cgroupfs.

Why can't the above work by setting 'tag' (that's a terrible name, why
does that still live) in C, D and E? Have the most specific tag live. Same
with that thread stuff.
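
To make "most specific tag" concrete, here is a rough sketch of the lookup
I have in mind, using the hierarchy quoted above. It is illustrative Python,
not kernel code: the Group class, effective_tag() and the tag values are all
made up for the example, and it assumes an unset tag simply means "inherit
from the nearest tagged ancestor".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Group:
    """Hypothetical stand-in for a cgroup; not a kernel structure."""
    name: str
    parent: Optional["Group"] = None
    tag: Optional[int] = None  # None means no tag was set on this group


def effective_tag(group: Group) -> Optional[int]:
    """Return the tag of the nearest group that has one set.

    Checks the group itself first, then walks up through its ancestors,
    so a tag set lower in the hierarchy overrides one set higher up.
    """
    node = group
    while node is not None:
        if node.tag is not None:
            return node.tag
        node = node.parent
    return None


# The hierarchy from the patch description, with made-up tag values.
root = Group("root")
a = Group("A", parent=root, tag=1)   # tagged by the root daemon
b = Group("B", parent=root, tag=2)   # tagged by the root daemon
c = Group("C", parent=a, tag=10)     # tagged inside the container
d = Group("D", parent=a, tag=10)     # same tag as C
e = Group("E", parent=b)             # no tag of its own, inherits from B
```

With this, C and D resolve to their own tag rather than A's, while E falls
back to B's. Whether a single tag with that inheritance rule covers the
permission split described above is exactly the question.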

All this API stuff here is a complete and utter trainwreck. Please just
delete the patches and start over. Hint: if you use stop_machine(),
you're doing it wrong.

At best you now have the requirements sorted.