Re: [RFC PATCH-cgroup 2/6] cgroup: Enable bypass mode in cgroup v2

From: Waiman Long
Date: Thu Jun 22 2017 - 16:07:29 EST


On 06/21/2017 05:17 PM, Tejun Heo wrote:
> Hello, Waiman.
>
> Let's first talk about and make sense of high level semantics.
>
> On Wed, Jun 14, 2017 at 11:05:33AM -0400, Waiman Long wrote:
>> +In the example below, '+' corresponds to an enabled controller and
>> +'#' corresponds to a bypassed controller.
>> +
>> +    +   #   #   #   +
>> +    A - B - C - D - E
>> +                 \ F
>> +                   +
>> +In this case, the effective hierarchy is:
>> +
>> +    A|B|C|D - E
>> +            \ F
> I think that this definitely has potential. While different
> controllers may see differently abbreviated versions of the tree, they
> can still be mapped to the same hierarchy and we can implement
> cross-controller operations in a meaningful way, I think; however, it
> does make some things really weird.
>
> In the above example, how would A's resources be distributed? Let's
> say the resource knob in question is memory.high. Because from memory
> controller's point of view A|B|C|D are all bunched up and have E and F
> as children, memory.high resource knobs on E and F would control how
> A's memory gets distributed, right?

That is right.

> So, once a parent skips a controller with #, you can only determine
> how its resources are actually distributed by scanning the entire
> subtree to determine the span of '#' on the controller and any sort of
> delegation - whether implicit or explicit - wouldn't be possible in
> the middle, right?

That is right, too.

> Can you please think of / explain how this would work with delegation?
> Making things clear with delegation is really helpful because it can
> serve as the canary for the usual hierarchical operations.

    +   #   +
    A - B - C
         \ D +

In terms of the delegation story, I would say that for the above
configuration, the parent A can delegate 5 units of resources to B. B,
upon finding out it has 5 units of resources, may decide to take itself
out of the picture (bypass itself) and delegate, say, 2 units to C and 3
units to D, assuming that B has no internal processes.
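
To make the collapsing rule concrete, here is a minimal Python sketch of how
an effective hierarchy could be derived from the per-cgroup '+'/'#' markers.
It is only an illustration of the semantics discussed in this thread, not
kernel code; the function name collapse() and the dict-based tree layout are
hypothetical.

```python
from collections import deque


def collapse(children, mode, node):
    """Return (label, effective_children) for the effective node at `node`.

    Bypassed ('#') descendants are folded into the label of their nearest
    enabled ('+') ancestor; enabled descendants become effective children.
    """
    members = [node]
    effective_children = []
    queue = deque(children.get(node, []))
    while queue:
        child = queue.popleft()
        if mode[child] == "#":
            # Bypassed: merge into the current effective node and keep
            # scanning below it.
            members.append(child)
            queue.extend(children.get(child, []))
        else:
            # Enabled: this starts a new effective node.
            effective_children.append(collapse(children, mode, child))
    return "|".join(members), effective_children


# The configuration above: A enabled, B bypassed, C and D enabled.
children = {"A": ["B"], "B": ["C", "D"]}
mode = {"A": "+", "B": "#", "C": "+", "D": "+"}
print(collapse(children, mode, "A"))
```

With B bypassed, C and D end up as direct effective children of A|B, which is
why the 2-unit and 3-unit settings on C and D are what divide the 5 units
that A handed to B.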

Of course, B can decide to ignore the rules, bypass itself and add
internal processes to compete with other children of A. So we can make
"cgroup.controllers" non-writable to B if we don't want any controllers
to be bypassed.

As a side note, I have another delegation story for enabling bypass mode
in subtree_control. A parent can activate a controller in bypass mode to
signal that it has delegated the authority to enable a controller to its
children. A child can then activate a controller by writing '+' to its
cgroup.controllers (not implemented yet). In this scenario, the child
owns the control knobs, not the parent. That can be useful for
controllers that deal with ID or membership like devices, freezer,
perf_event or even cpusets. You may not want to have separate IDs for
all the nodes in the hierarchy, but for those who need a different ID,
they can choose to do that. In fact, I am wondering whether it may be
useful to define a bypass_on_dfl attribute that works like implicit_on_dfl
so that we don't need to explicitly set those controllers in bypass mode.

Cheers,
Longman