Re: [RFC] cgroup TODOs

From: Tejun Heo
Date: Fri Sep 14 2012 - 17:57:02 EST


Hello, Vivek, Peter.

On Fri, Sep 14, 2012 at 11:14:47AM -0400, Vivek Goyal wrote:
> We don't have to start with 0%. We can keep a pool with dynamic % and
> launch all the virtual machines from that single pool. So nobody starts
> with 0%. If we require certain % for a machine, only then we look at
> peers and see if we have bandwidth free and create cgroup and move virtual
> machine there, otherwise we deny resources.
>
> So I think it is doable, just that it is painful and tricky, and I think
> a lot of it will be in user space.

I think the system-wide % thing is rather distracting for the
discussion at hand (and I don't think being able to specify X% of the
whole system when you're three levels down the resource hierarchy makes
sense anyway). Let's focus on tasks vs. groups.

> > > So
> > > an easier way is to stick to the model of relative weights/share and
> > > let user specify relative importance of a virtual machine and actual
> > > quota or % will vary dynamically depending on other tasks/components
> > > in the system.
> > >
> > > Thoughts?
> >
> > cpu does the relative weight, so 'users' will have to deal with it
> > anyway regardless of blk, it's effectively free of learning curve for all
> > subsequent controllers.
>
> I am inclined to keep it simple in kernel and just follow cpu model of
> relative weights and treating tasks and groups at the same level in the
> hierarchy. It makes behavior consistent across the controllers and I
> think it might just work for majority of cases.

I think we need to stick to one model for all controllers; otherwise,
it gets confusing and unified hierarchy can't work. That said, I'm
not too happy about how cpu is handling it now.

* As I wrote before, the configuration escapes cgroup proper and the
mapping from per-task value to group weight is essentially
arbitrary and may not exist depending on the resource type.

* The proportion of each group fluctuates as tasks fork and exit in
the parent group, which is confusing.

* cpu deals with tasks but blkcg deals with iocontexts and memcg,
which currently doesn't implement proportional control, deals with
address spaces (processes). The proportions wouldn't even fluctuate
the same way across different controllers.

So, I really don't think the current model used by cpu is a good one
and we should rather treat the tasks as a group competing with the
rest of the child groups. Whether we can change that at this point, I
don't know. Peter, what do you think?
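
To make the difference between the two models concrete, here is a rough
sketch of the arithmetic, not kernel code. It assumes a parent cgroup with
one child group and some number of tasks sitting directly in the parent,
all runnable. The function names are made up for illustration, and giving
the implicit task group a weight of 1024 (the weight of a nice-0 task and
the default cpu.shares) is just an assumption for the example.

```python
from fractions import Fraction

# Weight of a nice-0 task in CFS; also the default cpu.shares value.
NICE_0_WEIGHT = 1024


def share_tasks_as_peers(group_weight, n_tasks, task_weight=NICE_0_WEIGHT):
    """Current cpu model: every task in the parent competes directly
    with the child group, so each task adds its own weight to the total."""
    total = group_weight + n_tasks * task_weight
    return Fraction(group_weight, total)


def share_tasks_as_group(group_weight, n_tasks,
                         tasks_group_weight=NICE_0_WEIGHT):
    """Alternative model: the parent's own tasks are pooled into one
    implicit group (hypothetical weight) that competes with the child."""
    if n_tasks == 0:
        # No tasks in the parent, so the child group has no competitor.
        return Fraction(1)
    total = group_weight + tasks_group_weight
    return Fraction(group_weight, total)


if __name__ == "__main__":
    # Child group share as tasks fork in the parent group.
    for n in (1, 3, 7):
        peers = share_tasks_as_peers(NICE_0_WEIGHT, n)
        pooled = share_tasks_as_group(NICE_0_WEIGHT, n)
        print(f"{n} tasks: peers={peers} pooled={pooled}")
```

With tasks treated as peers, the child group's share drops from 1/2 to 1/4
to 1/8 as the parent goes from one to three to seven tasks. With the tasks
pooled into a single group, the share stays at 1/2 no matter how many
tasks fork or exit.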

Thanks.

--
tejun