Re: [RFC] cgroup TODOs

From: Tejun Heo
Date: Fri Sep 14 2012 - 15:29:34 EST


(cc'ing Lennart and Kay)

On Fri, Sep 14, 2012 at 09:58:30AM -0400, Vivek Goyal wrote:
> I am a little concerned about the above and wonder how systemd and
> libvirt will interact and behave out of the box.
> Currently systemd does not create its own hierarchy under blkio and
> libvirt does. So putting all together means there is no way to avoid
> the overhead of systemd created hierarchy.
> \
> |
> +- system
>    |
>    +- libvirtd.service
>       |
>       +- virt-machine1
>       +- virt-machine2
> So there is no way to avoid the overhead of two levels of hierarchy
> created by systemd. I really wish that systemd gets rid of "system"
> cgroup and puts services directly in top level group. Creating deeper
> hierarchies is expensive.
> I just want to mention it clearly that with above model, it will not
> be possible for libvirt to avoid hierarchy levels created by systemd.
> So solution would be to keep depth of hierarchy as low as possible and
> to keep controller overhead as low as possible.

Yes, if we do a full unified hierarchy, nesting should happen iff
resource control actually requires the nesting, so that tree depth is
kept minimal. Nesting shouldn't be used purely for organizational
purposes.
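
As a rough illustration of the "nest only where resource control needs
it" rule, here is a minimal sketch in Python. It is not what systemd or
the kernel does; the node layout, the "limits" field and the weight
values are all hypothetical, and it assumes inner nodes hold no tasks of
their own, so collapsing them loses nothing.

```python
# Hypothetical model of a cgroup tree; not a real kernel or systemd API.
# A node is a dict: {"name": str, "limits": dict, "children": [nodes]}.

def node(name, limits=None, children=None):
    return {"name": name, "limits": limits or {}, "children": children or []}

def flatten(n):
    """Return a copy of the tree with limit-less inner levels collapsed.

    The root passed in is always kept. Leaves are kept because they are
    assumed to be where the tasks live.
    """
    kept = []
    for child in n["children"]:
        flat = flatten(child)
        if flat["limits"] or not flat["children"]:
            # Carries resource limits, or is a leaf: keep the level.
            kept.append(flat)
        else:
            # Purely organizational level: hoist its children up.
            kept.extend(flat["children"])
    return {"name": n["name"], "limits": dict(n["limits"]), "children": kept}

def depth(n):
    """Number of levels in the tree, counting the root as one."""
    return 1 + max((depth(c) for c in n["children"]), default=0)

# The layout from the quoted message, with made-up blkio weights on the VMs.
tree = node("/", children=[
    node("system", children=[
        node("libvirtd.service", children=[
            node("virt-machine1", {"blkio.weight": 500}),
            node("virt-machine2", {"blkio.weight": 250}),
        ]),
    ]),
])
flat = flatten(tree)
```

With the example layout, the "system" and "libvirtd.service" levels
carry no limits and are collapsed, so the two machines end up directly
under the root.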

> Now I know that with blkio, idling kills performance. So one solution
> could be that on anything fast, don't use CFQ. Use deadline and then
> group idling overhead goes away and tools like systemd and libvirt don't
> have to worry about keeping track of disks and what scheduler is running.
> They don't want to do it and expect kernel to get it right.

I personally don't think the level of complexity we have in cfq is
something useful for the SSDs which are getting ever better. cfq is
allowed to use a lot of processing overhead and complexity because
disks are *so* slow. The balance has already changed completely with
SSDs and we should be doing something a lot simpler for them, most
likely based on iops - be it deadline or whatever.
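
To make the "cfq for rotating disks, something simpler for SSDs" idea
concrete, here is a small sketch of the selection policy. The policy and
the function names are assumptions for illustration only; nothing like
this exists in the kernel. The input format matches what
/sys/block/<dev>/queue/scheduler reports (the active scheduler in
brackets), and the rotational flag is what
/sys/block/<dev>/queue/rotational exposes.

```python
# Hypothetical policy sketch; the helper names are made up for illustration.

def parse_schedulers(text):
    """Parse a line such as 'noop deadline [cfq]'.

    Returns (available, active), where active is the bracketed entry or
    None if no entry is bracketed.
    """
    available, active = [], None
    for tok in text.split():
        if tok.startswith("[") and tok.endswith("]"):
            tok = tok[1:-1]
            active = tok
        available.append(tok)
    return available, active

def pick_scheduler(rotational, available):
    """Assumed policy: cfq on rotating disks, deadline on everything else.

    Returns None when the preferred scheduler is not available, so the
    caller can leave the current setting alone.
    """
    preferred = "cfq" if rotational else "deadline"
    return preferred if preferred in available else None

available, active = parse_schedulers("noop deadline [cfq]")
ssd_choice = pick_scheduler(False, available)
disk_choice = pick_scheduler(True, available)
```

The point of keeping the decision in one place is that tools like
systemd and libvirt would not need to track disks and schedulers
themselves.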

blkcg support is currently tied to cfq-iosched which sucks but I think
that could be the only way to achieve any kind of acceptable blkcg
support for rotating disks. I think what we should do is abstract out
the common organization part as much as possible so that we don't end
up duplicating everything for blk-throttle, cfq and, say, deadline.

