Re: [RFC] cgroup TODOs

From: Daniel P. Berrange
Date: Fri Sep 14 2012 - 08:55:14 EST


On Fri, Sep 14, 2012 at 01:15:02PM +0200, Peter Zijlstra wrote:
> On Thu, 2012-09-13 at 13:58 -0700, Tejun Heo wrote:
> > The cpu ones handle nesting correctly - parent's accounting includes
> > children's, parent's configuration affects children's unless
> > explicitly overridden, and children's limits nest inside parent's.
>
> The implementation has some issues with fixed point math limitations on
> deep hierarchies/large cpu count, but yes.
>
> Doing soft-float/bignum just isn't going to be popular I guess ;-)
>
> People also don't seem to understand that each extra cgroup carries a
> cost and that nested cgroups are more expensive still, even if the
> intermediate levels are mostly empty (libvirt is a good example of how
> not to do things).
>
> Anyway, I guess what I'm saying is that we need to work on the awareness
> of cost associated with all this cgroup nonsense, people seem to think
> it's all good and free -- or not think at all, which, while depressing,
> seems the more likely option.

In defense of what libvirt is doing, I'll point out that the kernel
docs on cgroups make little to no mention of these performance / cost
implications, and the examples of usage given arguably encourage use
of deep hierarchies.

Given what we've now learnt about the kernel's lack of scalability
wrt cgroup hierarchies, we'll be changing the way libvirt deals with
cgroups, to flatten it out to only use 1 level by default. If the
kernel docs had clearly expressed the limitations & made better
recommendations on app usage we would never have picked the approach
we originally chose.

Regards,
Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|