Re: [RFC][PATCH 00/11] blkiocg async support

From: Daniel P. Berrange
Date: Fri Jul 16 2010 - 10:54:27 EST


On Fri, Jul 16, 2010 at 10:35:36AM -0400, Vivek Goyal wrote:
> On Fri, Jul 16, 2010 at 03:15:49PM +0100, Daniel P. Berrange wrote:
> Secondly, just because some controller allows creation of hierarchy does
> not mean that hierarchy is being enforced. For example, memory controller.
> IIUC, one needs to explicitly set "use_hierarchy" to enforce hierarchy
> otherwise effectively it is flat. So if libvirt is creating groups and
> putting machines in child groups thinking that we are not interfering
> with admin's policy, that is not entirely correct.

That is true, but 'use_hierarchy' at least provides admins with
the mechanism required to implement the necessary policy.
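
To make the mechanism concrete, here is a minimal sketch of how an admin
tool might toggle it. It assumes a cgroup v1 memory controller, where each
group directory exposes a "memory.use_hierarchy" file; the helper names and
the directory passed in are hypothetical, and a real kernel may refuse the
write in some configurations.

```python
import os

# Name of the per-group control file in the cgroup v1 memory controller.
USE_HIERARCHY_FILE = "memory.use_hierarchy"


def set_use_hierarchy(cgroup_dir, enabled):
    """Write 1 or 0 to memory.use_hierarchy in the given group directory.

    cgroup_dir is a hypothetical path such as a group under the memory
    controller's mount point; it must already exist.
    """
    path = os.path.join(cgroup_dir, USE_HIERARCHY_FILE)
    with open(path, "w") as f:
        f.write("1" if enabled else "0")


def uses_hierarchy(cgroup_dir):
    """Return True if the group currently has hierarchical accounting on."""
    path = os.path.join(cgroup_dir, USE_HIERARCHY_FILE)
    with open(path) as f:
        return f.read().strip() == "1"
```

The point is only that the policy decision stays with the admin: libvirt
can create child groups, and whether they are enforced hierarchically is a
one-file setting on the parent.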

> So how do we make progress here? I really want to see blkio controller
> integrated with libvirt.
>
> About the issue of hierarchy, I can probably travel down the path of allowing
> creation of hierarchy but CFQ will treat it as flat. Though I don't like it
> because it will force me to introduce variables like "use_hierarchy" once
> real hierarchical support comes in but I guess I can live with that.
> (Anyway memory controller is already doing it.).
>
> There is another issue though and that is by default every virtual
> machine going into a group of its own. As of today, it can have
> severe performance penalties (depending on workload) if group is not
> driving enough IO (especially with group_isolation=1).
>
> I was thinking of a model where an admin moves out the bad virtual
> machines in separate group and limit their IO.

In the simple / normal case I imagine all guest VMs will be running
unrestricted I/O initially. Thus instead of creating the cgroup at time
of VM startup, we could create the cgroup only when the admin actually
sets an I/O limit. IIUC, this should maintain the one cgroup per guest
model, while avoiding the performance penalty in normal use. The caveat,
of course, is that this would require the blkio controller to have a
dedicated mount point, not shared with other controllers. I think we
might also want this kind of model for net I/O, since we probably don't
want to create TC classes + net_cls groups for every VM the moment it
starts unless the admin has actually set a net I/O limit.
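
A rough sketch of the lazy-creation idea, purely for illustration and not
actual libvirt code. It assumes a dedicated blkio mount point and uses the
controller's "blkio.weight" and "tasks" files as the stand-in for "setting
an I/O limit"; the class name, method names and the mount path passed in
are all hypothetical.

```python
import os


class LazyBlkioGroups:
    """Create a per-guest blkio cgroup only when a limit is first set."""

    def __init__(self, mount_point):
        # Assumed to be a mount carrying only the blkio controller, so
        # that creating a group here has no effect on other controllers.
        self.mount_point = mount_point

    def _group_dir(self, vm_name):
        return os.path.join(self.mount_point, vm_name)

    def has_group(self, vm_name):
        """True once a limit has been set and the group therefore exists."""
        return os.path.isdir(self._group_dir(vm_name))

    def set_weight(self, vm_name, pid, weight):
        """Create the guest's group on first use, then apply the weight."""
        group = self._group_dir(vm_name)
        if not os.path.isdir(group):
            # Nothing was created at VM startup; do it now, on demand.
            os.mkdir(group)
        with open(os.path.join(group, "blkio.weight"), "w") as f:
            f.write(str(weight))
        # Move the guest's process into its own group.
        with open(os.path.join(group, "tasks"), "w") as f:
            f.write(str(pid))
```

Starting a VM never touches this class, so an unrestricted guest stays in
the root group and avoids the per-group penalty; only a call such as
set_weight("vm1", pid, 500) gives that guest a group of its own.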

Daniel
--
|: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://deltacloud.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|