Re: [libvirt] [RFC]Add new mdev interface for QoS

From: Alex Williamson
Date: Thu Jul 27 2017 - 14:02:09 EST


On Thu, 27 Jul 2017 17:17:48 +0100
"Daniel P. Berrange" <berrange@xxxxxxxxxx> wrote:

> On Wed, Jul 26, 2017 at 10:43:43AM -0600, Alex Williamson wrote:
> > [cc +libvir-list]
> >
> > On Wed, 26 Jul 2017 21:16:59 +0800
> > "Gao, Ping A" <ping.a.gao@xxxxxxxxx> wrote:
> >
> > > vfio-mdev provides the capability to let different guests share the
> > > same physical device through mediated sharing. As a result, it brings a
> > > requirement about how to control the device sharing: we need a
> > > QoS-related interface for mdev to manage virtual device resources.
> > >
> > > E.g. in practical use, vGPUs assigned to different guests almost always
> > > have different performance requirements: some guests may need higher
> > > priority for real-time usage, while others may need a larger portion of
> > > the GPU resource to get higher 3D performance. Correspondingly, we can
> > > define some interfaces like weight/cap for overall budget control and
> > > priority for single submission control.
> > >
> > > So I suggest adding some common, vendor-agnostic attributes to the
> > > mdev core sysfs for QoS purposes.
> >
> > I think what you're asking for is just some standardization of a QoS
> > attribute_group which a vendor can optionally include within the
> > existing mdev_parent_ops.mdev_attr_groups. The mdev core will
> > transparently enable this, but it really only provides the standard,
> > all of the support code is left for the vendor. I'm fine with that,
> > but of course the trouble with any sort of standardization is arriving
> > at an agreed upon standard. Are there QoS knobs that are generic
> > across any mdev device type? Are there others that are more specific
> > to vGPU? Are there existing examples of this whose specification we
> > can steal?
> >
> > Also, mdev devices are not necessarily the exclusive users of the
> > hardware, we can have a native user such as a local X client. They're
> > not an mdev user, so we can't support them via the mdev_attr_group.
> > Does there need to be a per mdev parent QoS attribute_group standard
> > for somehow defining the QoS of all the child mdev devices, or perhaps
> > representing the remaining host QoS attributes?
> >
> > Ultimately libvirt and upper level management tools would be the
> > consumer of these control knobs, so let's immediately get libvirt
> > involved in the discussion. Thanks,
>
> My view on this from libvirt side is pretty much unchanged since the
> last time we discussed this.
>
> We would like the kernel maintainers to define standard sets of properties
> for mdevs, whether global to all mdevs, or scoped to certain classes of
> mdev (eg a class=gpu). These properties would be exported in sysfs, with
> one file per property.

Yes, I think that much of the mechanics are obvious (standardized
sysfs layout, one property per file, properties under the device node
in sysfs, etc). Are you saying that you don't want to be consulted on
which properties are exposed and how they operate and therefore won't
complain regardless of what we implement in the kernel? ;)
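
To make the "one property per file" mechanics concrete, here is a minimal
userspace sketch of how a management tool might read and write such a
group. Everything specific in it is an assumption for illustration: the
idea of a dedicated per-device directory for the group, and the use of
weight/cap/priority as file names (borrowed from the proposal above), are
hypothetical and not an agreed layout. The demo runs against a temporary
directory rather than real sysfs.

```python
import os
import tempfile


def read_properties(group_dir):
    """Return {file name: contents} for every regular file in group_dir.

    Assumes one property per file, with the value as the file's text.
    """
    props = {}
    for name in sorted(os.listdir(group_dir)):
        path = os.path.join(group_dir, name)
        if os.path.isfile(path):
            with open(path) as f:
                props[name] = f.read().strip()
    return props


def write_property(group_dir, name, value):
    """Set a single property by writing its value to its own file."""
    with open(os.path.join(group_dir, name), "w") as f:
        f.write(f"{value}\n")


if __name__ == "__main__":
    # Stand-in for a hypothetical per-mdev QoS directory in sysfs.
    with tempfile.TemporaryDirectory() as qos_dir:
        write_property(qos_dir, "weight", 4)     # hypothetical attribute
        write_property(qos_dir, "cap", 50)       # hypothetical attribute
        write_property(qos_dir, "priority", 1)   # hypothetical attribute
        print(read_properties(qos_dir))
```

A consumer written this way can discover which knobs a given device
supports simply by listing the directory, which is what makes an optional,
vendor-populated group workable.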

I'm hoping that libvirt folks have some experience managing basic
scheduling level QoS attributes and might have some input as to what
sorts of things work well vs what seems like a good idea, but falls
apart or isn't useful in practice.

> Libvirt can then explicitly map each standardized property into a suitable
> XML element, to report on which properties are available to use when creating
> an mdev. It would then allow them to be set at time of creation, and/or
> changed on the fly for existing mdevs.
>
> Specifically we would like to avoid generic passthrough of arbitrary
> vendor specific properties.

Of course, I think that's the intent here. Thanks,

Alex