Re: [RFC PATCH 15/18] cgroup: Introduce ioasids controller

From: Jean-Philippe Brucker
Date: Fri Mar 05 2021 - 03:32:11 EST


On Thu, Mar 04, 2021 at 09:46:03AM -0800, Jacob Pan wrote:
> Hi Jean-Philippe,
>
> On Thu, 4 Mar 2021 10:49:37 +0100, Jean-Philippe Brucker
> <jean-philippe@xxxxxxxxxx> wrote:
>
> > On Wed, Mar 03, 2021 at 04:02:05PM -0800, Jacob Pan wrote:
> > > Hi Jacob,
> > >
> > > On Wed, 3 Mar 2021 13:17:26 -0800, Jacob Pan
> > > <jacob.jun.pan@xxxxxxxxxxxxxxx> wrote:
> > >
> > > > Hi Tejun,
> > > >
> > > > On Wed, 3 Mar 2021 10:44:28 -0500, Tejun Heo <tj@xxxxxxxxxx> wrote:
> > > >
> > > > > On Sat, Feb 27, 2021 at 02:01:23PM -0800, Jacob Pan wrote:
> > > > > > IOASIDs are used to associate DMA requests with virtual address
> > > > > > spaces. They are a system-wide limited resource made available to
> > > > > > the userspace applications. Let it be VMs or user-space device
> > > > > > drivers.
> > > > > >
> > > > > > This RFC patch introduces a cgroup controller to address the
> > > > > > following problems:
> > > > > > 1. Some user applications exhaust all the available IOASIDs thus
> > > > > > depriving others of the same host.
> > > > > > 2. System admins need to provision VMs based on their needs for
> > > > > > IOASIDs, e.g. the number of VMs with assigned devices that perform
> > > > > > DMA requests with PASID.
> > > > >
> > > > > Please take a look at the proposed misc controller:
> > > > >
> > > > > http://lkml.kernel.org/r/20210302081705.1990283-2-vipinsh@xxxxxxxxxx
> > > > >
> > > > > Would that fit your bill?
> > > > The interface definitely can be reused. But IOASID has a different
> > > > behavior in terms of migration and ownership checking. I guess SEV key
> > > > IDs are not tied to a process whereas IOASIDs are. Perhaps this can be
> > > > solved by adding
> > > > + .can_attach = ioasids_can_attach,
> > > > + .cancel_attach = ioasids_cancel_attach,
> > > > Let me give it a try and come back.
> > > >
> > > While I am trying to fit the IOASIDs cgroup in to the misc cgroup
> > > proposal. I'd like to have a direction check on whether this idea of
> > > using cgroup for IOASID/PASID resource management is viable.
> >
> > Yes, even for host SVA it would be good to have a cgroup. Currently the
> > number of shared address spaces is naturally limited by number of
> > processes, which can be controlled with rlimit and cgroup. But on Arm the
> > hardware limit on shared address spaces is 64k (number of ASIDs), easily
> > exhausted with the default PASID and PID limits. So a cgroup for managing
> > this resource is more than welcome.
> >
> > It looks like your current implementation is very dependent on
> > IOASID_SET_TYPE_MM? I'll need to do more reading about cgroup to see how
> > easily it can be adapted to host SVA which uses IOASID_SET_TYPE_NULL.
> >
> Right, I was assuming have three use cases of IOASIDs:
> 1. host supervisor SVA (not a concern, just one init_mm to bind)
> 2. host user SVA, either one IOASID per process or perhaps some private
> IOASID for private address space
> 3. VM use for guest SVA, each IOASID is bound to a guest process
>
> My current cgroup proposal applies to #3 with IOASID_SET_TYPE_MM, which is
> allocated by the new /dev/ioasid interface.
>
> For #2, I was thinking you can limit the host process via PIDs cgroup? i.e.
> limit fork.

That works but isn't perfect, because the hardware resource of shared
address spaces can be much lower that PID limit - 16k ASIDs on Arm. To
allow an admin to fairly distribute that resource we could introduce
another cgroup just to limit the number of shared address spaces, but
limiting the number of IOASIDs does the trick.

> So the host IOASIDs are currently allocated from the system pool
> with quota of chosen by iommu_sva_init() in my patch, 0 means unlimited use
> whatever is available. https://lkml.org/lkml/2021/2/28/18

Yes that's sensible, but it would be good to plan the cgroup user
interface to work for #2 as well, even if we don't implement it right
away.

Thanks,
Jean