Re: [PATCHv11 0/3] rdmacg: IB/core: rdma controller support

From: Tejun Heo
Date: Wed Aug 24 2016 - 17:17:56 EST


(cc'ing Christoph)

On Mon, Aug 22, 2016 at 06:03:48PM +0530, Parav Pandit wrote:
> rdmacg: IB/core: rdma controller support
>
> Patch is generated and tested against below Doug's linux-rdma
> git tree.
>
> URL: git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma.git
> Branch: master
>
> Patchset is also compiled and tested against below Tejun's cgroup tree
> using cgroup v2 mode.
> URL: git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git
> Branch: master
>
> Overview:
> Currently user space applications can easily take away all the rdma
> device specific resources such as AH, CQ, QP, MR etc. Due to which other
> applications in other cgroup or kernel space ULPs may not even get chance
> to allocate any rdma resources. This results into service unavailibility.
>
> RDMA cgroup addresses this issue by allowing resource accounting,
> limit enforcement on per cgroup, per rdma device basis.
>
> RDMA uverbs layer will enforce limits on well defined RDMA verb
> resources without any HCA vendor device driver involvement.
>
> RDMA uverbs layer will not do limit enforcement of HCA hw vendor
> specific resources. Instead rdma cgroup provides set of APIs
> through which vendor specific drivers can do resource accounting
> by making use of rdma cgroup.
>
> Resource limit enforcement is hierarchical.
>
> When process is migrated with active RDMA resources, rdma cgroup
> continues to uncharge original cgroup for allocated resource. New resource
> is charged to current process's cgroup, which means if the process is
> migrated with active resources, for new resources it will be charged to
> new cgroup and old resources will be correctly uncharged from old cgroup.
>
> Changes from v10:
> * (To address comments from Tejun, Christoph)
> 1. Removed unused rpool_list_lock from rdma_cgroup structure.
> 2. Moved rdma resource definition to rdma cgroup instead of IB stack
> 3. Added prefix rdmacg to static instances
> 4. Simplified locking with single mutex for all operations
> 5. Following approach of atomically allocating object and
> charging resource in hirerchy
> 6. Code simplification due to single lock
> 7. Using for_each_set_bit API for bit operation
> 8. Renamed list heads as Objects instead of _head
> 9. Renamed list entries as _node instead of _list.
> 10. Made usage_num to 64 bit to avoid overflow and to avoid
> additional code to track non zero number of usage counts.
> * (To address comments from Doug)
> 1. Added copyright and GPLv2 license

Looks good to me. I just have a nit in the documentation. Christoph,
what do you think?

Thanks.

--
tejun