Re: [RFC 4/6] dma-buf: Add DMA-BUF exporter op to charge a DMA-BUF to a cgroup.

From: Christian König
Date: Wed Jan 19 2022 - 10:59:09 EST


Am 19.01.22 um 16:54 schrieb Daniel Vetter:
On Tue, Jan 18, 2022 at 10:54:16AM -0800, Hridya Valsaraju wrote:
On Sun, Jan 16, 2022 at 11:46 PM Christian König
<christian.koenig@xxxxxxx> wrote:
Am 15.01.22 um 02:06 schrieb Hridya Valsaraju:
The optional exporter op provides a way for processes to transfer
charge of a buffer to a different process. This is essential for the
cases where a central allocator process does allocations for various
subsystems, hands over the fd to the client who
requested the memory and drops all references to the allocated memory.

Signed-off-by: Hridya Valsaraju <hridya@xxxxxxxxxx>
---
include/linux/dma-buf.h | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)

diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index 7ab50076e7a6..d5e52f81cc6f 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -13,6 +13,7 @@
#ifndef __DMA_BUF_H__
#define __DMA_BUF_H__

+#include <linux/cgroup_gpu.h>
#include <linux/dma-buf-map.h>
#include <linux/file.h>
#include <linux/err.h>
@@ -285,6 +286,23 @@ struct dma_buf_ops {

int (*vmap)(struct dma_buf *dmabuf, struct dma_buf_map *map);
void (*vunmap)(struct dma_buf *dmabuf, struct dma_buf_map *map);
+
+ /**
+ * @charge_to_cgroup:
+ *
+ * This is called by an exporter to charge a buffer to the specified
+ * cgroup.
Well that sentence makes absolutely no sense at all.

The dma_buf_ops are supposed to be called by the DMA-buf subsystem on
behalves of the importer and never by the exporter itself.

I hope that this is just a documentation mixup.
Thank you for taking a look Christian!

Yes, that was poor wording, sorry about that. It should instead say
that the op would be called by the process the buffer is currently
charged to in order to transfer the buffer's charge to a different
cgroup. This is helpful in the case where a process acts as an
allocator for multiple client processes and we would like the
allocated buffers to be charged to the clients who requested their
allocation(instead of the allocating process as is the default
behavior). In Android, the graphics allocator HAL process[1] does
most of the graphics allocations on behalf of various clients. After
allocation, the HAL process passes the fd to the client over binder
IPC and the binder driver invokes the charge_to_cgroup() DMA-BUF op to
uncharge the buffer from the HAL process and charge it to the client
process instead.

[1]: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fsource.android.com%2Fdevices%2Fgraphics%2Farch-bq-gralloc&amp;data=04%7C01%7Cchristian.koenig%40amd.com%7C838d25da974d4ea4257508d9db63eb70%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637782044488604857%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&amp;sdata=Qn7JeyF5Rq9tnrGw1KgNuQkpu5RbcrvPhDOa1OBJ6TU%3D&amp;reserved=0
For that use-case, do we really need to have the vfunc abstraction and
force all exporters to do something reasonable with it?

I was about to write up a similar answer, but more from the technical side.

Why in the world should that be done on the DMA-buf object as a communication function between importer and exporter?

That design makes absolutely no sense at all to me.

Regards,
Christian.


I think just storing the cgrpus gpu memory bucket this is charged against
and doing this in a generic way would be a lot better.

That way we can also easily add other neat features in the future, like
e.g. ttm could take care of charge-assignement automatically maybe, or we
could print the current cgroups charge relationship in the sysfs info
file. Or anything else really.

I do feel that in general for gpu memory cgroups to be useful, we should
really have memory pools as a fairly strong concept. Otherwise every
driver/allocator/thing is going to come up with their own, and very likely
incompatible interpretation. And we end up with a supposed generic cgroups
interface which cannot actually be used in a driver/vendor agnostic way at
all.
-Daniel

Regards,
Hridya


Regards,
Christian.

The caller must hold a reference to @gpucg obtained via
+ * gpucg_get(). The DMA-BUF will be uncharged from the cgroup it is
+ * currently charged to before being charged to @gpucg. The caller must
+ * belong to the cgroup the buffer is currently charged to.
+ *
+ * This callback is optional.
+ *
+ * Returns:
+ *
+ * 0 on success or negative error code on failure.
+ */
+ int (*charge_to_cgroup)(struct dma_buf *dmabuf, struct gpucg *gpucg);
};

/**