Re: [PATCH v4] dma-buf: Add DmaBufTotal counter in meminfo

From: Christian König
Date: Tue Apr 20 2021 - 03:32:32 EST


On 20.04.21 at 09:04, Michal Hocko wrote:
On Mon 19-04-21 18:37:13, Christian König wrote:
On 19.04.21 at 18:11, Michal Hocko wrote:
[...]
The question is not whether it is NUMA aware but whether it is useful to
know per-numa data for the purpose the counter is supposed to serve.
No, not at all. The pages of a single DMA-buf could even be from different
NUMA nodes if the exporting driver decides that this is somehow useful.
As the use of the counter hasn't been explained yet I can only
speculate. One thing that I can imagine to be useful is to fill gaps in
our accounting. It is quite often that the memory accounted in
/proc/meminfo (or oom report) doesn't add up to the overall memory
usage. In some workloads the gap can be huge! In many cases there
are other means to find out additional memory through subsystem-specific
interfaces (e.g. networking buffers). I do assume that dma-buf is just
one of those and the counter can fill the said gap at least partially
for some workloads. That is definitely useful.
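To make the accounting gap described above concrete, here is a rough sketch that subtracts a set of counters from MemTotal and reports the remainder. The choice of counters in ACCOUNTED_FIELDS is an assumption made for illustration and is not a complete accounting; the helper names are hypothetical. DmaBufTotal is the field name proposed in this patch, so it only appears on a kernel carrying the patch and is passed in as an optional extra field.

```python
# Sketch only: estimate memory that a chosen set of /proc/meminfo counters
# does not explain. ACCOUNTED_FIELDS is an illustrative assumption.
ACCOUNTED_FIELDS = (
    "MemFree", "Buffers", "Cached", "SwapCached",
    "AnonPages", "Slab", "KernelStack", "PageTables",
)


def parse_meminfo(text):
    """Return {field: integer value} from /proc/meminfo-style text."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Keep only "Name: <number> [unit]" lines.
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def unaccounted_kb(values, extra_fields=()):
    """MemTotal minus the selected counters; missing counters count as 0."""
    fields = ACCOUNTED_FIELDS + tuple(extra_fields)
    accounted = sum(values.get(field, 0) for field in fields)
    return values["MemTotal"] - accounted


def read_proc_meminfo(path="/proc/meminfo"):
    """Read and parse the live file (Linux only)."""
    with open(path) as f:
        return parse_meminfo(f.read())
```

Calling unaccounted_kb(info) and then unaccounted_kb(info, extra_fields=("DmaBufTotal",)) on the same snapshot would show how much of the gap such a counter closes, assuming the field is present.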

Yes, completely agree. I'm just not 100% sure if the DMA-buf framework should account for that or the individual drivers exporting DMA-bufs.

See below for a further explanation.

What I am trying to bring up with the NUMA side is that the same problem can
happen on a per-node basis. Let's say that some user consumes an unexpectedly
large amount of dma-buf on a certain node. This can lead to an observable
performance impact on anybody allocating from that node and even
worse cause an OOM for node-bound consumers. How do I find out that it
was dma-buf that has caused the problem?
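As an illustration of the per-node question, the following sketch parses the per-node meminfo files that Linux exposes under /sys/devices/system/node/ and reports used memory per node. It is a minimal example with hypothetical helper names. It can show that a node is under pressure, but not which subsystem consumed the memory, which is exactly the missing piece being discussed.

```python
import glob
import re

# Lines look like: "Node 0 MemTotal:       32768000 kB"
NODE_LINE = re.compile(r"Node\s+(\d+)\s+([^:\s]+):\s+(\d+)")


def parse_node_meminfo(text):
    """Return {node_id: {field: integer value}} from per-node meminfo text."""
    nodes = {}
    for line in text.splitlines():
        match = NODE_LINE.match(line.strip())
        if match:
            node = int(match.group(1))
            nodes.setdefault(node, {})[match.group(2)] = int(match.group(3))
    return nodes


def used_kb_per_node(nodes):
    """MemTotal - MemFree for every node that reports both fields."""
    return {
        node: fields["MemTotal"] - fields["MemFree"]
        for node, fields in nodes.items()
        if "MemTotal" in fields and "MemFree" in fields
    }


def read_all_nodes(pattern="/sys/devices/system/node/node*/meminfo"):
    """Concatenate and parse all per-node meminfo files (Linux only)."""
    text = ""
    for path in sorted(glob.glob(pattern)):
        with open(path) as f:
            text += f.read()
    return parse_node_meminfo(text)
```

A per-node dma-buf counter would have to appear as an additional field in these files for a tool like this to attribute the usage; no such field is assumed here.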

Yes, that is the direction my thinking goes as well, but even further.

See, DMA-buf is also used to share device-local memory between processes. In other words, VRAM on graphics hardware.

On my test system here I have 32GB of system memory and 16GB of VRAM. I can use DMA-buf to allocate that 16GB of VRAM quite easily which then shows up under /proc/meminfo as used memory.

But that isn't really system memory at all, it's just allocated device memory.

See where I am heading?

Yeah, totally. Thanks for pointing this out.

Any suggestions on how to handle that?

Regards,
Christian.