Re: [PATCH v4 0/4] Implement dmabuf direct I/O via copy_file_range

From: Christoph Hellwig
Date: Mon Jun 09 2025 - 00:39:25 EST


On Fri, Jun 06, 2025 at 01:20:48PM +0200, Christian König wrote:
> > dmabuf acts as a driver and shouldn't be handled by VFS, so I made
> > dmabuf implement copy_file_range callbacks to support direct I/O
> > zero-copy. I'm open to both approaches. What's the preference of
> > VFS experts?
>
> That would probably be illegal. Using the sg_table in the DMA-buf
> implementation turned out to be a mistake.

Two things here that should not be directly conflated. Using the
sg_table was a huge mistake, and we should try to switch dmabuf
over to a pure dma_addr_t/len array now that the new DMA API
supporting that has been merged. Is there any chance the dma-buf
maintainers could start to kick this off? I'm of course happy to
assist.

But that notwithstanding, dma-buf is THE buffer sharing mechanism in
the kernel, and we should promote it instead of reinventing it badly.
And there is a use case for having a fully DMA mapped buffer in the
block layer and I/O path, especially on systems with an IOMMU.
So having an iov_iter backed by a dma-buf would be extremely helpful.
That's mostly lib/iov_iter.c code, not VFS, though.

> The question Christoph raised was rather why is your CPU so slow
> that walking the page tables has a significant overhead compared to
> the actual I/O?

Yes, that's really puzzling and should be addressed first.