Re: [RFC 10/20] iommu/iommufd: Add IOMMU_DEVICE_GET_INFO

From: hch@xxxxxx
Date: Thu Sep 30 2021 - 23:28:23 EST


On Thu, Sep 30, 2021 at 07:04:46PM -0300, Jason Gunthorpe wrote:
> > On Arm cache coherency is configured through PTE attributes. I don't think
> > PCI No_snoop should be used because it's not necessarily supported
> > throughout the system and, as far as I understand, software can't discover
> > whether it is.
>
> The usage of no-snoop is a behavior of a device. A generic PCI driver
> should be able to program the device to generate no-snoop TLPs and
> ideally rely on an arch specific API in the OS to trigger the required
> cache maintenance.

Well, it is a combination of the device, the root port and the driver
which all need to be in line to use this.

> It doesn't make much sense for a portable driver to rely on a
> non-portable IO PTE flag to control coherency, since that is not a
> standards based approach.
>
> That said, Linux doesn't have a generic DMA API to support
> no-snoop. The few GPU drivers that use this stuff just hardwired
> wbsync on Intel..

Yes, as usual the GPU folks come up with nasty hacks instead of
providing a generic helper. Basically all we'd need to support it
in a generic way is:

 - a DMA_ATTR_NO_SNOOP (or DMA_ATTR_FORCE_NONCOHERENT to fit the Linux
   terminology) which treats the current dma_map/unmap/sync calls as
   if dev_is_dma_coherent() was false
 - a way for the driver to discover that a given architecture / running
   system actually supports this
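
To make the two points above concrete, here is a rough userspace model of
the decision logic. It is only a sketch, not kernel code: DMA_ATTR_NO_SNOOP
and its bit value are hypothetical (no such attribute exists in the DMA API
today), and the model_* structs and functions are made-up stand-ins for
struct device, dev_is_dma_coherent() and whatever discovery interface we
would end up with.

```c
#include <stdbool.h>

/* Hypothetical attribute bit; the value is arbitrary and chosen only for
 * this model. It is not part of the real DMA API. */
#define DMA_ATTR_NO_SNOOP	(1UL << 20)

/* Toy stand-in for struct device: only the coherency bit matters here. */
struct model_device {
	bool dma_coherent;
};

/* Toy stand-in for what the architecture / running system can do. */
struct model_platform {
	/* Assumed to be true only if the root port and interconnect honour
	 * no-snoop and the arch can perform the cache maintenance. */
	bool no_snoop_supported;
};

/* Plays the role of dev_is_dma_coherent() in this model. */
static inline bool model_dev_is_dma_coherent(const struct model_device *dev)
{
	return dev->dma_coherent;
}

/*
 * First point: with the attribute set, map/unmap/sync behave as if the
 * device was not coherent, i.e. they perform cache maintenance.
 */
static inline bool model_needs_cache_maintenance(const struct model_device *dev,
						 unsigned long attrs)
{
	if (attrs & DMA_ATTR_NO_SNOOP)
		return true;
	return !model_dev_is_dma_coherent(dev);
}

/*
 * Second point: the driver asks whether the system supports no-snoop
 * before programming the device to emit no-snoop TLPs.
 */
static inline unsigned long model_pick_dma_attrs(const struct model_platform *plat)
{
	return plat->no_snoop_supported ? DMA_ATTR_NO_SNOOP : 0;
}
```

A driver would call something like model_pick_dma_attrs() once at probe
time and only enable no-snoop in the device if it gets the attribute back;
otherwise it keeps using the normal coherent path.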

> What I don't really understand is why ARM, with an IOMMU that supports
> PTE WB, has devices where dev_is_dma_coherent() == false ?

Because no IOMMU in the world can help the fact that a peripheral on the
SoC is not part of the cache coherency protocol.