Re: [PATCH RFC v1 0/2] iommu&pci: Disable ATS during FLR resets

From: Robin Murphy
Date: Tue Jun 10 2025 - 11:38:55 EST


On 2025-06-09 7:45 pm, Nicolin Chen wrote:
Hi all,

Per PCIe r6.3, sec 10.3.1 IMPLEMENTATION NOTE, software should disable ATS
before initiating a Function Level Reset, and then ensure no invalidation
requests being issued to a device when its ATS capability is disabled.

Not really - what it says is that software should not expect to receive invalidate completions from a function which is in the process of being reset or powered off, and if software doesn't want to be confused by that then it should take care to wait for completion or timeout of all outstanding requests, and avoid issuing new requests, before initiating such a reset or power transition.

Both pci_enable_ats() and pci_disable_ats() are called by an IOMMU driver,
but an unsolicited FLR can happen at any time in the PCI layer. This might
result in a race between them, breaking the rules given by the PCIe Spec.

Can you clarify which rules? The Implementation Note itself is just an example of a possible software policy, and explicitly not normative.

Therefore, there needs to be a sync between IOMMU and PCI subsystems, to
ensure that ATS will be disabled and never gets re-enabled until the FLR
finishes. Add a pair of new IOMMU helpers for PCI reset functions to call
before and after the reset routines. These two helpers will temporally
attach the device's RID/PASID to IOMMU_DOMAIN_BLOCKED, which should allow
its IOMMU driver to pause any DMA traffic and disable ATS feature until
the FLR is done.

FLR must inherently stop the function from issuing any kind of requests anyway (see 6.6.2), so "pausing" traffic at the IOMMU end seems like a non-issue. I guess I can see how messing with the domain attachment underneath the rest of the group manages to prevent new invalidate requests from group->domain being issued to the given function, but it's pretty horrid - leaving the mutex blocked might be just about tolerable for an FLR that's supposed to take no longer than 100ms, but what if we do want to do this for suspend/resume as well?

Thanks,
Robin.


This is on Github:
https://github.com/nicolinc/iommufd/commits/iommu_dev_reset-rfcv1

Thanks
Nicolin

Nicolin Chen (2):
iommu: Introduce iommu_dev_reset_prepare() and iommu_dev_reset_done()
pci: Suspend ATS before doing FLR

include/linux/iommu.h | 12 +++++
drivers/iommu/iommu.c | 106 ++++++++++++++++++++++++++++++++++++++++++
drivers/pci/pci.c | 42 +++++++++++++++--
3 files changed, 156 insertions(+), 4 deletions(-)