Re: [PATCH 0/2] Introduce PCI_FIXUP_IOMMU

From: Zhangfei Gao
Date: Thu Jun 18 2020 - 22:27:38 EST


Hi, Bjorn

On 2020/6/16 äå7:52, Bjorn Helgaas wrote:
On Sat, Jun 13, 2020 at 10:30:56PM +0800, Zhangfei Gao wrote:
On 2020/6/11 äå9:44, Bjorn Helgaas wrote:
+++ b/drivers/iommu/iommu.c
@@ -2418,6 +2418,10 @@ int iommu_fwspec_init(struct device *dev, struct
fwnode_handle *iommu_fwnode,
fwspec->iommu_fwnode = iommu_fwnode;
fwspec->ops = ops;
dev_iommu_fwspec_set(dev, fwspec);
+
+ if (dev_is_pci(dev))
+ pci_fixup_device(pci_fixup_final, to_pci_dev(dev));
+

Then pci_fixup_final will be called twice, the first in pci_bus_add_device.
Here in iommu_fwspec_init is the second time, specifically for iommu_fwspec.
Will send this when 5.8-rc1 is open.
Wait, this whole fixup approach seems wrong to me. No matter how you
do the fixup, it's still a fixup, which means it requires ongoing
maintenance. Surely we don't want to have to add the Vendor/Device ID
for every new AMBA device that comes along, do we?

Here the fake pci device has standard PCI cfg space, but physical
implementation is base on AMBA
They can provide pasid feature.
However,
1, does not support tlp since they are not real pci devices.
2. does not support pri, instead support stall (provided by smmu)
And stall is not a pci feature, so it is not described in struct pci_dev,
but in struct iommu_fwspec.
So we use this fixup to tell pci system that the devices can support stall,
and hereby support pasid.
This did not answer my question. Are you proposing that we update a
quirk every time a new AMBA device is released? I don't think that
would be a good model.
Yes, you are right, but we do not have any better idea yet.
Currently we have three fake pci devices, which support stall and pasid.
We have to let pci system know the device can support pasid, because of
stall feature, though not support pri.
Do you have any other ideas?
It sounds like the best way would be to allocate a PCI capability for it, so
detection can be done through config space, at least in future devices,
or possibly after a firmware update if the config space in your system
is controlled by firmware somewhere. Once there is a proper mechanism
to do this, using fixups to detect the early devices that don't use that
should be uncontroversial. I have no idea what the process or timeline
is to add new capabilities into the PCIe specification, or if this one
would be acceptable to the PCI SIG at all.
That sounds like a possibility. The spec already defines a
Vendor-Specific Extended Capability (PCIe r5.0, sec 7.9.5) that might
be a candidate.
Will investigate this, thanks Bjorn
FWIW, there's also a Vendor-Specific Capability that can appear in the
first 256 bytes of config space (the Vendor-Specific Extended
Capability must appear in the "Extended Configuration Space" from
0x100-0xfff).
Unfortunately our silicon does not have either Vendor-SpecificÂCapability or
Vendor-SpecificÂExtended Capability.

Studied commit 8531e283bee66050734fb0e89d53e85fd5ce24a4
Looks this method requires adding member (like can_stall) to struct pci_dev,
looks difficult.
The problem is that we don't want to add device IDs every time a new
chip comes out. Adding one or two device IDs for silicon that's
already released is not a problem as long as you have a strategy for
*future* devices so they don't require a quirk.

If detection cannot be done through PCI config space, the next best
alternative is to pass auxiliary data through firmware. On DT based
machines, you can list non-hotpluggable PCIe devices and add custom
properties that could be read during device enumeration. I assume
ACPI has something similar, but I have not done that.
Yes, thanks Arnd
ACPI has _DSM (ACPI v6.3, sec 9.1.1), which might be a candidate. I
like this better than a PCI capability because the property you need
to expose is not a PCI property.
_DSM may not workable, since it is working in runtime.
We need stall information in init stage, neither too early (after allocation
of iommu_fwspec)
nor too late (before arm_smmu_add_device ).
I'm not aware of a restriction on when _DSM can be evaluated. I'm
looking at ACPI v6.3, sec 9.1.1. Are you seeing something different?
DSM method seems requires vendor specific guid, and code would be vendor
specific.
_DSM indeed requires a vendor-specific UUID, precisely *because*
vendors are free to define their own functionality without requiring
changes to the ACPI spec. From the spec (ACPI v6.3, sec 9.1.1):

New UUIDs may also be created by OEMs and IHVs for custom devices
and other interface or device governing bodies (e.g. the PCI SIG),
as long as the UUID is different from other published UUIDs.
Have studied _DSM method, two issues we met comparing using quirk.

1. Need change definition of either pci_host_bridge or pci_dev, like adding member can_stall,
while pci system does not know stall now.

a, pci devices do not have uuid: uuid need be described in dsdt, while pci devices are not defined in dsdt.
ÂÂÂ so we have to use host bridge.

b, Parsing dsdt is in in pci subsystem.
Like drivers/acpi/pci_root.c:
ÂÂÂÂÂÂ obj = acpi_evaluate_dsm(ACPI_HANDLE(bus->bridge), &pci_acpi_dsm_guid, 1,
ÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂÂ IGNORE_PCI_BOOT_CONFIG_DSM, NULL);

After parsing DSM in pci, we need record this info.
Currently, can_stall info is recorded in iommu_fwspec,
which is allocated in iommu_fwspec_init and called by iort_iommu_configure for uefi.

2. Guest kernel also need support sva.
Using quirk, the guest can boot with sva enabled, since quirk is self-contained by kernel.
If using _DSM, a specific uefi or dtb has to be provided,
currently we can useQEMU_EFI.fd from apt install qemu-efi


Thanks