Re: [PATCH v4 04/13] PCI: portdrv: Suppress kernel DMA ownership auto-claiming

From: Lu Baolu
Date: Thu Dec 30 2021 - 00:50:39 EST


Hi Bjorn,

On 12/30/21 5:16 AM, Bjorn Helgaas wrote:
On Fri, Dec 17, 2021 at 02:36:59PM +0800, Lu Baolu wrote:
IOMMU grouping on PCI necessitates that if we lack isolation on a bridge
then all of the downstream devices will be part of the same IOMMU group
as the bridge. The existing vfio framework allows the portdrv driver to
be bound to the bridge while its downstream devices are assigned to user
space. The pci_dma_configure() marks the iommu_group as containing only
devices with kernel drivers that manage DMA. Avoid this default behavior
for the portdrv driver in order for compatibility with the current vfio
policy.

A word about the isolation would be useful. I think you're referring
to some specific ACS controls, probably P2P Request Redirect?

I guess this is just a wording issue, but I think it's actually the
*lack* of some ACS controls that forces us to put several devices in
the same IOMMU group, isn't it? It's not that we start with "IOMMU
grouping" and that necessitates something else.

Maybe something like this?

If a switch lacks ACS P2P Request Redirect (and possibly other
controls?), a device below the switch can bypass the IOMMU and DMA
directly to other devices below the switch, so all the downstream
devices must be in the same IOMMU group as the switch itself.

Yes. That's what it means from the perspective of PCI/PCIe. I will use
this in the next version. Thanks!
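
To make the grouping rule above concrete, here is a toy model in plain C. It is only a sketch and not kernel code: struct toy_dev, toy_group_root() and toy_same_group() are hypothetical names invented for illustration, and the single isolates_p2p flag stands in for "ACS P2P Request Redirect (and possibly other controls)". It ignores multifunction devices and DMA aliases.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model of a PCI hierarchy; not a kernel structure. */
struct toy_dev {
	const struct toy_dev *upstream;	/* parent bridge/switch, NULL at the top */
	bool isolates_p2p;		/* stands in for ACS P2P Request Redirect */
};

/*
 * Walk upstream while the parent cannot isolate peer-to-peer traffic.
 * The device where the walk stops represents the whole group.
 */
const struct toy_dev *toy_group_root(const struct toy_dev *dev)
{
	assert(dev != NULL);

	while (dev->upstream && !dev->upstream->isolates_p2p)
		dev = dev->upstream;

	return dev;
}

/* Two devices share a group if their walks end at the same device. */
bool toy_same_group(const struct toy_dev *a, const struct toy_dev *b)
{
	return toy_group_root(a) == toy_group_root(b);
}
```

With a switch that lacks isolation, both endpoints and the switch itself resolve to the same root, which is the situation the portdrv patch has to cope with.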


The commit 5f096b14d421b ("vfio: Whitelist PCI bridges") extended above
policy to all kernel drivers of bridge class. This is not always safe.
For example, The shpchp_core driver relies on the PCI MMIO access for the
controller functionality. With its downstream devices assigned to the
userspace, the MMIO might be changed through user initiated P2P accesses
without any notification. This might break the kernel driver integrity
and lead to some unpredictable consequences.

For any bridge driver, in order to avoiding default kernel DMA ownership
claiming, we should consider:

1) Does the bridge driver use DMA? Calling pci_set_master() or
a dma_map_* API is a sure indicate the driver is doing DMA

2) If the bridge driver uses MMIO, is it tolerant to hostile
userspace also touching the same MMIO registers via P2P DMA
attacks?

Conservatively if the driver maps an MMIO region at all, we can say that
it fails the test.

I'm not sure what all this explanation is telling me. It says
something done by 5f096b14d421 is not always safe, but this patch
doesn't fix any of those unsafe things.

If it doesn't explain why we need this patch or how this patch works,
I don't think we need it in the commit log.

Maybe this is an explanation for why you didn't set
.suppress_auto_claim_dma_owner for shpc_driver?

You are right. This doesn't explain why this is needed and how it works.
It only explains why we don't do the same thing to other PCI port
drivers. I will move this out of the commit message. Perhaps put it
in the cover letter or some patches for vfio.
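
For reference, the two questions in the quoted text can be written as a conservative predicate. This is only an illustrative sketch: struct toy_bridge_traits and toy_may_suppress_auto_claim() are hypothetical names, and the three booleans are assumed to come from reading the driver source, not from any kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of what a bridge driver does; filled in by review. */
struct toy_bridge_traits {
	bool calls_pci_set_master;	/* question 1: enables bus mastering */
	bool uses_dma_map_api;		/* question 1: calls a dma_map_* API */
	bool maps_mmio;			/* question 2: maps an MMIO region at all */
};

/*
 * Return true only if the driver passes both checks from the quoted
 * text. Any MMIO mapping is conservatively treated as a failure.
 */
bool toy_may_suppress_auto_claim(const struct toy_bridge_traits *t)
{
	assert(t != NULL);

	if (t->calls_pci_set_master || t->uses_dma_map_api)
		return false;	/* the driver does DMA itself */

	if (t->maps_mmio)
		return false;	/* MMIO could be hit by user-initiated P2P */

	return true;
}
```

Under this reading, a driver that maps MMIO, as the quoted text says of shpchp_core, fails the test, while a driver with none of the three traits passes.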


Minor typos above:
s/in order to avoiding default/before avoiding default/
s/relies on the PCI MMIO access/relies on PCI MMIO access/
s/For example, The/For example, the/
s/is a sure indicate the/is a sure indication the/

Thank you! I will correct these.


Suggested-by: Jason Gunthorpe <jgg@xxxxxxxxxx>
Suggested-by: Kevin Tian <kevin.tian@xxxxxxxxx>
Signed-off-by: Lu Baolu <baolu.lu@xxxxxxxxxxxxxxx>
---
drivers/pci/pcie/portdrv_pci.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
index 35eca6277a96..c48a8734f9c4 100644
--- a/drivers/pci/pcie/portdrv_pci.c
+++ b/drivers/pci/pcie/portdrv_pci.c
@@ -202,7 +202,10 @@ static struct pci_driver pcie_portdriver = {
 	.err_handler	= &pcie_portdrv_err_handler,
-	.driver.pm	= PCIE_PORTDRV_PM_OPS,
+	.driver		= {
+		.pm = PCIE_PORTDRV_PM_OPS,
+		.suppress_auto_claim_dma_owner = true,
+	},
 };
 static int __init dmi_pcie_pme_disable_msi(const struct dmi_system_id *d)
--
2.25.1
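
A rough user-space sketch of the effect the flag is meant to have, assuming the bus-level configure step simply skips the ownership claim when the flag is set. Only the field name suppress_auto_claim_dma_owner comes from the patch; struct toy_driver, struct toy_group and toy_dma_configure() are hypothetical and do not mirror the real driver core.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for a driver and an IOMMU group. */
struct toy_driver {
	bool suppress_auto_claim_dma_owner;	/* field name taken from the patch */
};

struct toy_group {
	int kernel_dma_owners;	/* drivers that claimed kernel DMA ownership */
};

/*
 * Claim kernel DMA ownership of the group on behalf of the driver,
 * unless the driver opted out. Returns true if a claim was made.
 */
bool toy_dma_configure(struct toy_group *group, const struct toy_driver *drv)
{
	assert(group != NULL && drv != NULL);

	if (drv->suppress_auto_claim_dma_owner)
		return false;	/* group is not marked as kernel-owned */

	group->kernel_dma_owners++;
	return true;
}
```

In this model a driver with the flag set leaves the group unclaimed, so binding it to the bridge does not by itself prevent the downstream devices from being assigned to user space.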


Best regards,
baolu