Re: [PATCH] PCI/ACPI: do not reference a pci device after it has been released

From: Rafael J. Wysocki
Date: Fri Sep 09 2022 - 17:18:58 EST


On Friday, September 9, 2022 9:42:53 AM CEST Greg Kroah-Hartman wrote:
> On Mon, Jun 27, 2022 at 06:37:06PM +0200, Rafael J. Wysocki wrote:
> > On Mon, Jun 27, 2022 at 5:07 PM Greg Kroah-Hartman
> > <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
> > >
> > > On Thu, Apr 28, 2022 at 10:30:38PM +0200, Rafael J. Wysocki wrote:
> > > > On Thu, Apr 28, 2022 at 10:15 PM Rafael J. Wysocki <rafael@xxxxxxxxxx> wrote:
> > > > >
> > > > > On Thu, Apr 28, 2022 at 6:22 PM Greg Kroah-Hartman
> > > > > <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
> > > > > >
> > > > > > On Thu, Apr 28, 2022 at 10:58:58AM -0500, Bjorn Helgaas wrote:
> > > > > > > On Thu, Apr 28, 2022 at 04:28:53PM +0200, Greg Kroah-Hartman wrote:
> > > > > > > > In acpi_get_pci_dev(), the debugging message for when a PCI bridge is
> > > > > > > > not found uses a pointer to a pci device whose reference has just been
> > > > > > > > > dropped. It is almost impossible for this to really be a device that has
> > > > > > > > > now been removed from the system, but to be safe,
> > > > > > > > let's print out the debugging message based on the acpi root device
> > > > > > > > which we do have a valid reference to at the moment.
> > > > > > >
> > > > > > > This code was added by 497fb54f578e ("ACPI / PCI: Fix NULL pointer
> > > > > > > dereference in acpi_get_pci_dev() (rev. 2)"). Not sure if it's worth
> > > > > > > a Fixes: tag.
> > > > > >
> > > > > > Can't hurt, I'll add it for the v2 based on this review.
> > > > > >
> > > > > > >
> > > > > > > acpi_get_pci_dev() is used by only five callers, three of which are
> > > > > > > video/backlight related. I'm always skeptical of one-off interfaces
> > > > > > > like this, but I don't know enough to propose any refactoring or other
> > > > > > > alternatives.
> > > > > > >
> > > > > > > I'll leave this for Rafael, but if I were applying I would silently
> > > > > > > touch up the subject to match convention:
> > > > > > >
> > > > > > > PCI/ACPI: Do not reference PCI device after it has been released
> > > > > >
> > > > > > Much simpler, thanks.
> > > > > >
> > > > > > >
> > > > > > > > Cc: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>
> > > > > > > > Cc: "Rafael J. Wysocki" <rafael@xxxxxxxxxx>
> > > > > > > > Cc: Len Brown <lenb@xxxxxxxxxx>
> > > > > > > > Cc: linux-pci@xxxxxxxxxxxxxxx
> > > > > > > > Cc: linux-acpi@xxxxxxxxxxxxxxx
> > > > > > > > Reported-by: whitehat002 <hackyzh002@xxxxxxxxx>
> > > > > > > > Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
> > > > > > > > ---
> > > > > > > > drivers/acpi/pci_root.c | 3 ++-
> > > > > > > > 1 file changed, 2 insertions(+), 1 deletion(-)
> > > > > > > >
> > > > > > > > diff --git a/drivers/acpi/pci_root.c b/drivers/acpi/pci_root.c
> > > > > > > > index 6f9e75d14808..ecda378dbc09 100644
> > > > > > > > --- a/drivers/acpi/pci_root.c
> > > > > > > > +++ b/drivers/acpi/pci_root.c
> > > > > > > > @@ -303,7 +303,8 @@ struct pci_dev *acpi_get_pci_dev(acpi_handle handle)
> > > > > > > > * case pdev->subordinate will be NULL for the parent.
> > > > > > > > */
> > > > > > > > if (!pbus) {
> > > > > > > > - dev_dbg(&pdev->dev, "Not a PCI-to-PCI bridge\n");
> > > > > > > > + dev_dbg(&root->device->dev,
> > > > > > > > + "dev %d, function %d is not a PCI-to-PCI bridge\n", dev, fn);
> > > > > > >
> > > > > > > This should use "%02x.%d" to be consistent with the dev_set_name() in
> > > > > > > pci_setup_device().
> > > > > >
> > > > > > Ah, missed that, will change it and send out a new version tomorrow.
> > > > >
> > > > > > I would make the change below (modulo the gmail-induced white space
> > > > > breakage), though.
> > > >
> > > > That said ->
> > > >
> > > > > ---
> > > > > drivers/acpi/pci_root.c | 5 +++--
> > > > > 1 file changed, 3 insertions(+), 2 deletions(-)
> > > > >
> > > > > Index: linux-pm/drivers/acpi/pci_root.c
> > > > > ===================================================================
> > > > > --- linux-pm.orig/drivers/acpi/pci_root.c
> > > > > +++ linux-pm/drivers/acpi/pci_root.c
> > > > > @@ -295,8 +295,6 @@ struct pci_dev *acpi_get_pci_dev(acpi_ha
> > > > > break;
> > > > >
> > > > > pbus = pdev->subordinate;
> > > > > - pci_dev_put(pdev);
> > > > > -
> > > > > /*
> > > > > * This function may be called for a non-PCI device that has a
> > > > > * PCI parent (eg. a disk under a PCI SATA controller). In that
> > > > > @@ -304,9 +302,12 @@ struct pci_dev *acpi_get_pci_dev(acpi_ha
> > > > > */
> > > > > if (!pbus) {
> > > > > dev_dbg(&pdev->dev, "Not a PCI-to-PCI bridge\n");
> > > > > + pci_dev_put(pdev);
> > > > > pdev = NULL;
> > > > > break;
> > > > > }
> > > > > +
> > > > > + pci_dev_put(pdev);
> > > >
> > > > -> we are going to use pbus after this and it is pdev->subordinate
> > > > which cannot survive without pdev AFAICS.
> > > >
> > > > Are we not concerned about this case?
> > >
> > > Good point.
> > >
> > > whitehat002, any ideas? You found this issue but it really looks like
> > > it is not anything that can ever be hit, so how far do you want to go to
> > > unwind it?
> >
> > I have an idea, sorry for the delay here.
> >
> > I should be ready to post something tomorrow.
>
> Was this ever posted?

No, it wasn't. Sorry for the glacial pace here.

So the idea is based on the observation that the PCI device returned by the current
code in acpi_get_pci_dev() needs to be registered. If it corresponds to an ACPI
device object, the struct acpi_device representing that object must be registered
too and, moreover, it should be the ACPI companion of that PCI device. Thus it
should be sufficient to look for the PCI device in that ACPI device object's list
of physical nodes. Hence, the patch below.

I actually can't test it right now (or even compile it for that matter), but
I'll put it in order tomorrow.

---
drivers/acpi/pci_root.c | 82 +++++++++---------------------------------------
1 file changed, 16 insertions(+), 66 deletions(-)

Index: linux-pm/drivers/acpi/pci_root.c
===================================================================
--- linux-pm.orig/drivers/acpi/pci_root.c
+++ linux-pm/drivers/acpi/pci_root.c
@@ -312,76 +312,26 @@ struct acpi_handle_node {
*/
struct pci_dev *acpi_get_pci_dev(acpi_handle handle)
{
- int dev, fn;
- unsigned long long adr;
- acpi_status status;
- acpi_handle phandle;
- struct pci_bus *pbus;
- struct pci_dev *pdev = NULL;
- struct acpi_handle_node *node, *tmp;
- struct acpi_pci_root *root;
- LIST_HEAD(device_list);
-
- /*
- * Walk up the ACPI CA namespace until we reach a PCI root bridge.
- */
- phandle = handle;
- while (!acpi_is_root_bridge(phandle)) {
- node = kzalloc(sizeof(struct acpi_handle_node), GFP_KERNEL);
- if (!node)
- goto out;
-
- INIT_LIST_HEAD(&node->node);
- node->handle = phandle;
- list_add(&node->node, &device_list);
-
- status = acpi_get_parent(phandle, &phandle);
- if (ACPI_FAILURE(status))
- goto out;
- }
-
- root = acpi_pci_find_root(phandle);
- if (!root)
- goto out;
-
- pbus = root->bus;
-
- /*
- * Now, walk back down the PCI device tree until we return to our
- * original handle. Assumes that everything between the PCI root
- * bridge and the device we're looking for must be a P2P bridge.
- */
- list_for_each_entry(node, &device_list, node) {
- acpi_handle hnd = node->handle;
- status = acpi_evaluate_integer(hnd, "_ADR", NULL, &adr);
- if (ACPI_FAILURE(status))
- goto out;
- dev = (adr >> 16) & 0xffff;
- fn = adr & 0xffff;
-
- pdev = pci_get_slot(pbus, PCI_DEVFN(dev, fn));
- if (!pdev || hnd == handle)
- break;
-
- pbus = pdev->subordinate;
- pci_dev_put(pdev);
-
- /*
- * This function may be called for a non-PCI device that has a
- * PCI parent (eg. a disk under a PCI SATA controller). In that
- * case pdev->subordinate will be NULL for the parent.
- */
- if (!pbus) {
- dev_dbg(&pdev->dev, "Not a PCI-to-PCI bridge\n");
- pdev = NULL;
+ struct acpi_device *adev = acpi_fetch_acpi_dev(handle);
+ struct acpi_device_physical_node *pn;
+ struct pci_dev *pci_dev = NULL;
+
+ if (!adev)
+ return NULL;
+
+ mutex_lock(&adev->physical_node_lock);
+
+ list_for_each_entry(pn, &adev->physical_node_list, node) {
+ if (dev_is_pci(pn->dev)) {
+ get_device(pn->dev);
+ pci_dev = to_pci_dev(pn->dev);
break;
}
}
-out:
- list_for_each_entry_safe(node, tmp, &device_list, node)
- kfree(node);

- return pdev;
+ mutex_unlock(&adev->physical_node_lock);
+
+ return pci_dev;
}
EXPORT_SYMBOL_GPL(acpi_get_pci_dev);
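
As a rough illustration of the lookup described above, here is a user-space
sketch in plain C. It is not kernel code: every mock_* type and function is a
hypothetical stand-in invented for this example, and locking is left out. It
also assumes that callers drop a reference on the returned device when they
are done with it (as they had to with the pci_get_slot()-based code), so the
lookup takes one before returning.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; none of this is kernel API. */
enum mock_bus { MOCK_BUS_PLATFORM, MOCK_BUS_PCI };

struct mock_device {
	enum mock_bus bus;	/* which bus the device sits on */
	int refcount;		/* simplified reference count */
};

/* One entry in an ACPI device object's list of physical nodes. */
struct mock_physical_node {
	struct mock_device *dev;
	struct mock_physical_node *next;
};

struct mock_acpi_device {
	struct mock_physical_node *physical_nodes;
};

/* Stand-in for get_device(): take a reference on the device. */
struct mock_device *mock_get_device(struct mock_device *dev)
{
	dev->refcount++;
	return dev;
}

/* Stand-in for pci_dev_put(): drop a reference taken earlier. */
void mock_put_device(struct mock_device *dev)
{
	assert(dev->refcount > 0);
	dev->refcount--;
}

/*
 * Return the first PCI device among the physical nodes of adev, with a
 * reference held for the caller, or NULL if there is none.
 * Assumption: the real code holds a lock around the walk; omitted here.
 */
struct mock_device *mock_acpi_get_pci_dev(struct mock_acpi_device *adev)
{
	struct mock_physical_node *pn;

	if (!adev)
		return NULL;

	for (pn = adev->physical_nodes; pn; pn = pn->next) {
		if (pn->dev->bus == MOCK_BUS_PCI)
			return mock_get_device(pn->dev);
	}

	return NULL;
}
```

A caller in this model would pair each successful mock_acpi_get_pci_dev() call
with one mock_put_device() call once it no longer needs the device.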