Re: [PATCH v2 1/1] PCI: brcmstb: Fix regression regarding missing PCIe linkup

From: Jim Quinlan
Date: Sun May 29 2022 - 12:52:45 EST


On Sat, May 28, 2022 at 6:44 PM Jim Quinlan <jim2101024@xxxxxxxxx> wrote:
>
> commit 93e41f3fca3d ("PCI: brcmstb: Add control of subdevice voltage regulators")

> Signed-off-by: Jim Quinlan <jim2101024@xxxxxxxxx>
> introduced a regression on the PCIe RPi4 Compute Module. If the PCIe
> root port DT node described in [2] was missing, no linkup would be attempted,
> and subsequent accesses would cause a panic because this particular PCIe HW
> causes a CPU abort on illegal accesses (instead of returning 0xffffffff).
>
> We fix this by allowing the DT root port node to be missing, as it behaved
> before the original patchset messed things up.
>
> In addition, two small changes are made:
>
> 1. Having pci_subdev_regulators_remove_bus() call
> regulator_bulk_free() in addtion to regulator_bulk_disable().
> 2. Having brcm_pcie_add_bus() return 0 if there is an
> error in calling pci_subdev_regulators_add_bus().
> Instead, we dev_err() and turn on our refusal mode instead.
>
> It would be best if this commit were tested by someone with a Rpi CM4
> platform, as that is how the regression was found. I have only emulated
> the problem and fix on different platform.
>
> Note that a bisection identified
>
> commit 830aa6f29f07 ("PCI: brcmstb: Split brcm_pcie_setup() into two funcs")
>
> as the first failing commit. This commit is a regression, but is unrelated
> and was fixed by a subsequent commit in the original patchset.
>
> [1] https://bugzilla.kernel.org/show_bug.cgi?id=215925
> [2] Documentation/devicetree/bindings/pci/brcm,stb-pcie.yaml
>
> Fixes: 93e41f3fca3d ("PCI: brcmstb: Add control of subdevice voltage regulators")
> Fixes: 830aa6f29f07 ("PCI: brcmstb: Split brcm_pcie_setup() into two funcs")
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=215925
Thorston -- I forgot to replace the bugzilla link; I'll get it on V3. -- Jim

> Signed-off-by: Jim Quinlan <jim2101024@xxxxxxxxx>
> ---
> drivers/pci/controller/pcie-brcmstb.c | 43 +++++++++++++++++++--------
> 1 file changed, 30 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/pci/controller/pcie-brcmstb.c b/drivers/pci/controller/pcie-brcmstb.c
> index ba5c120816b2..0839325f79ab 100644
> --- a/drivers/pci/controller/pcie-brcmstb.c
> +++ b/drivers/pci/controller/pcie-brcmstb.c
> @@ -540,29 +540,42 @@ static int pci_subdev_regulators_add_bus(struct pci_bus *bus)
>
> static int brcm_pcie_add_bus(struct pci_bus *bus)
> {
> - struct device *dev = &bus->dev;
> - struct brcm_pcie *pcie = (struct brcm_pcie *) bus->sysdata;
> + struct brcm_pcie *pcie;
> int ret;
>
> - if (!dev->of_node || !bus->parent || !pci_is_root_bus(bus->parent))
> + /*
> + * Right now we only alloc/enable regulators and initiate pcie link
> + * when under the root port bus of the current domain. In the
> + * future we may want to alloc/enable regulators under any port
> + * device (e.g. a switch).
> + */
> + if (!bus->parent || !pci_is_root_bus(bus->parent))
> return 0;
>
> ret = pci_subdev_regulators_add_bus(bus);
> - if (ret)
> - return ret;
> + if (ret) {
> + dev_err(pcie->dev, "failed to alloc/enable regulators\n");
> + goto err;
> + }
>
> - /* Grab the regulators for suspend/resume */
> + /* Save the regulators for RC suspend/resume */
> + pcie = (struct brcm_pcie *) bus->sysdata;
> pcie->sr = bus->dev.driver_data;
>
> + /* Attempt PCIe link-up */
> + if (brcm_pcie_linkup(pcie) == 0)
> + return 0;
> +err:
> /*
> - * If we have failed linkup there is no point to return an error as
> - * currently it will cause a WARNING() from pci_alloc_child_bus().
> - * We return 0 and turn on the "refusal_mode" so that any further
> - * accesses to the pci_dev just get 0xffffffff
> + * If we have failed linkup or have an error when turning on
> + * regulators, there is no point to return an error value to the
> + * caller (pci_alloc_child_bus()) as it will summarily execute a
> + * WARNING(). Instead, we turn on our "refusal_mode" and return 0
> + * so that any further PCIe accesses succeed but do nothing (reads
> + * return 0xffffffff). If we do not turn on refusal mode, our
> + * unforgiving PCIe HW will signal a CPU abort.
> */
> - if (brcm_pcie_linkup(pcie) != 0)
> - pcie->refusal_mode = true;
> -
> + pcie->refusal_mode = true;
> return 0;
> }
>
> @@ -570,13 +583,17 @@ static void pci_subdev_regulators_remove_bus(struct pci_bus *bus)
> {
> struct device *dev = &bus->dev;
> struct subdev_regulators *sr = dev->driver_data;
> + struct brcm_pcie *pcie;
>
> if (!sr || !bus->parent || !pci_is_root_bus(bus->parent))
> return;
>
> if (regulator_bulk_disable(sr->num_supplies, sr->supplies))
> dev_err(dev, "failed to disable regulators for downstream device\n");
> + regulator_bulk_free(sr->num_supplies, sr->supplies);
> dev->driver_data = NULL;
> + pcie = (struct brcm_pcie *) bus->sysdata;
> + pcie->sr = NULL;
> }
>
> /* Limits operation to a specific generation (1, 2, or 3) */
> --
> 2.17.1
>