Re: Boot failure on gru-scarlet-inx with 5.9-rc2

From: Marc Zyngier
Date: Thu Sep 03 2020 - 11:59:40 EST


On 2020-09-03 15:35, Rob Herring wrote:
On Thu, Sep 3, 2020 at 3:19 AM Lorenzo Pieralisi
<lorenzo.pieralisi@xxxxxxx> wrote:

On Wed, Sep 02, 2020 at 11:47:56PM -0400, Samuel Dionne-Riel wrote:
> On Wed, 2 Sep 2020 17:01:19 +0100
> Lorenzo Pieralisi <lorenzo.pieralisi@xxxxxxx> wrote:
>
> > On Tue, Sep 01, 2020 at 02:33:56PM -0400, Samuel Dionne-Riel wrote:
> >
> > Please print a pointer as a pointer and print both bus and
> > bus->parent.
>
> Hopefully pointer as a pointer is %px. Not sure what else, if that's
> wrong please tell.
>
> ---
> @@ -79,6 +79,8 @@ static int rockchip_pcie_valid_device(struct rockchip_pcie *rockchip,
>  	 * do not read more than one device on the bus directly attached
>  	 * to RC's downstream side.
>  	 */
> +	printk("[!!] // bus (%px) bus->parent (%px)\n", bus, bus->parent);
> +	printk("[!!] bus->primary (%d) == rockchip->root_bus_nr (%d) && dev (%d) > 0\n", bus->primary, rockchip->root_bus_nr, dev);
>  	if (bus->primary == rockchip->root_bus_nr && dev > 0)
>  		return 0;
>
> --
>
> Again, two values, verified with a bit of sed and `sort -u`.
>
> [ 1.691266] [!!] // bus (ffff0000ef9ab800) bus->parent (0000000000000000)
> [ 1.691271] [!!] bus->primary (0) == rockchip->root_bus_nr (0) && dev (0) > 0
>
> and
>
> [ 1.697156] [!!] // bus (ffff0000ef9ac000) bus->parent (ffff0000ef9ab800)
> [ 1.697160] [!!] bus->primary (0) == rockchip->root_bus_nr (0) && dev (0) > 0
>
> First instance of each shown here. Last time I don't think it was.

Ok I think I understand what the problem is.

Can you give this patch a shot please? I think we are dereferencing
a NULL pointer (bus->parent) when bus is the root bus and dev == 0.
We can rewrite the check if this patch fixes the issue.

Indeed. I checked all the other cases of pci_is_root_bus(bus->parent)
and they should be fine because they are only reached if !root_bus.

I would restructure the check like this instead:

diff --git a/drivers/pci/controller/pcie-rockchip-host.c b/drivers/pci/controller/pcie-rockchip-host.c
index 0bb2fb3e8a0b..9b485bea8b92 100644
--- a/drivers/pci/controller/pcie-rockchip-host.c
+++ b/drivers/pci/controller/pcie-rockchip-host.c
@@ -72,14 +72,14 @@ static int rockchip_pcie_valid_device(struct rockchip_pcie *rockchip,
 				      struct pci_bus *bus, int dev)
 {
 	/* access only one slot on each root port */
-	if (pci_is_root_bus(bus) && dev > 0)
-		return 0;
-
-	/*
-	 * do not read more than one device on the bus directly attached
-	 * to RC's downstream side.
-	 */
-	if (pci_is_root_bus(bus->parent) && dev > 0)
+	if (pci_is_root_bus(bus))
+		if (dev > 0)
+			return 0;
+	else if (pci_is_root_bus(bus->parent) && dev > 0)

Careful here, this else is relative to the *closest* if,
and not what the indentation suggests...

+		/*
+		 * do not read more than one device on the bus directly attached
+		 * to RC's downstream side.
+		 */
 		return 0;

 	return 1;
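
To make the binding issue concrete: without braces, the else above pairs
with "if (dev > 0)", so a root bus with dev == 0 would still pass
bus->parent (NULL) to the root-bus check. Explicit braces avoid that.
What follows is only a minimal, self-contained sketch, not the driver
code: struct fake_bus, is_root_bus(), valid_device() and run_checks() are
hypothetical stand-ins, and it assumes a root bus is simply one whose
parent pointer is NULL, as the printk output earlier in the thread
suggests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pci_bus: only the parent link matters. */
struct fake_bus {
	struct fake_bus *parent;
};

/* Assumption: a root bus is a bus with no parent. Dereferences bus. */
static bool is_root_bus(const struct fake_bus *bus)
{
	return bus->parent == NULL;
}

/* Returns 1 if config accesses to (bus, dev) should be allowed, else 0. */
static int valid_device(const struct fake_bus *bus, int dev)
{
	if (is_root_bus(bus)) {
		/* access only one slot on each root port */
		if (dev > 0)
			return 0;
	} else if (is_root_bus(bus->parent) && dev > 0) {
		/*
		 * do not read more than one device on the bus directly
		 * attached to the root port's downstream side; this branch
		 * is only reached when bus->parent is non-NULL
		 */
		return 0;
	}

	return 1;
}

/* Small self-check over a root bus, its child, and a grandchild. */
static void run_checks(void)
{
	struct fake_bus root = { .parent = NULL };
	struct fake_bus child = { .parent = &root };
	struct fake_bus grandchild = { .parent = &child };

	assert(valid_device(&root, 0) == 1);
	assert(valid_device(&root, 1) == 0);
	assert(valid_device(&child, 0) == 1);
	assert(valid_device(&child, 1) == 0);
	assert(valid_device(&grandchild, 1) == 1);
}
```

Calling run_checks() from a test harness exercises the root-bus,
dev == 0 case without ever touching a NULL parent.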


M.
--
Jazz is not dead. It just smells funny...