Re: [PATCH v2 3/6] PCI: brcmstb: Add "refusal mode" to preclude PCIe-induced CPU aborts
From: Bjorn Helgaas
Date: Thu Jul 21 2022 - 11:48:21 EST
On Thu, Jul 21, 2022 at 10:53:54AM -0400, Jim Quinlan wrote:
> On Wed, Jul 20, 2022 at 6:06 PM Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
> > On Sat, Jul 16, 2022 at 06:24:50PM -0400, Jim Quinlan wrote:
> > > Our PCIe RC HW has an atypical behavior: if it does not have PCIe link
> > > established between itself and downstream, any subsequent config space
> > > access causes a CPU abort. This commit sets a "refusal mode" if the PCIe
> > > link-up fails, and this has our pci_ops map_bus function returning a NULL
> > > address, which in turn precludes the access from happening.
> > >
> > > Right now, "refusal mode" is window dressing. It will become relevant
> > > in a future commit when brcm_pcie_start_link() is invoked during
> > > enumeration instead of before it.
> > >
> > > Signed-off-by: Jim Quinlan <jim2101024@xxxxxxxxx>
> > > ---
> > > drivers/pci/controller/pcie-brcmstb.c | 24 ++++++++++++++++++++++++
> > > 1 file changed, 24 insertions(+)
> > >
> > > diff --git a/drivers/pci/controller/pcie-brcmstb.c b/drivers/pci/controller/pcie-brcmstb.c
> > > index c026446d5830..72219a4f3964 100644
> > > --- a/drivers/pci/controller/pcie-brcmstb.c
> > > +++ b/drivers/pci/controller/pcie-brcmstb.c
> > > @@ -255,6 +255,7 @@ struct brcm_pcie {
> > > u32 hw_rev;
> > > void (*perst_set)(struct brcm_pcie *pcie, u32 val);
> > > void (*bridge_sw_init_set)(struct brcm_pcie *pcie, u32 val);
> > > + bool refusal_mode;
> > > };
> > >
> > > static inline bool is_bmips(const struct brcm_pcie *pcie)
> > > @@ -687,6 +688,19 @@ static void __iomem *brcm_pcie_map_conf(struct pci_bus *bus, unsigned int devfn,
> > > if (pci_is_root_bus(bus))
> > > return PCI_SLOT(devfn) ? NULL : base + where;
> > >
> > > + if (pcie->refusal_mode) {
> > > + /*
> > > + * At this point we do not have PCIe link-up. If there is
> > > + * a config read or write access besides those targeting
> > > + * the host bridge, our PCIe HW throws a CPU abort. To
> > > + * prevent this we return the NULL address. The calling
> > > + * functions -- pci_generic_config_*() -- will notice this
> > > + * and not perform the access, and if it is a read access,
> > > + * 0xffffffff is returned.
> > > + */
> > > + return NULL;
> > > + }
> >
> > Is this any different from all the other .map_bus() implementations
> > that return NULL when the link is down?
>
> Not really, but long ago I submitted code that gated the config space
> access based on link status and was advised not to do it [1].
> I'll be happy to make it look like the others.
>
> [1] https://lore.kernel.org/linux-pci/20171215201434.GY30595@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/

My point there was that if you can deal with the abort cleanly, that's
the best approach. Apparently brcmstb can't recover cleanly, so you
have to settle for the 99% solution.

The refusal_mode approach has the same race as checking
*_pcie_link_up(), since the link may go down between the time
brcm_pcie_start_link() sees that it is up and the time somebody does a
config access:

  brcm_pcie_start_link
    pcie->refusal_mode = false
  <link goes down>
  brcm_pcie_map_conf
    if (pcie->refusal_mode)    # still false
    <config access causes abort>

So there's no advantage in making the code look different. Checking
for link-up in the config access path can never completely remove the
window, but it does make it smaller than using refusal_mode.

Bjorn
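
A minimal userspace sketch of the two approaches compared above, assuming
nothing about the real driver: struct fake_pcie, fake_start_link(),
map_conf_cached(), map_conf_checked() and fake_config_read() are all
hypothetical names invented for illustration, and the "abort" is only a
counter. It models a flag cached at link start versus a link check made
on every config access.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for controller state; not the real struct brcm_pcie. */
struct fake_pcie {
	bool hw_link_up;	/* simulated hardware link status */
	bool refusal_mode;	/* flag cached when the link was started */
	uint32_t cfg_window;	/* simulated downstream config window */
	int aborts;		/* accesses that would have aborted */
};

/* Sample the link once and cache the result, like the refusal_mode flag. */
static void fake_start_link(struct fake_pcie *p)
{
	p->refusal_mode = !p->hw_link_up;
}

/* Variant A: trust the flag cached at link start. */
static uint32_t *map_conf_cached(struct fake_pcie *p)
{
	return p->refusal_mode ? NULL : &p->cfg_window;
}

/* Variant B: re-check the link status on every config access. */
static uint32_t *map_conf_checked(struct fake_pcie *p)
{
	return p->hw_link_up ? &p->cfg_window : NULL;
}

/*
 * Mimics the NULL handling described in the patch comment: a NULL
 * mapping means no access is performed and a read returns all ones.
 */
static uint32_t fake_config_read(struct fake_pcie *p,
				 uint32_t *(*map)(struct fake_pcie *))
{
	uint32_t *addr = map(p);

	if (!addr)
		return 0xffffffff;
	if (!p->hw_link_up)
		p->aborts++;	/* real hardware would raise a CPU abort here */
	return *addr;
}
```

If the link drops after fake_start_link() has run, map_conf_cached()
still hands out an address and the abort counter increments, while
map_conf_checked() returns NULL. The sketch is single-threaded, so it
cannot show the remaining window in variant B, where the link drops
between the check and the access itself.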