Re: [PATCH] pci: avoid bridge feature re-probing on hotplug

From: Michael S. Tsirkin
Date: Tue Dec 11 2018 - 14:01:19 EST


On Tue, Dec 11, 2018 at 08:18:08AM -0600, Bjorn Helgaas wrote:
> Hi Michael,
>
> Please run "git log --oneline drivers/pci/setup-bus.c" and follow
> the usual style.
>
> On Mon, Dec 10, 2018 at 09:18:40PM -0500, Michael S. Tsirkin wrote:
> > commit 1f82de10d6 ("PCI/x86: don't assume prefetchable ranges are
> > 64bit") added probing of bridge support for 64 bit memory
> > each time bridge is re-enumerated.
>
> Use conventional SHA1 reference (12-char SHA1).
>
> > Unfortunately this probing is destructive if any device behind
> > the bridge is in use at this time.
>
> Agreed, this sounds like a problem.
>
> > There's no real need to re-probe the bridge features as the
> > regiters in question never change - detect that using
> > the memory flag being set and skip the probing.
>
> s/regiters/registers/


Will address the above.

> > Avoiding repeated calls to pci_bridge_check_ranges might be even nicer
> > but would be a bigger patch and probably not appropriate for stable.
>
> Maybe so. The ideal thing might be to have a trivial patch like this
> that can be marked for stable, immediately followed by the nicer
> patch. Trivial band-aids tend to accumulate and make things harder in
> the future.

I understand, and I looked at it briefly, but it's not a simple
change: the probing takes detours through ACPI etc.

I plan to look at it some more, but should we ship another Linux
release with this bug?

> I'd have to take a much harder look at the problem to understand
> 1f82de10d6b1. The comment about "double check" seems misleading -- as
> you say, the hardware doesn't change and checking once should be
> enough. And if we're calling pci_bridge_check_ranges() more than
> necessary, that sounds like a problem, too.

So that will kind of make it a non-issue. Should we still worry?

> > Reported-by: xuyandong <xuyandong2@xxxxxxxxxx>
> > Cc: stable@xxxxxxxxxxxxxxx
> > Cc: Yinghai Lu <yinghai@xxxxxxxxxx>
> > Cc: Jesse Barnes <jbarnes@xxxxxxxxxxxxxxxx>
> > Signed-off-by: Michael S. Tsirkin <mst@xxxxxxxxxx>
> > ---
> >
> > This issue has been reported on upstream Linux and CentOS.
>
> Are there URLs to these reports that we could include in the changelog?

https://lists.gnu.org/archive/html/qemu-devel/2018-12/msg01711.html

and specifically

https://lists.gnu.org/archive/html/qemu-devel/2018-12/msg02082.html


> > drivers/pci/setup-bus.c | 7 +++++++
> > 1 file changed, 7 insertions(+)
> >
> > diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
> > index ed960436df5e..7ab42f76579e 100644
> > --- a/drivers/pci/setup-bus.c
> > +++ b/drivers/pci/setup-bus.c
> > @@ -741,6 +741,13 @@ static void pci_bridge_check_ranges(struct pci_bus *bus)
> > struct resource *b_res;
> >
> > b_res = &bridge->resource[PCI_BRIDGE_RESOURCES];
> > +
> > + /* Don't re-check after this was called once already:
> > + * important since bridge might be in use.
> > + */
> > + if (b_res[1].flags & IORESOURCE_MEM)
> > + return;
>
> Use conventional multi-line comment style.
>
> This test isn't 100%: devices below the bridge could be using only IO,
> or theoretically could be even using just config space.
>
> If it's safe to bail out if the bridge is in use, why isn't it safe to
> bail out *always*?
>
> > b_res[1].flags |= IORESOURCE_MEM;
> >
> > pci_read_config_word(bridge, PCI_IO_BASE, &io);
> > --
> > MST
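
For readers following the thread, the guard under discussion can be
sketched as a small user-space model. This is an illustration only, not
the kernel code: struct fake_bridge, FAKE_IORESOURCE_MEM and the fake_*
helpers are hypothetical stand-ins, and the probe is simplified to a
single write/read-back/restore of the upper 32 bits of the prefetchable
base. The model assumes that the first call sets the memory flag, so a
later call (for example on a hotplug rescan) can return before touching
a window that devices may already be using.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the kernel's IORESOURCE_MEM flag (hypothetical value). */
#define FAKE_IORESOURCE_MEM 0x1u

/* Hypothetical model of a bridge; not a kernel structure. */
struct fake_bridge {
	uint32_t pref_base_upper32; /* models the upper 32 bits of the prefetchable base */
	bool upper32_writable;      /* true if the bridge implements 64-bit prefetchable */
	unsigned int flags;         /* models b_res[1].flags */
	unsigned int probes;        /* how many times the register was probed */
};

/* Writes are silently dropped when the register is not implemented. */
static void fake_write_upper32(struct fake_bridge *b, uint32_t val)
{
	if (b->upper32_writable)
		b->pref_base_upper32 = val;
}

/*
 * Destructive probe: between the first write and the restore, the
 * window no longer holds its real value.
 */
static bool fake_probe_64bit(struct fake_bridge *b)
{
	uint32_t saved = b->pref_base_upper32;
	uint32_t readback;

	b->probes++;
	fake_write_upper32(b, 0xffffffffu);
	readback = b->pref_base_upper32;
	fake_write_upper32(b, saved); /* restore the original value */
	return readback != 0;
}

/* With guard == true, only the first call probes the bridge. */
static void fake_check_ranges(struct fake_bridge *b, bool guard)
{
	if (guard && (b->flags & FAKE_IORESOURCE_MEM))
		return;

	b->flags |= FAKE_IORESOURCE_MEM;
	(void)fake_probe_64bit(b);
}
```

Calling fake_check_ranges() twice with guard set probes once; without
the guard it probes twice. The guard works by reusing the memory flag as
an "already probed" marker; it does not detect whether any device behind
the bridge is actually in use.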