Re: [PATCH 1/2] PCI: Setup bridge resources earlier

From: Val Packett

Date: Thu Oct 09 2025 - 03:30:00 EST


On 10/7/25 12:43 PM, Ilpo Järvinen wrote:

On Mon, 6 Oct 2025, Val Packett wrote:
[..]

I think it's that early check in pci_read_bridge_bases that avoids the setup
here:

    if (pci_is_root_bus(child)) /* It's a host bus, nothing to read */
        return;

If there's a PCI device, as is the case in pci_read_bridge_windows(),
which takes a non-NULL pci_dev, the config space of that device can be read
normally (or should be readable normally, AFAIK). The case where bus->self
is NULL is different, we can't read from a non-existing PCI device, but
it doesn't apply to pci_read_bridge_windows().
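
To make the distinction above concrete, here is a minimal userspace sketch.
It is only a model: struct bus_model, struct dev_model and both helper
functions are hypothetical stand-ins, not the kernel's struct pci_bus,
struct pci_dev or pci_is_root_bus().

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pci_dev. */
struct dev_model {
	int devfn;
};

/* Hypothetical stand-in for struct pci_bus. */
struct bus_model {
	struct bus_model *parent;	/* NULL for a root (host) bus */
	struct dev_model *self;		/* bridge leading to this bus, NULL on a root bus */
};

/* Models the early-return check: a bus without a parent is a host bus. */
static bool is_root_bus_model(const struct bus_model *bus)
{
	return bus->parent == NULL;
}

/* True only if there is a bridge device whose config space could be read. */
static bool has_readable_bridge_model(const struct bus_model *bus)
{
	return !is_root_bus_model(bus) && bus->self != NULL;
}
```

In this model a root bus has nothing to read, while a child bus behind a
bridge always has a device to read the windows from.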

I don't think reading the window is the real issue here but how the
resource fitting algorithm corners itself by reserving space for bridge
windows before it knows their sizes, so basically these lines from the
earlier email:

pci 0004:00:00.0: bridge window [mem 0x7c300000-0x7c3fffff]: assigned
pci 0004:00:00.0: bridge window [mem 0x7c400000-0x7c4fffff 64bit pref]: assigned
pci 0004:00:00.0: BAR 0 [mem 0x7c500000-0x7c500fff]: assigned

...which seem to occur before the child buses have been scanned, so the
space reserved is either a hotplug reservation or due to the "old_size"
lower bound. That non-prefetchable bridge window is too small to fit the
child resources.

Could you try passing pci=hpmemsize=0M on the kernel command line to see
if that helps?

The other case is the "old_size" in calculate_memsize(), which can also
have the same effect: it prevents sizing the bridge window truly to zero when
it's not needed (== disable it == not assign it at all at that point).
Forcing it to zero would perhaps be worth a test (or removing the max()
related to old_size)
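
As a rough illustration of the lower bounding described above, here is a
simplified userspace model. It is not the kernel's calculate_memsize(); the
function names and the reduced parameter list are assumptions made for the
example. It only shows how a max() against old_size can keep a window
non-zero even when the children need nothing.

```c
#include <assert.h>
#include <stdint.h>

/* Round x up to a multiple of a power-of-two alignment. */
static uint64_t align_up(uint64_t x, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (x + align - 1) & ~(align - 1);
}

static uint64_t max_u64(uint64_t a, uint64_t b)
{
	return a > b ? a : b;
}

/*
 * Hypothetical model: the window must fit the children and any hotplug
 * reservation, and is additionally lower bounded by its previous size.
 */
static uint64_t window_size_model(uint64_t children_size,
				  uint64_t hotplug_size,
				  uint64_t old_size,
				  uint64_t align)
{
	uint64_t size = max_u64(children_size, hotplug_size);

	return align_up(max_u64(size, old_size), align);
}

/* Same model with the old_size lower bound forced to zero. */
static uint64_t window_size_no_old(uint64_t children_size,
				   uint64_t hotplug_size,
				   uint64_t align)
{
	return window_size_model(children_size, hotplug_size, 0, align);
}
```

With no child resources and no hotplug reservation, window_size_model()
still returns the old 1 MiB size, whereas window_size_no_old() returns zero,
i.e. the window could be left unassigned at that point.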

I've no idea why the old_size should decide anything, I hate that black
magic but I've just not dared to remove it (it's hard to know why some
things were made in the past, there could have been some HW issue worked
around by such odd feature but it's so old code that there isn't any real
information about whys anymore to find).

Well, you did dare to mess with resource assignment sequence, and it got
very quickly and quietly merged into linux-next causing a big regression on
hardware that's not made by your company.. so maybe it's better not to touch
anything there at all (:

pci=realloc on command line might help too, but I'm not sure. There seems
to be some extra space within the root bus resource so it might work.

I'm not sure what call chain is causing the assignment of those 3 bridge
windows. One easy way to find out where it comes from would be to put
WARN_ON(res->start == 0x7c400000); into pci_assign_resource() next to the
line which prints "...: assigned".

OK, I've uploaded the full big chungus logs (all with the WARN_ON):

https://owo.packett.cool/lin/pcifail.reverted.dmesg
https://owo.packett.cool/lin/pcifail.noarg.dmesg
https://owo.packett.cool/lin/pcifail.hpmeme.dmesg (hpmemsize didn't help)
https://owo.packett.cool/lin/pcifail.realloc.dmesg (realloc didn't help either)

So without your change, the assignment first comes from pci_rescan_bus → pci_assign_unassigned_bus_resources *via IRQ*, and then in the probe of the wifi driver.

~val