Re: [PATCH 24/25] PCI: Perform reset_resource() and build fail list in sync
From: Ilpo Järvinen
Date: Tue Jun 03 2025 - 04:14:08 EST
On Mon, 2 Jun 2025, Tudor Ambarus wrote:
> On 6/2/25 4:08 PM, Ilpo Järvinen wrote:
> >>> I think I figured out more about the reason. It's not related to that
> >>> bridge window resource.
> >>>
> >>> pbus_size_mem() will add also that ROM resource into realloc_head
> >>> as it is considered (intentionally) optional after the optional change
> >>> (as per "tudor: 2:" line). And that resource is never assigned because
>
> cut
>
> >>> pdev_sort_resources() didn't pick it up into the head list. The next
> >>> question is why the ROM resource isn't in the head list.
> >>>
> >> It seems the ROM resource is skipped at:
> >> https://web.git.kernel.org/pub/scm/linux/kernel/git/next/linux-
> >> next.git/tree/drivers/pci/setup-bus.c#n175
> >>
> >> tudor: pdev_sort_resources: ROM [??? 0x00000000 flags 0x0] resource
> >> skipped due to !(r->flags) || r->parent
> > I don't see the device in this print, hope it is for the same device.
> >
> > In any case, I don't understand what reset resource's flags in between
> > pbus_size_mem() and pdev_sort_resources(), or alternative, why type
> > checking in pbus_size_mem() matches if flags == 0 at that point.
> >
> > Those two functions should work on the same resources, if one skips
> > something, the other should too. Disparity between them can cause issues,
> > but despite reading the code multiple times, I couldn't figure out how
> > that disparity occurs (except for the !pdev_resources_assignable() case).
>
> cut
>
> > It is of interest to know why the same resource is treated differently.
> > So what were the resource flags, type* args when it's processed by
> > pbus_size_mem()? If resource's flags are zero at that point but it matches
>
> This is the full output: https://termbin.com/mn1x
> for the following prints: https://termbin.com/q57h
>
> It seems ROM resource is of type 2 at pbus_size_mem() time.
>
> > one of the types, that would be a bug.
>
> I'll give another try tomorrow. Thanks,
Those are not the same device, so not the same resource either:
[ 16.262745][ T1113] pci 0001:01:00.0: tudor: 2: pbus_size_mem: ROM [mem 0x00000000-0x0000ffff pref] list empty? 0
[ 16.267736][ T1113] pcieport 0001:00:00.0: tudor: pdev_sort_resources: ROM [??? 0x00000000 flags 0x0] resource skipped due to !(r->flags) || r->parent
0001:01:00.0
vs
0001:00:00.0
And the resources for 0001:01:00.0 were never processed by
__pci_bus_assign_resources(). But __pci_bus_assign_resources() should
recurse to subordinate busses.
And, it seems this boils down to the inconsistency I noticed earlier:
[ 16.253464][ T1113] pci 0001:01:00.0: [144d:a5a5] type 00 class 0x000000 PCIe Endpoint
include/linux/pci_ids.h:#define PCI_CLASS_NOT_DEFINED 0x0000
pdev_resources_assignable() checks if class == PCI_CLASS_NOT_DEFINED and
pdev_sort_resources() bails out without processing those resources.
So please test if this patch solves your problem:
From 00e440f505a79568cf5203dce265488cb3f66941 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ilpo=20J=C3=A4rvinen?= <ilpo.jarvinen@xxxxxxxxxxxxxxx>
Date: Tue, 3 Jun 2025 11:07:38 +0300
Subject: [PATCH 1/1] PCI: Fix pdev_resources_assignable() disparity
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
pdev_sort_resources() uses pdev_resources_assignable() helper to decide
if device's resources cannot be assigned. pbus_size_mem(), on the other
hand, does not do the same check. This could lead into a situation
where a resource ends up on realloc_head list but is not on the head
list, which is turn prevents emptying the resource from the
realloc_head list in __assign_resources_sorted().
A non-empty realloc_head is unacceptable because it triggers an
internal sanity check as show in this log with a device that has class
0 (PCI_CLASS_NOT_DEFINED):
pci 0001:01:00.0: [144d:a5a5] type 00 class 0x000000 PCIe Endpoint
pci 0001:01:00.0: BAR 0 [mem 0x00000000-0x000fffff 64bit]
pci 0001:01:00.0: ROM [mem 0x00000000-0x0000ffff pref]
pci 0001:01:00.0: enabling Extended Tags
pci 0001:01:00.0: PME# supported from D0 D3hot D3cold
pci 0001:01:00.0: 15.752 Gb/s available PCIe bandwidth, limited by 8.0 GT/s PCIe x2 link at 0001:00:00.0 (capable of 31.506 Gb/s with 16.0 GT/s PCIe x2 link)
pcieport 0001:00:00.0: bridge window [mem 0x00100000-0x001fffff] to [bus 01-ff] add_size 100000 add_align 100000
pcieport 0001:00:00.0: bridge window [mem 0x40000000-0x401fffff]: assigned
------------[ cut here ]------------
kernel BUG at drivers/pci/setup-bus.c:2532!
Internal error: Oops - BUG: 00000000f2000800 [#1] SMP
...
Call trace:
pci_assign_unassigned_bus_resources+0x110/0x114 (P)
pci_rescan_bus+0x28/0x48
Use pdev_resources_assignable() also within pbus_size_mem() to skip
processing of non-assignable resources which removes the disparity in
between what resources pdev_sort_resources() and pbus_size_mem()
consider. As non-assignable resources are no longer processed, they are
not added to the realloc_head list, thus the sanity check no longer
triggers.
This disparity problem is very old but only now became apparent after
the commit 2499f5348431 ("PCI: Rework optional resource handling") that
made the ROM resources optional when calculating bridge window sizes
which required adding the resource to the realloc_head list.
Previously, bridge windows were just sized larger than necessary.
Fixes: 2499f5348431 ("PCI: Rework optional resource handling")
Cc: stable@xxxxxxxxxxxxxxx
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@xxxxxxxxxxxxxxx>
---
drivers/pci/setup-bus.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 54d6f4fa3ce1..da084251df43 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -1187,6 +1187,7 @@ static int pbus_size_mem(struct pci_bus *bus, unsigned long mask,
resource_size_t r_size;
if (r->parent || (r->flags & IORESOURCE_PCI_FIXED) ||
+ !pdev_resources_assignable(dev) ||
((r->flags & mask) != type &&
(r->flags & mask) != type2 &&
(r->flags & mask) != type3))
base-commit: 0af2f6be1b4281385b618cb86ad946eded089ac8
--
2.39.5