Re: [PATCH 1/1] of/pci: Add IORESOURCE_MEM_64 to resource flags for 64-bit memory addresses

From: Leonardo Bras
Date: Mon Apr 19 2021 - 22:03:11 EST


On Mon, 2021-04-19 at 20:39 -0500, Rob Herring wrote:
> On Mon, Apr 19, 2021 at 7:35 PM Leonardo Bras <leobras.c@xxxxxxxxx> wrote:
> >
> > On Mon, 2021-04-19 at 10:44 -0500, Rob Herring wrote:
> > > On Fri, Apr 16, 2021 at 3:58 PM Leonardo Bras <leobras.c@xxxxxxxxx> wrote:
> > > >
> > > > Hello Rob, thanks for this feedback!
> > > >
> > > > On Thu, 2021-04-15 at 13:59 -0500, Rob Herring wrote:
> > > > > +PPC and PCI lists
> > > > >
> > > > > On Thu, Apr 15, 2021 at 1:01 PM Leonardo Bras <leobras.c@xxxxxxxxx> wrote:
> > > > > >
> > > > > > Many other resource flag parsers already add this flag when the input
> > > > > > has bits 24 & 25 set, so update this one to do the same.
> > > > >
> > > > > Many others? Looks like sparc and powerpc to me.
> > > > >
> > > >
> > > > s390 also does that, but it look like it comes from a device-tree.
> > >
> > > I'm only looking at DT based platforms, and s390 doesn't use DT.
> >
> > Correct.
> > Sorry, I somehow write above the opposite of what I was thinking.
> >
> > >
> > > > > Those would be the
> > > > > ones I worry about breaking. Sparc doesn't use of/address.c so it's
> > > > > fine. Powerpc version of the flags code was only fixed in 2019, so I
> > > > > don't think powerpc will care either.
> > > >
> > > > In powerpc I reach this function with this stack, while configuring a
> > > > virtio-net device for a qemu/KVM pseries guest:
> > > >
> > > > pci_process_bridge_OF_ranges+0xac/0x2d4
> > > > pSeries_discover_phbs+0xc4/0x158
> > > > discover_phbs+0x40/0x60
> > > > do_one_initcall+0x60/0x2d0
> > > > kernel_init_freeable+0x308/0x3a8
> > > > kernel_init+0x2c/0x168
> > > > ret_from_kernel_thread+0x5c/0x70
> > > >
> > > > For this, both MMIO32 and MMIO64 resources will have flags 0x200.
> > >
> > > Oh good, powerpc has 2 possible flags parsing functions. So in the
> > > above path, do we need to set PCI_BASE_ADDRESS_MEM_TYPE_64?
> > >
> > > Does pci_parse_of_flags() get called in your case?
> > >
> >
> > It's called in some cases, but not for the device I am debugging
> > (virtio-net pci@800000020000000).
> >
> > For the above device, here is an expanded stack trace:
> >
> > of_bus_pci_get_flags() (from parser->bus->get_flags())
> > of_pci_range_parser_one() (from macro for_each_of_pci_range)
> > pci_process_bridge_OF_ranges+0xac/0x2d4
> > pSeries_discover_phbs+0xc4/0x158
> > discover_phbs+0x40/0x60
> > do_one_initcall+0x60/0x2d0
> > kernel_init_freeable+0x308/0x3a8
> > kernel_init+0x2c/0x168
> > ret_from_kernel_thread+0x5c/0x70
> >
> > For other devices, I could also see the following stack trace:
> > ## device ethernet@8
> >
> > pci_parse_of_flags()
> > of_create_pci_dev+0x7f0/0xa40
> > __of_scan_bus+0x248/0x320
> > pcibios_scan_phb+0x370/0x3b0
> > pcibios_init+0x8c/0x12c
> > do_one_initcall+0x60/0x2d0
> > kernel_init_freeable+0x308/0x3a8
> > kernel_init+0x2c/0x168
> > ret_from_kernel_thread+0x5c/0x70
> >
> > Devices that get parsed with of_bus_pci_get_flags() appears first at
> > dmesg (around 0.015s in my test), while devices that get parsed by
> > pci_parse_of_flags() appears later (0.025s in my test).
> >
> > I am not really used to this code, but having the term "discover phbs"
> > in the first trace and the term "scan phb" in the second, makes me
> > wonder if the first trace is seen on devices that are seen/described in
> > the device-tree and the second trace is seen in devices not present in
> > the device-tree and found scanning pci bus.
>
> That was my guess as well. I think on pSeries that most PCI devices
> are in the DT whereas on Arm and other flattened DT (non OpenFirmware)
> platforms PCI devices are not in DT.
>

It makes sense to me. 

> Of course, for virtio devices,
> they would not be in DT in either case.

I don't get this part... in pseries it looks like virtio devices can be
in device-tree.

Oh, I think I get it... this pci@800000020000000 looks like a bus
(described in device-tree, so discovered), and then the devices are
inside it, getting scanned.

The virtio device gets the correct flags (from pci_parse_of_flags), but
the bus (pci@800000020000000) does not seem to get it correctly,
because it comes from of_bus_pci_get_flags() which makes sense
according to the name of the function.

(see lspci bellow, output without patch)


00:08.0 Ethernet controller: Red Hat, Inc. Virtio network device (rev
01)
Subsystem: Red Hat, Inc. Device 1100
Device tree node:
/sys/firmware/devicetree/base/pci@800000020000000/ethernet@8
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
ParErr- Stepping- SERR+ FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
<TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 19
IOMMU group: 0
Region 1: Memory at 200080020000 (32-bit, non-prefetchable)
[size=4K]
Region 4: Memory at 210000010000 (64-bit, prefetchable)
[size=16K]
Expansion ROM at 200080040000 [disabled] [size=256K]
Capabilities: [98] MSI-X: Enable+ Count=3 Masked-
Vector table: BAR=1 offset=00000000
PBA: BAR=1 offset=00000800
Capabilities: [84] Vendor Specific Information: VirtIO:
<unknown>
BAR=0 offset=00000000 size=00000000
Capabilities: [70] Vendor Specific Information: VirtIO: Notify
BAR=4 offset=00003000 size=00001000 multiplier=00000004
Capabilities: [60] Vendor Specific Information: VirtIO:
DeviceCfg
BAR=4 offset=00002000 size=00001000
Capabilities: [50] Vendor Specific Information: VirtIO: ISR
BAR=4 offset=00001000 size=00001000
Capabilities: [40] Vendor Specific Information: VirtIO:
CommonCfg
BAR=4 offset=00000000 size=00001000
Kernel driver in use: virtio-pci


>
> > > > > I noticed both sparc and powerpc set PCI_BASE_ADDRESS_MEM_TYPE_64 in
> > > > > the flags. AFAICT, that's not set anywhere outside of arch code. So
> > > > > never for riscv, arm and arm64 at least. That leads me to
> > > > > pci_std_update_resource() which is where the PCI code sets BARs and
> > > > > just copies the flags in PCI_BASE_ADDRESS_MEM_MASK ignoring
> > > > > IORESOURCE_* flags. So it seems like 64-bit is still not handled and
> > > > > neither is prefetch.
> > > > >
> > > >
> > > > I am not sure if you mean here:
> > > > a) it's ok to add IORESOURCE_MEM_64 here, because it does not affect
> > > > anything else, or
> > > > b) it should be using PCI_BASE_ADDRESS_MEM_TYPE_64
> > > > (or IORESOURCE_MEM_64 | PCI_BASE_ADDRESS_MEM_TYPE_64) instead, since
> > > > it's how it's added in powerpc/sparc, and else there is no point.
> > >
> > > I'm wondering if a) is incomplete and PCI_BASE_ADDRESS_MEM_TYPE_64
> > > also needs to be set. The question is ultimately are BARs getting set
> > > correctly for 64-bit? It looks to me like they aren't.
> >
> > I am not used to these terms, does BAR means 'Base Address Register'?
>
> Yes. Standard PCI thing.

Nice :)

>
> > If so, those are the addresses stored in pci->phb->mem_resources[i] and
> > pci->phb->mem_offset[i], printed from enable_ddw() (which takes place a
> > lot after discovering the device (0.17s in my run)).
> >
> > resource #1 pci@800000020000000: start=0x200080000000
> > end=0x2000ffffffff flags=0x200 desc=0x0 offset=0x200000000000
> > resource #2 pci@800000020000000: start=0x210000000000
> > end=0x21ffffffffff flags=0x200 desc=0x0 offset=0x0
> >
> > The message above was printed without this patch.
> > With the patch, the flags for memory resource #2 gets ORed with
> > 0x00100000.
>
> Right, as expected.
>
> > Is it enough to know if BARs are correctly set for 64-bit?
>
> No, because AFAICT, bit 2 in the BAR would not be set.
>
> > If it's not, how can I check?
>
> Can you try 'lspci -vv' and look at the 'Region X:' lines which will
> say 32 or 64-bit. I *think* that should reflect what actually got
> written into the BARs.

As seen in the lspci from above comment:
Region 1: Memory at 200080020000 (32-bit, non-prefetchable) [size=4K]
Region 4: Memory at 210000010000 (64-bit, prefetchable) [size=16K]

So it seems to be getting configured properly.

I think the point here is bus resources not getting the MEM_64 flag,
but device resources getting it correctly. Is that supposed to happen?

>
> Rob

Thanks Rob!

Leonardo Bras