Re: [PATCH 2/2] PCI: mediatek: Add controller support for MT7629

From: Jianjun Wang
Date: Mon Dec 24 2018 - 06:40:39 EST


On Thu, 2018-12-20 at 12:20 -0600, Bjorn Helgaas wrote:
> On Tue, Dec 18, 2018 at 05:19:24PM +0800, Jianjun Wang wrote:
> > On Mon, 2018-12-17 at 15:46 +0000, Lorenzo Pieralisi wrote:
> > > On Mon, Dec 17, 2018 at 08:32:47AM -0600, Bjorn Helgaas wrote:
> > > > On Mon, Dec 17, 2018 at 04:19:39PM +0800, Jianjun Wang wrote:
> > > > > On Thu, 2018-12-13 at 08:55 -0600, Bjorn Helgaas wrote:
> > > > > > On Thu, Dec 06, 2018 at 09:09:13AM +0800, Jianjun Wang wrote:
> > > > > > > The read value of BAR0 is 0xffff_ffff, it's size will be
> > > > > > > calculated as 4GB in arm64 but bogus alignment values at
> > > > > > > arm32, the pcie device and devices behind this bridge will
> > > > > > > not be enabled. Fix it's BAR0 resource size to guarantee
> > > > > > > the pcie devices will be enabled correctly.
> > > > > >
> > > > > > So this is a hardware erratum? Per spec, a memory BAR has
> > > > > > bit 0 hardwired to 0, and an IO BAR has bit 1 hardwired to
> > > > > > 0.
> > > > >
> > > > > Yes, it only works properly on 64bit platform.
> > > >
> > > > I don't understand. BARs are supposed to work the same
> > > > regardless of whether it's a 32- or 64-bit platform. If this is
> > > > a workaround for a hardware defect, please just say that
> > > > explicitly.
> > >
> > > I do not understand this either. First thing to do is to describe
> > > the problem properly so that we can actually find a solution to
> > > it.
> >
> > This BAR0 is a 64-bit memory BAR, the HW default values for this BAR
> > is 0xffff_ffff_0000_0000 and it could not be changed except by
> > config write operation.
>
> If you literally get 0xffff_ffff_0000_0000 when reading the BAR, that
> is out of spec because the low-order 4 bits of a 64-bit memory BAR
> cannot all be zero.
>
> A 64-bit BAR consumes two DWORDS in config space. For a 64-bit BAR0,
> the DWORD at 0x10 contains the low-order bits, and the DWORD at 0x14
> contains the upper 32 bits. Bits 0-3 of the low-order DWORD (the
> one at 0x10) are read-only, and in this case should contain the value
> 0b1100 (0xc). That means the range is prefetchable (bit 3 == 1) and
> the BAR is 64 bits (bits 2:1 == 10).

Sorry, I confused the hardware default value with the value read back
when sizing the BAR. The hardware default value is
0xffff_ffff_0000_000c, i.e. a 64-bit prefetchable memory BAR.
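
To make the flag bits concrete, here is a minimal userspace sketch of
decoding the low-order DWORD of a BAR. It is not the kernel code; the
struct and function names are made up for illustration, and the bit
layout is the one Bjorn described above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the read-only flag bits of a BAR. */
struct bar_flags {
	bool is_io;		/* bit 0: 1 = I/O BAR, 0 = memory BAR */
	bool is_64bit;		/* bits 2:1 == 0b10 on a memory BAR */
	bool prefetchable;	/* bit 3 on a memory BAR */
};

/* Decode the flag bits from the DWORD at 0x10 (illustrative helper). */
static struct bar_flags decode_bar_flags(uint32_t lo)
{
	struct bar_flags f;

	f.is_io = lo & 0x1;
	f.is_64bit = !f.is_io && ((lo >> 1) & 0x3) == 0x2;
	f.prefetchable = !f.is_io && (lo & 0x8);
	return f;
}
```

With lo = 0x0000000c this reports a 64-bit prefetchable memory BAR.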

When we start decoding the BAR, the value read from BAR0 at 0x10 is
0x0c and the value at 0x14 is 0xffff_ffff, so the sizing value is
0xffff_ffff_0000_0000 once the flag bits are masked off. That is decoded
to 0xffff_ffff, which becomes the end value of the BAR0 resource in the
pci_dev.
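
The decode step can be sketched roughly as below. This is a simplified
userspace model written for this mail, not the kernel's implementation;
bar_extent() is a hypothetical name, and it only covers a memory BAR
whose flag bits are the low four bits.

```c
#include <stdint.h>

/*
 * Illustrative helper: turn the raw 64-bit sizing readback into an
 * extent (size - 1), i.e. the offset of the last byte the BAR decodes.
 */
static uint64_t bar_extent(uint64_t readback)
{
	uint64_t bits = readback & ~0xfULL;	/* drop flag bits 3:0 */

	if (!bits)
		return 0;			/* no address bits set */

	/* The lowest set address bit gives the decode size. */
	return (bits & ~(bits - 1)) - 1;
}
```

Feeding it 0xffff_ffff_0000_000c (0x14 in the upper half, 0x10 in the
lower half) gives 0xffff_ffff, because the lowest set address bit is
bit 32.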
>
> > The calculated BAR size will be 0 in 32-bit platform since the
> > phys_addr_t is a 32bit value in 32-bit platform.
>
> Either (1) this is a hardware defect that feeds incorrect data to the
> BAR size calculation, or (2) there's a problem in the BAR size
> calculation code. We need to figure out which one and work around or
> fix it correctly.

The BAR size calculation in the code, (res->end - res->start + 1), is
fine. I think it's a hardware defect, because we can neither change the
hardware default value nor disable the BAR, even though we don't use it.
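
For reference, the difference between the two platforms can be
reproduced in userspace. This is only a sketch, assuming the resource
starts at 0 and ends at 0xffff_ffff as described above; the two helpers
are hypothetical stand-ins for the same expression evaluated with a
64-bit and a 32-bit resource type.

```c
#include <stdint.h>

/* (end - start + 1) with a 64-bit resource type. */
static uint64_t res_size64(uint64_t start, uint64_t end)
{
	return end - start + 1;
}

/* The same expression with a 32-bit type: wraps modulo 2^32. */
static uint32_t res_size32(uint32_t start, uint32_t end)
{
	return (uint32_t)(end - start + 1);
}
```

res_size64(0, 0xffffffff) is 0x1_0000_0000 (4GB), while
res_size32(0, 0xffffffff) wraps to 0.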

>
> > Actually MediaTek's HW does not using this BAR0, just omit it when
> > assign resource is totally fine.
>
> It's totally fine to work around hardware defects, but we have to
> clearly understand the problem so we do it correctly. For example, we
> probably can't just clear out the BAR0 resource in the pci_dev,
> because the BAR in the hardware device still contains a value, and if
> we enable memory decoding for the device, it will still respond to the
> region described by the BAR.

The BAR0 resource value in the pci_dev depends on the hardware value.
It can be cleared after all devices have been scanned, but we should
set its size larger than the MMIO space so that the software treats it
as an invalid resource and won't assign a resource for it.

Thanks.
>
> > When assign the resource for each device, software will check the
> > resource alignment first, and the resource of length zero will be
> > regarded as a bogus alignment resource, it will be ignored and won't
> > claim a resource parent for it.
> >
> > When drivers try to enable the PCIe devices, the software will enable
> > it's resources, but it will return an error number when found a
> > unclaimed resource, in that case, the flow of enable devices will be
> > interrupted and PCIe devices won't work properly.
> >
> > Thanks.
> >