Re: [PATCH 2/3] PCI: iproc: Stop using generic config read/write functions

From: Bjorn Helgaas
Date: Thu Jul 30 2020 - 12:45:05 EST


On Thu, Jul 30, 2020 at 09:36:14AM -0700, Ray Jui wrote:
> On 7/30/2020 9:09 AM, Bjorn Helgaas wrote:
> > On Thu, Jul 30, 2020 at 03:37:46PM +1200, Mark Tomlinson wrote:
> >> The pci_generic_config_write32() function will give warning messages
> >> whenever writing less than 4 bytes at a time. As there is nothing we can
> >> do about this without changing the hardware, the message is just a
> >> nuisance. So instead of using the generic functions, use the functions
> >> that have already been written for reading/writing the config registers.
> >
> > The reason that pci_generic_config_write32() message is there is
> > because, as the message says, a read/modify/write may corrupt bits in
> > adjacent registers.
> >
> > It makes me a little queasy to do these read/modify/write sequences
> > silently. A generic driver doing an 8- or 16-bit config write has no
> > idea that the write may corrupt an adjacent register. That leads to
> > bugs that are very difficult to debug and only reproducible on iProc.
> >
> > The ratelimiting on the current pci_generic_config_write32() message
> > is based on the call site, not on the device. That's not ideal: we
> > may emit several messages for device A, trigger ratelimiting, then do
> > a write for device B that doesn't generate a message.
> >
> > I think it would be better to have a warning once per device, so if
> > XYZ device has a problem and we look at the dmesg log, we would find a
> > single message for device XYZ as a hint. Would that reduce the
> > nuisance level enough?
>
> I'm in favor of this. I agree with you that we do need the warnings
> because some PCIe config registers that are read/write to clear.
>
> But the current amount of warning messages generated from these config
> register access is quite massive and often concerns the users who are
> less familiar with the reason/purpose of the warnings. We were asked
> about these warnings by multiple customers. People freaked out when they
> see "corrupt" in the warning messages, :)

Yeah, I'm sure they would. Hopefully the message makes it all the way
back to the hardware designers ;)

> Limiting the warning to once per device seems to be a reasonable
> compromise to me.

We (you, I mean :)) could also look at the particular warnings. If
they're triggered by PCI core writes that are 8- or 16-bits when they
*could* be 32-bits, it might make sense to widen them. I know there
are places that do 8-bit writes to 16-bit registers; maybe there are
similar ones to 32-bit registers.

Bjorn