Re: [PATCH 4/4] PCI/sysfs: Allow userspace to query and set device reset mechanism

From: Alex Williamson
Date: Fri Mar 19 2021 - 12:24:02 EST


On Fri, 19 Mar 2021 14:59:47 +0200
Leon Romanovsky <leon@xxxxxxxxxx> wrote:

> On Thu, Mar 18, 2021 at 07:34:56PM +0100, Enrico Weigelt, metux IT consult wrote:
> > On 18.03.21 18:22, Leon Romanovsky wrote:
> >
> > > Which email client do you use?
> > > Your responses are grouped as one huge block without any chance to respond
> > > to you on specific point or answer to your question.
> >
> > I'm reading this thread in Tbird, and threading / quoting all looks
> > nice.
>
> I'm not talking about threading or quoting but about response itself.
> See it here https://lore.kernel.org/lkml/20210318103935.2ec32302@xxxxxxxxxxxxxxxxxxxxx/
> Alex's response is one big chunk without any separations to paragraphs.

I've never known paragraph breaks to be required to interject a reply.

Back on topic...

> >
> > > I see your flow and understand your position, but will repeat my
> > > position. We need to make sure that vendors will have incentive to
> > > supply quirks.

What if we taint the kernel or pci_warn() for cases where either all
the reset methods are disabled, i.e. 'echo none > reset_method', or any
time a device specific method is disabled?
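
To make that condition concrete, here is a rough user-space sketch of
the check, not code from the series. The enum names and the function
reset_policy_should_warn() are hypothetical; the real patch may
represent methods quite differently.

```c
#include <stdbool.h>

/* Hypothetical method bits for illustration only. */
enum reset_method {
	RESET_DEV_SPECIFIC = 1u << 0,
	RESET_FLR          = 1u << 1,
	RESET_PM           = 1u << 2,
	RESET_BUS          = 1u << 3,
};

/*
 * supported: methods the device probed as available
 * enabled:   methods userspace left enabled through reset_method
 *
 * Returns true when the kernel would warn (or taint) under the
 * policy suggested above.
 */
bool reset_policy_should_warn(unsigned int supported, unsigned int enabled)
{
	/* Ignore any enabled bits the device never supported. */
	enabled &= supported;

	/* Equivalent of 'echo none > reset_method'. */
	if (supported && !enabled)
		return true;

	/* A device specific method exists but was disabled. */
	if ((supported & RESET_DEV_SPECIFIC) && !(enabled & RESET_DEV_SPECIFIC))
		return true;

	return false;
}
```

With this, reordering or dropping generic methods stays silent as long
as at least one method remains and the device specific one is kept.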

I'd almost go so far as to prevent disabling a device specific reset
altogether, but for example should a device specific reset that fixes
an aspect of FLR behavior prevent using a bus reset? I'd prefer in that
case if direct FLR were disabled via a device flag introduced with the
quirk and the remaining resets can still be selected by preference.
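
Roughly what I have in mind, as a user-space sketch only. The flag
QUIRK_NO_DIRECT_FLR, struct dev_sketch and pick_reset() are all made
up for illustration and do not exist in the kernel.

```c
#include <stddef.h>

/* Hypothetical method identifiers, PICK_NONE meaning "no reset". */
enum pick_method {
	PICK_NONE = 0,
	PICK_DEV_SPECIFIC,
	PICK_FLR,
	PICK_PM,
	PICK_BUS,
};

/* Hypothetical flag a quirk would set when direct FLR is broken. */
#define QUIRK_NO_DIRECT_FLR 0x1u

struct dev_sketch {
	unsigned int probed;      /* bit (1u << method) set if it probed OK */
	unsigned int quirk_flags; /* flags set by device quirks */
};

/*
 * Walk the userspace preference list in order and return the first
 * method that probed successfully and is not masked by a quirk flag.
 */
enum pick_method pick_reset(const struct dev_sketch *dev,
			    const enum pick_method *pref, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		enum pick_method m = pref[i];

		if (m == PICK_NONE)
			continue;
		if (!(dev->probed & (1u << m)))
			continue;
		/* The quirk removes only direct FLR; the rest stay selectable. */
		if (m == PICK_FLR && (dev->quirk_flags & QUIRK_NO_DIRECT_FLR))
			continue;
		return m;
	}
	return PICK_NONE;
}
```

The point is that the quirk masks one method rather than pinning the
device to a single reset, so a bus reset can still be preferred.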

Theoretically all the other reset methods work and are available, it's
only a policy decision which to use, right?

If a device probes for a reset that's broken and distros start
including systemd scripts to apply a preference to avoid it, (a) that
enables them to work with existing kernels, and (b) indicates to us to
add the trivial quirk to flag that reset as broken.

The counterpoint to the argument that this discourages quirks is that
this interface actually makes it significantly easier to report specific
reset methods as broken for a given device.

Thanks,
Alex