Fwd: [PATCH 0/5] s390/pci: automatic error recovery

From: Linas Vepstas
Date: Mon Sep 06 2021 - 22:10:57 EST


Ooops, try again without the html. --linas

---------- Forwarded message ---------
From: Linas Vepstas <linasvepstas@xxxxxxxxx>
Date: Mon, Sep 6, 2021 at 9:05 PM
Subject: Re: [PATCH 0/5] s390/pci: automatic error recovery
To: Niklas Schnelle <schnelle@xxxxxxxxxxxxx>
Cc: Bjorn Helgaas <bhelgaas@xxxxxxxxxx>, Oliver O'Halloran
<oohall@xxxxxxxxx>, Russell Currey <ruscur@xxxxxxxxxx>,
<linuxppc-dev@xxxxxxxxxxxxxxxx>, linux-kernel@xxxxxxxxxxxxxxx
<linux-kernel@xxxxxxxxxxxxxxx>, <linux-s390@xxxxxxxxxxxxxxx>, Matthew
Rosato <mjrosato@xxxxxxxxxxxxx>, Pierre Morel <pmorel@xxxxxxxxxxxxx>




On Mon, Sep 6, 2021 at 4:49 AM Niklas Schnelle <schnelle@xxxxxxxxxxxxx> wrote:
>
> I believe we might be the first
> implementation of PCI device recovery in a virtualized setting requiring us to
> coordinate the device reset with the hypervisor platform by issuing a disable
> and re-enable to the platform as well as starting the recovery following
> a platform event.


I recall none of the details, but SR-IOV is a standardized system for
sharing a PCI device across multiple virtual machines. It has detailed
info on what the hypervisor must do, and what the local OS instance
must do to accomplish this. It's part of the PCI standard, and it's
more than a decade old now, maybe two. Being a part of the PCI
standard, it was interoperable with error recovery, to the best of my
recollection. At the time it was introduced, it got pushed very
aggressively. The x86 hypervisor vendors were aiming at the heart of
zSeries, and were militant about it.

-- Linas

--
Patrick: Are they laughing at us?
Sponge Bob: No, Patrick, they are laughing next to us.



