Re: [PATCH 4/4] vfio-pci/zdev: Introduce the zPCI I/O vfio region

From: Cornelia Huck
Date: Tue Jan 26 2021 - 02:25:06 EST


On Mon, 25 Jan 2021 09:40:38 -0500
Matthew Rosato <mjrosato@xxxxxxxxxxxxx> wrote:

> On 1/22/21 6:48 PM, Alex Williamson wrote:
> > On Tue, 19 Jan 2021 15:02:30 -0500
> > Matthew Rosato <mjrosato@xxxxxxxxxxxxx> wrote:
> >
> >> Some s390 PCI devices (e.g. ISM) perform I/O operations that have very
> >> specific requirements in terms of alignment as well as the patterns in
> >> which the data is read/written. Allowing these to proceed through the
> >> typical vfio_pci_bar_rw path will cause them to be broken up in such a
> >> way that these requirements can't be guaranteed. In addition, ISM devices
> >> do not support the MIO codepaths that might be triggered on vfio I/O coming
> >> from userspace; we must be able to ensure that these devices use the
> >> non-MIO instructions. To facilitate this, provide a new vfio region by
> >> which non-MIO instructions can be passed directly to the host kernel s390
> >> PCI layer, to be reliably issued as non-MIO instructions.
> >>
> >> This patch introduces the new vfio VFIO_REGION_SUBTYPE_IBM_ZPCI_IO region
> >> and implements the ability to pass PCISTB and PCILG instructions over it,
> >> as these are what is required for ISM devices.
> >
> > There have been various discussions about splitting vfio-pci to allow
> > more device specific drivers rather than adding duct tape and baling wire
> > for various device specific features to extend vfio-pci. The latest
> > iteration is here[1]. Is it possible that such a solution could simply
> > provide the standard BAR region indexes, but with an implementation that
> > works on s390, rather than creating new device specific regions to
> > perform the same task? Thanks,
> >
> > Alex
> >
> > [1] https://lore.kernel.org/lkml/20210117181534.65724-1-mgurtovoy@xxxxxxxxxx/
> >
>
> Thanks for the pointer, I'll have to keep an eye on this. An approach
> like this could solve some issues, but I think a main issue that still
> remains with relying on the standard BAR region indexes (whether using
> the current vfio-pci driver or a device-specific driver) is that QEMU
> writes to said BAR memory region are happening in, at most, 8B chunks
> (which then, in the current general-purpose vfio-pci code, get further
> split up into 4B iowrite operations). The alternate approach I'm
> proposing here is allowing for the whole payload (4K) in a single
> operation, which is significantly faster. So I suspect that even with a
> device specific driver we'd want this sort of a region anyhow.
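
For reference, the operation counts implied by the numbers quoted above
can be worked out with a small sketch. This is illustrative only:
ops_needed() is a hypothetical helper, not part of the patch or of
vfio-pci, and it only does the arithmetic (it issues no I/O).

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: number of accesses needed to move `payload`
 * bytes when each access carries at most `chunk` bytes (rounds up).
 */
size_t ops_needed(size_t payload, size_t chunk)
{
	assert(chunk != 0);	/* a zero-sized access makes no sense */
	return (payload + chunk - 1) / chunk;
}
```

Under these assumptions, a 4K payload takes ops_needed(4096, 8) = 512
region writes at 8B each, ops_needed(4096, 4) = 1024 operations once
those are split into 4B iowrites, and ops_needed(4096, 4096) = 1
operation if the whole payload goes through in one piece.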

I'm also wondering about device specific vs architecture/platform
specific handling.

If we're trying to support ISM devices, that's device specific
handling; but if we're trying to add more generic things like the large
payload support, that's not necessarily tied to a device, is it? For
example, could a device support large payloads if plugged into a z
machine, but not if plugged into another machine?