RE: [Linuxarm] Re: [RFC PATCH 2/3] vfio/hisilicon: register the driver to vfio

From: Shameerali Kolothum Thodi
Date: Thu May 13 2021 - 14:31:46 EST




> -----Original Message-----
> From: Alex Williamson [mailto:alex.williamson@xxxxxxxxxx]
> Sent: 13 May 2021 19:23
> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@xxxxxxxxxx>
> Cc: Jason Gunthorpe <jgg@xxxxxxxxxx>; liulongfang
> <liulongfang@xxxxxxxxxx>; cohuck@xxxxxxxxxx;
> linux-kernel@xxxxxxxxxxxxxxx; linuxarm@xxxxxxxxxxxxx
> Subject: Re: [Linuxarm] Re: [RFC PATCH 2/3] vfio/hisilicon: register the driver to
> vfio
>
> On Thu, 13 May 2021 17:52:56 +0000
> Shameerali Kolothum Thodi <shameerali.kolothum.thodi@xxxxxxxxxx>
> wrote:
>
> > Hi Alex,
> >
> > > -----Original Message-----
> > > From: Alex Williamson [mailto:alex.williamson@xxxxxxxxxx]
> > > Sent: 13 May 2021 18:04
> > > To: Shameerali Kolothum Thodi
> <shameerali.kolothum.thodi@xxxxxxxxxx>
> > > Cc: Jason Gunthorpe <jgg@xxxxxxxxxx>; liulongfang
> > > <liulongfang@xxxxxxxxxx>; cohuck@xxxxxxxxxx;
> > > linux-kernel@xxxxxxxxxxxxxxx; linuxarm@xxxxxxxxxxxxx
> > > Subject: [Linuxarm] Re: [RFC PATCH 2/3] vfio/hisilicon: register the driver to
> > > vfio
> > >
> > > On Thu, 13 May 2021 15:49:25 +0000
> > > Shameerali Kolothum Thodi <shameerali.kolothum.thodi@xxxxxxxxxx>
> > > wrote:
> > >
> > > > > -----Original Message-----
> > > > > From: Jason Gunthorpe [mailto:jgg@xxxxxxxxxx]
> > > > > Sent: 13 May 2021 14:44
> > > > > To: liulongfang <liulongfang@xxxxxxxxxx>
> > > > > Cc: Alex Williamson <alex.williamson@xxxxxxxxxx>;
> cohuck@xxxxxxxxxx;
> > > > > linux-kernel@xxxxxxxxxxxxxxx; linuxarm@xxxxxxxxxxxxx
> > > > > Subject: [Linuxarm] Re: [RFC PATCH 2/3] vfio/hisilicon: register the
> driver to
> > > > > vfio
> > > > >
> > > > > On Thu, May 13, 2021 at 10:08:28AM +0800, liulongfang wrote:
> > > > > > On 2021/5/12 20:10, Jason Gunthorpe wrote:
> > > > > > > On Wed, May 12, 2021 at 04:39:43PM +0800, liulongfang wrote:
> > > > > > >
> > > > > > >> Therefore, this method of limiting the length of the BAR
> > > > > > >> configuration space can prevent unsafe operations of the memory.
> > > > > > >
> > > > > > > The issue is DMA controlled by the guest accessing the secure BAR
> > > > > > > area, not the guest CPU.
> > > > > > >
> > > > > > > Jason
> > > > > > > .
> > > > > > >
> > > > > > This secure BAR area is not presented to the Guest,
> > > > > > which makes it impossible for the Guest to obtain the secure BAR area
> > > > > > when establishing the DMA mapping of the configuration space.
> > > > > > If the DMA controller accesses the secure BAR area, the access will
> > > > > > be blocked by the SMMU.
> > > > >
> > > > > There are scenarios where this is not true.
> > > > >
> > > > > At a minimum the mdev driver should refuse to work in those cases.
> > > > >
> > > >
> > > > Hi,
> > > >
> > > > I think the idea here is not a generic solution, but a quirk for this specific
> dev.
> > > >
> > > > Something like,
> > > >
> > > > --- a/drivers/vfio/pci/vfio_pci.c
> > > > +++ b/drivers/vfio/pci/vfio_pci.c
> > > > @@ -866,7 +866,12 @@ static long vfio_pci_ioctl(struct vfio_device
> > > *core_vdev,
> > > > break;
> > > > case VFIO_PCI_BAR0_REGION_INDEX ...
> > > VFIO_PCI_BAR5_REGION_INDEX:
> > > > info.offset =
> > > VFIO_PCI_INDEX_TO_OFFSET(info.index);
> > > > - info.size = pci_resource_len(pdev, info.index);
> > > > +
> > > > + if (check_hisi_acc_quirk(pdev, info))
> > > > + info.size = new_size;// BAR is
> limited
> > > without migration region.
> > > > + else
> > > > + info.size = pci_resource_len(pdev,
> > > info.index);
> > > > +
> > > > if (!info.size) {
> > > > info.flags = 0;
> > > > break;
> > > >
> > > > Is this an acceptable/workable solution here?
> > >
> > > As Jason says, this only restricts CPU access to the BAR, the issue is
> > > DMA access. As the hardware vendor you may be able to guarantee that
> > > a DMA transaction generated by the device targeting the remainder of
> > > the BAR will always go upstream, but can you guarantee the routing
> > > between the device and the SMMU? For instance if this device can be
> > > implemented as a plugin card, then it can be installed into a
> > > downstream port that may not support ACS. That downstream port may
> > > implement request redirection allowing the transaction to reflect back
> > > to the device without IOMMU translation. At that point the userspace
> > > driver can target the kernel driver half of the BAR and potentially
> > > expose a security risk. Thanks,
> >
> > The ACC devices on this platform are not pluggable devices. They are
> exposed
> > as integrated endpoint devices. So I am not sure the above concern is valid in
> this
> > case.
> >
> > I had a look at the userspace driver approach you suggested. But
> unfortunately
> > the migration state change for the vf has to check some of the pf registers for
> > confirming the state. So even if we move the implementation to Qemu, we
> > still may have to use the migration uAPI to access the pf device registers.
> >
> > Since the devices we are concerned here are all integrated endpoints and if
> the
> > above quirk is an acceptable one, then we can use the uAPI as done in this
> > series without overly complicating things here.
>
> If you expect this device to appear only as an integrated endpoint, then
> I think Jason's suggestion above is correct. Your driver that supports
> migration can refuse to load for devices there the topology is other
> than expected and you're effectively guaranteeing DMA isolation of the
> user and in-kernel drivers by hardware DMA semantics and topology.

Ok. Will take a look at the recommendation.

> Requiring access to the PF to support the migration protocol also
> suggests that an in-kernel driver to support migration is our best
> option.

Thanks,
Shameer