RE: [PATCH v3 0/6] vfio/hisilicon: add acc live migration driver

From: Tian, Kevin
Date: Wed Sep 29 2021 - 20:42:26 EST


> From: Shameerali Kolothum Thodi
> <shameerali.kolothum.thodi@xxxxxxxxxx>
>
> > From: Tian, Kevin [mailto:kevin.tian@xxxxxxxxx]
> > Sent: 29 September 2021 10:06
> >
> > > From: Shameerali Kolothum Thodi
> > > <shameerali.kolothum.thodi@xxxxxxxxxx>
> > >
> > > Hi Kevin,
> > >
> > > > From: Tian, Kevin [mailto:kevin.tian@xxxxxxxxx]
> > > > Sent: 29 September 2021 04:58
> > > >
> > > > Hi, Shameer,
> > > >
> > > > > From: Shameer Kolothum <shameerali.kolothum.thodi@xxxxxxxxxx>
> > > > > Sent: Wednesday, September 15, 2021 5:51 PM
> > > > >
> > > > > Hi,
> > > > >
> > > > > Thanks to the introduction of vfio_pci_core subsystem framework[0],
> > > > > now it is possible to provide vendor specific functionality to
> > > > > vfio pci devices. This series attempts to add vfio live migration
> > > > > support for HiSilicon ACC VF devices based on the new framework.
> > > > >
> > > > > HiSilicon ACC VF device MMIO space includes both the functional
> > > > > register space and migration control register space. As discussed
> > > > > in RFCv1[1], this may create security issues as these regions get
> > > > > shared between the Guest driver and the migration driver.
> > > > > Based on the feedback, we tried to address those concerns in
> > > > > this version.
> > > >
> > > > This series doesn't mention anything related to dirty page tracking.
> > > > Are you relying on Keqian's series for utilizing hardware iommu dirty
> > > > bit (e.g. SMMU HTTU)?
> > >
> > > Yes, this doesn't have dirty page tracking and the plan is to make use of
> > > Keqian's SMMU HTTU work to improve performance. We have done basic
> > > sanity testing with those patches.
> > >
> >
> > Do you plan to support migration w/o HTTU as the fallback option?
> > Generally one would expect the basic functionality ready before talking
> > about optimization.
>
> Yes, the plan is to get the basic live migration working and then we can
> optimize it with SMMU HTTU when it is available.

The interesting thing is that w/o HTTU vfio will just report every pinned
page as dirty, i.e. the entire guest memory is dirty. This completely kills
the benefit of the precopy phase, since QEMU still needs to transfer the
entire guest memory in the stop-copy phase. This is not a 'working' model
for live migration.
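
To make the point concrete, here is a rough sketch (not the actual vfio
code) of what "every pinned page is dirty" means for the stop-copy phase.
The helper names and the flat bitmap layout are assumptions made only for
illustration:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model: without IOMMU dirty tracking, the only safe answer
 * to a dirty-log query is "every pinned page may have been written".
 */
static void report_all_pinned_dirty(uint64_t *bitmap, size_t npages)
{
	size_t i;

	for (i = 0; i < npages; i++)
		bitmap[i / 64] |= 1ULL << (i % 64);
}

/* Number of pages that still have to be sent in stop-copy. */
static size_t count_dirty(const uint64_t *bitmap, size_t npages)
{
	size_t i, n = 0;

	for (i = 0; i < npages; i++)
		if (bitmap[i / 64] & (1ULL << (i % 64)))
			n++;
	return n;
}
```

With this model count_dirty() always equals npages after a query, no
matter how many precopy iterations ran before it.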

So it needs to be clear whether HTTU is really an optimization or
a hard functional requirement for migrating such a device. If the latter,
the migration region info is not a nice-to-have thing.

btw the fallback option that I raised earlier is more like some software
mitigation for collecting dirty pages, e.g. analyzing the ring descriptors
to build software-tracked dirty info by mediating the cmd portal
(which requires dynamically unmapping the cmd portal from the fast-path
to enable mediation). We are looking into this option for some platforms
which lack IOMMU dirty bit support.
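
A minimal sketch of the descriptor-analysis idea, assuming a simple
producer/consumer ring. The descriptor layout (struct hyp_desc), the
HYP_DESC_F_WRITE flag and the function names are all hypothetical; a real
device defines its own format, and this is not taken from any driver:

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT		12
#define HYP_DESC_F_WRITE	0x1u	/* hypothetical: device writes to dst */

/* Hypothetical descriptor layout, for illustration only. */
struct hyp_desc {
	uint64_t dst_iova;	/* DMA destination address */
	uint32_t len;		/* bytes the device may write */
	uint32_t flags;
};

/* Mark every page touched by [iova, iova + len) as dirty. */
static void mark_range_dirty(uint64_t *bitmap, size_t npages,
			     uint64_t iova, uint32_t len)
{
	uint64_t first, last, pfn;

	if (!len)
		return;
	first = iova >> PAGE_SHIFT;
	last = (iova + len - 1) >> PAGE_SHIFT;
	for (pfn = first; pfn <= last && pfn < npages; pfn++)
		bitmap[pfn / 64] |= 1ULL << (pfn % 64);
}

/*
 * Walk the descriptors submitted between head and tail (as seen when the
 * mediated cmd portal write is trapped) and record device-written buffers.
 */
static void track_ring(const struct hyp_desc *ring, uint32_t ring_size,
		       uint32_t head, uint32_t tail,
		       uint64_t *bitmap, size_t npages)
{
	uint32_t i;

	for (i = head; i != tail; i = (i + 1) % ring_size)
		if (ring[i].flags & HYP_DESC_F_WRITE)
			mark_range_dirty(bitmap, npages,
					 ring[i].dst_iova, ring[i].len);
}
```

The walk only runs while the cmd portal is trapped, which is why the
portal has to be unmapped from the fast-path for the duration of the
migration.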

Thanks
Kevin