Re: [PATCH v7 09/10] hisi_acc_vfio_pci: Add support for VFIO live migration

From: Jason Gunthorpe
Date: Thu Mar 03 2022 - 08:04:27 EST


On Thu, Mar 03, 2022 at 12:57:29PM +0000, Shameerali Kolothum Thodi wrote:
>
>
> > From: Jason Gunthorpe [mailto:jgg@xxxxxxxxxx]
> > Sent: 03 March 2022 00:22
> > To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@xxxxxxxxxx>
> > Cc: kvm@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> > linux-crypto@xxxxxxxxxxxxxxx; linux-pci@xxxxxxxxxxxxxxx;
> > alex.williamson@xxxxxxxxxx; cohuck@xxxxxxxxxx; mgurtovoy@xxxxxxxxxx;
> > yishaih@xxxxxxxxxx; Linuxarm <linuxarm@xxxxxxxxxx>; liulongfang
> > <liulongfang@xxxxxxxxxx>; Zengtao (B) <prime.zeng@xxxxxxxxxxxxx>;
> > Jonathan Cameron <jonathan.cameron@xxxxxxxxxx>; Wangzhou (B)
> > <wangzhou1@xxxxxxxxxxxxx>
> > Subject: Re: [PATCH v7 09/10] hisi_acc_vfio_pci: Add support for VFIO live
> > migration
> >
> > On Wed, Mar 02, 2022 at 05:29:02PM +0000, Shameer Kolothum wrote:
> > > +static long hisi_acc_vf_save_unl_ioctl(struct file *filp,
> > > + unsigned int cmd, unsigned long arg)
> > > +{
> > > + struct hisi_acc_vf_migration_file *migf = filp->private_data;
> > > + struct hisi_acc_vf_core_device *hisi_acc_vdev = container_of(migf,
> > > + struct hisi_acc_vf_core_device, saving_migf);
> > > + loff_t *pos = &filp->f_pos;
> > > + struct vfio_precopy_info info;
> > > + unsigned long minsz;
> > > + int ret;
> > > +
> > > + if (cmd != VFIO_MIG_GET_PRECOPY_INFO)
> > > + return -ENOTTY;
> > > +
> > > + minsz = offsetofend(struct vfio_precopy_info, dirty_bytes);
> > > +
> > > + if (copy_from_user(&info, (void __user *)arg, minsz))
> > > + return -EFAULT;
> > > + if (info.argsz < minsz)
> > > + return -EINVAL;
> > > +
> > > + mutex_lock(&hisi_acc_vdev->state_mutex);
> > > + if (hisi_acc_vdev->mig_state != VFIO_DEVICE_STATE_PRE_COPY) {
> > > + mutex_unlock(&hisi_acc_vdev->state_mutex);
> > > + return -EINVAL;
> > > + }
> >
> > IMHO it is easier just to check the total_length and not grab this
> > other lock
>
> The problem with checking the total_length here is that it is possible that
> in STOP_COPY the dev is not ready and there are no more data to be transferred
> and the total_length remains at QM_MATCH_SIZE.

There is a scenario that transfers only QM_MATCH_SIZE in stop_copy?
This doesn't seem like a good idea, I think you should transfer a
positive indication 'this device is not ready' instead of truncating
the stream. A truncated stream should not be a valid stream.

ie always transfer the whole struct.
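
A minimal userspace sketch of that idea, under stated assumptions: the
struct layout, the dev_ready field and the mig_save()/mig_load() helpers
are all hypothetical names for illustration, not the driver's real data
format or API. The point is only that the saver always emits the full
struct with an explicit ready flag, so the loader can reject any short
stream outright.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stream layout, NOT the driver's actual migration struct. */
struct mig_data {
	uint64_t match_info;   /* stands in for the QM_MATCH_SIZE prefix */
	uint32_t dev_ready;    /* explicit flag: 1 = dev_state is valid */
	uint32_t reserved;
	uint64_t dev_state[4]; /* placeholder for the real device state */
};

/*
 * Saver: always produce the whole struct. A device that is not ready
 * is reported through dev_ready = 0, never by shortening the stream.
 */
static size_t mig_save(struct mig_data *out, uint64_t match, int ready,
		       const uint64_t *state)
{
	assert(out != NULL);

	memset(out, 0, sizeof(*out));
	out->match_info = match;
	out->dev_ready = ready ? 1 : 0;
	if (ready)
		memcpy(out->dev_state, state, sizeof(out->dev_state));

	return sizeof(*out);
}

/*
 * Loader: returns -1 for a truncated (invalid) stream, 0 when the
 * source device was not ready, 1 when dev_state can be restored.
 */
static int mig_load(const void *buf, size_t len, struct mig_data *in)
{
	if (len != sizeof(*in))
		return -1;

	memcpy(in, buf, sizeof(*in));
	return in->dev_ready ? 1 : 0;
}
```

With this shape the resuming side never has to guess whether a short
stream means "not ready" or "something went wrong".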

> Looks like setting the total_length = 0 in STOP_COPY is a better solution(If there are
> no other issues with that) as it will avoid grabbing the state_mutex as you
> mentioned above.

That seems really weird, I wouldn't recommend doing that..

Jason