Re: [PATCH v7 07/10] vfio: Extend the device migration protocol with PRE_COPY

From: Alex Williamson
Date: Wed Mar 02 2022 - 22:52:46 EST


On Wed, 2 Mar 2022 20:05:28 -0400
Jason Gunthorpe <jgg@xxxxxxxxxx> wrote:

> On Wed, Mar 02, 2022 at 01:31:59PM -0700, Alex Williamson wrote:
> > > + * initial_bytes reflects the estimated remaining size of any initial mandatory
> > > + * precopy data transfer. When initial_bytes returns as zero then the initial
> > > + * phase of the precopy data is completed. Generally initial_bytes should start
> > > + * out as approximately the entire device state.
> >
> > What is "mandatory" intended to mean here? The user isn't required to
> > collect any data from the device in the PRE_COPY states.
>
> If the data is split into initial, dirty, trailer then mandatory means
> that first chunk.

But there's no requirement to read anything in PRE_COPY, so initial
becomes indistinguishable from trailer and dirty doesn't exist.

> > "The vfio_precopy_info data structure returned by this ioctl provides
> > estimates of data available from the device during the PRE_COPY states.
> > This estimate is split into two categories, initial_bytes and
> > dirty_bytes.
> >
> > The initial_bytes field indicates the amount of static data available
> > from the device. This field should have a non-zero initial value and
> > decrease as migration data is read from the device.
>
> static isn't great either, how about just saying 'minimum data available'?

'initial precopy data-set'?

> > Userspace may use the combination of these fields to estimate the
> > potential data size available during the PRE_COPY phases, as well as
> > trends relative to the rate the device is dirtying its internal
> > state, but these fields are not required to have any bearing relative
> > to the data size available during the STOP_COPY phase."
>
> That last is too strong. I would just drop everything starting at "but".
>
> The message to communicate is the device should allow dirty_bytes to
> reach 0 during the PRE_COPY phases if everything is idle. Which
> tells a lot about how to calculate it.
>
> It is all better otherwise
>
> > > + * Drivers should attempt to return estimates so that initial_bytes +
> > > + * dirty_bytes matches the amount of data an immediate transition to STOP_COPY
> > > + * will require to be streamed.
> >
> > I think previous discussions have proven this false, we expect trailing
> > data that is only available in STOP_COPY, we cannot bound the size of
> > that data, and dirty_bytes is not intended to expose data that cannot
> > be retrieved during the PRE_COPY phases. Thanks,
>
> It was written assuming the stop_copy trailer is small.

We have no basis to make that assertion. We've agreed that precopy can
be used for nothing more than a compatibility test, so a vGPU with a
massive framebuffer and no ability to provide dirty tracking could
implement precopy only to include the entire framebuffer in
the trailing STOP_COPY data set. Per my understanding and the fact
that we cannot enforce any heuristics regarding the size of the trailer
relative to the pre-copy data set, I think the above strongly phrased
sentence is necessary to understand the limitations of what this ioctl
is meant to convey. Thanks,

Alex
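
For illustration only, a minimal sketch of how userspace might consume
the two estimates under the reading argued above. The struct and helper
names here are hypothetical and are not the uAPI from the patch; only
the initial_bytes and dirty_bytes fields come from the thread. The
policy shown (keep reading while initial data remains or the dirty
estimate exceeds a caller-chosen threshold) is an assumption, and the
result deliberately says nothing about the STOP_COPY data size.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace copy of the two estimates discussed above. */
struct precopy_estimate {
	uint64_t initial_bytes;	/* remaining initial precopy data-set */
	uint64_t dirty_bytes;	/* data dirtied since it was last read */
};

/*
 * Estimate of data currently readable during PRE_COPY. This is not a
 * bound on what STOP_COPY will stream; the trailer size is unknown.
 * The addition saturates instead of wrapping.
 */
static uint64_t precopy_pending(const struct precopy_estimate *e)
{
	uint64_t sum = e->initial_bytes + e->dirty_bytes;

	return sum < e->initial_bytes ? UINT64_MAX : sum;
}

/*
 * Assumed policy: stay in PRE_COPY while the initial data-set has not
 * been fully read, or while the dirty estimate is above the threshold.
 */
static bool precopy_keep_reading(const struct precopy_estimate *e,
				 uint64_t dirty_threshold)
{
	if (e->initial_bytes != 0)
		return true;

	return e->dirty_bytes > dirty_threshold;
}
```

An idle device would be expected to let dirty_bytes reach 0, at which
point the helper above stops recommending further PRE_COPY reads; a
device with no dirty tracking could report 0 for both and still have a
large STOP_COPY trailer.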