Re: [PATCH v4 0/7] vdpa: decouple reset of iotlb mapping from device reset

From: Lei Yang
Date: Thu Oct 26 2023 - 02:17:54 EST


On Thu, Oct 26, 2023 at 7:32 AM Si-Wei Liu <si-wei.liu@xxxxxxxxxx> wrote:
>
> Hi Yang Lei,
>
> Thanks for testing my patches and reporting! As for the issue, could you
> please try what I posted in:
>
> https://lore.kernel.org/virtualization/1698275594-19204-1-git-send-email-si-wei.liu@xxxxxxxxxx/
>
HI Si-Wei
> and let me know how it goes? Thank you very much!

This problem has gone after applying this patch [1].
[1] https://lore.kernel.org/virtualization/1698275594-19204-1-git-send-email-si-wei.liu@xxxxxxxxxx/

Thanks
Lei
>
> Thanks,
> -Siwei
>
> On 10/25/2023 2:41 AM, Lei Yang wrote:
> > On Wed, Oct 25, 2023 at 1:27 AM Si-Wei Liu <si-wei.liu@xxxxxxxxxx> wrote:
> > Hello Si-Wei
> >> Thanks a lot for testing! Please be aware that there's a follow-up fix
> >> for a potential oops in this v4 series:
> >>
> > The first, when I did not apply this patch [1], I will also hit this
> > patch mentioned problem. After I applied this patch, this problem will
> > no longer to hit again. But I hit another issues, about the error
> > messages please review the attached file.
> > [1] https://lore.kernel.org/virtualization/1698102863-21122-1-git-send-email-si-wei.liu@xxxxxxxxxx/
> >
> > My test steps:
> > git clone https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
> > cd linux/
> > b4 am 1697880319-4937-1-git-send-email-si-wei.liu@xxxxxxxxxx
> > b4 am 20231018171456.1624030-2-dtatulea@xxxxxxxxxx
> > b4 am 1698102863-21122-1-git-send-email-si-wei.liu@xxxxxxxxxx
> > git am ./v4_20231018_dtatulea_vdpa_add_support_for_vq_descriptor_mappings.mbx
> > git am ./v4_20231021_si_wei_liu_vdpa_decouple_reset_of_iotlb_mapping_from_device_reset.mbx
> > git am ./20231023_si_wei_liu_vhost_vdpa_fix_null_pointer_deref_in__compat_vdpa_reset.mbx
> > cp /boot/config-5.14.0-377.el9.x86_64 .config
> > make -j 32
> > make modules_install
> > make install
> >
> > Thanks
> >
> > Lei
> >> https://lore.kernel.org/virtualization/1698102863-21122-1-git-send-email-si-wei.liu@xxxxxxxxxx/
> >>
> >> Would be nice to have it applied for any tests.
> >>
> >> Thanks,
> >> -Siwei
> >>
> >> On 10/23/2023 11:51 PM, Lei Yang wrote:
> >>> QE tested this series v4 with regression testing on real nic, there is
> >>> no new regression bug.
> >>>
> >>> Tested-by: Lei Yang <leiyang@xxxxxxxxxx>
> >>>
> >>> On Tue, Oct 24, 2023 at 6:02 AM Si-Wei Liu <si-wei.liu@xxxxxxxxxx> wrote:
> >>>>
> >>>> On 10/22/2023 8:51 PM, Jason Wang wrote:
> >>>>> Hi Si-Wei:
> >>>>>
> >>>>> On Sat, Oct 21, 2023 at 5:28 PM Si-Wei Liu <si-wei.liu@xxxxxxxxxx> wrote:
> >>>>>> In order to reduce needlessly high setup and teardown cost
> >>>>>> of iotlb mapping during live migration, it's crucial to
> >>>>>> decouple the vhost-vdpa iotlb abstraction from the virtio
> >>>>>> device life cycle, i.e. iotlb mappings should be left
> >>>>>> intact across virtio device reset [1]. For it to work, the
> >>>>>> on-chip IOMMU parent device could implement a separate
> >>>>>> .reset_map() operation callback to restore 1:1 DMA mapping
> >>>>>> without having to resort to the .reset() callback, the
> >>>>>> latter of which is mainly used to reset virtio device state.
> >>>>>> This new .reset_map() callback will be invoked only before
> >>>>>> the vhost-vdpa driver is to be removed and detached from
> >>>>>> the vdpa bus, such that other vdpa bus drivers, e.g.
> >>>>>> virtio-vdpa, can start with 1:1 DMA mapping when they
> >>>>>> are attached. For the context, those on-chip IOMMU parent
> >>>>>> devices, create the 1:1 DMA mapping at vdpa device creation,
> >>>>>> and they would implicitly destroy the 1:1 mapping when
> >>>>>> the first .set_map or .dma_map callback is invoked.
> >>>>>>
> >>>>>> This patchset is rebased on top of the latest vhost tree.
> >>>>>>
> >>>>>> [1] Reducing vdpa migration downtime because of memory pin / maps
> >>>>>> https://www.mail-archive.com/qemu-devel@xxxxxxxxxx/msg953755.html
> >>>>>>
> >>>>>> ---
> >>>>>> v4:
> >>>>>> - Rework compatibility using new .compat_reset driver op
> >>>>> I still think having a set_backend_feature()
> >>>> This will overload backend features with the role of carrying over
> >>>> compatibility quirks, which I tried to avoid from. While I think the
> >>>> .compat_reset from the v4 code just works with the backend features
> >>>> acknowledgement (and maybe others as well) to determine, but not
> >>>> directly tie it to backend features itself. These two have different
> >>>> implications in terms of requirement, scope and maintaining/deprecation,
> >>>> better to cope with compat quirks in explicit and driver visible way.
> >>>>
> >>>>> or reset_map(clean=true) might be better.
> >>>> An explicit op might be marginally better in driver writer's point of
> >>>> view. Compliant driver doesn't have to bother asserting clean_map never
> >>>> be true so their code would never bother dealing with this case, as
> >>>> explained in the commit log for patch 5 "vhost-vdpa: clean iotlb map
> >>>> during reset for older userspace":
> >>>>
> >>>> "
> >>>> The separation of .compat_reset from the regular .reset allows
> >>>> vhost-vdpa able to know which driver had broken behavior before, so it
> >>>> can apply the corresponding compatibility quirk to the individual
> >>>> driver
> >>>> whenever needed. Compared to overloading the existing .reset with
> >>>> flags, .compat_reset won't cause any extra burden to the implementation
> >>>> of every compliant driver.
> >>>> "
> >>>>
> >>>>> As it tries hard to not introduce new stuff on the bus.
> >>>> Honestly I don't see substantial difference between these other than the
> >>>> color. There's no single best solution that stands out among the 3. And
> >>>> I assume you already noticed it from all the above 3 approaches will
> >>>> have to go with backend features negotiation, that the 1st vdpa reset
> >>>> before backend feature negotiation will use the compliant version of
> >>>> .reset that doesn't clean up the map. While I don't think this nuance
> >>>> matters much to existing older userspace apps, as the maps should
> >>>> already get cleaned by previous process in vhost_vdpa_cleanup(), but if
> >>>> bug-for-bug behavioral compatibility is what you want, module parameter
> >>>> will be the single best answer.
> >>>>
> >>>> Regards,
> >>>> -Siwei
> >>>>
> >>>>> But we can listen to others for sure.
> >>>>>
> >>>>> Thanks
> >>>>>
>