Re: [PATCH v4 0/7] vdpa: decouple reset of iotlb mapping from device reset

From: Lei Yang
Date: Wed Oct 25 2023 - 05:42:58 EST


On Wed, Oct 25, 2023 at 1:27 AM Si-Wei Liu <si-wei.liu@xxxxxxxxxx> wrote:
>
Hello Si-Wei
> Thanks a lot for testing! Please be aware that there's a follow-up fix
> for a potential oops in this v4 series:
>
The first, when I did not apply this patch [1], I will also hit this
patch mentioned problem. After I applied this patch, this problem will
no longer to hit again. But I hit another issues, about the error
messages please review the attached file.
[1] https://lore.kernel.org/virtualization/1698102863-21122-1-git-send-email-si-wei.liu@xxxxxxxxxx/

My test steps:
git clone https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
cd linux/
b4 am 1697880319-4937-1-git-send-email-si-wei.liu@xxxxxxxxxx
b4 am 20231018171456.1624030-2-dtatulea@xxxxxxxxxx
b4 am 1698102863-21122-1-git-send-email-si-wei.liu@xxxxxxxxxx
git am ./v4_20231018_dtatulea_vdpa_add_support_for_vq_descriptor_mappings.mbx
git am ./v4_20231021_si_wei_liu_vdpa_decouple_reset_of_iotlb_mapping_from_device_reset.mbx
git am ./20231023_si_wei_liu_vhost_vdpa_fix_null_pointer_deref_in__compat_vdpa_reset.mbx
cp /boot/config-5.14.0-377.el9.x86_64 .config
make -j 32
make modules_install
make install

Thanks

Lei
> https://lore.kernel.org/virtualization/1698102863-21122-1-git-send-email-si-wei.liu@xxxxxxxxxx/
>
> Would be nice to have it applied for any tests.
>
> Thanks,
> -Siwei
>
> On 10/23/2023 11:51 PM, Lei Yang wrote:
> > QE tested this series v4 with regression testing on real nic, there is
> > no new regression bug.
> >
> > Tested-by: Lei Yang <leiyang@xxxxxxxxxx>
> >
> > On Tue, Oct 24, 2023 at 6:02 AM Si-Wei Liu <si-wei.liu@xxxxxxxxxx> wrote:
> >>
> >>
> >> On 10/22/2023 8:51 PM, Jason Wang wrote:
> >>> Hi Si-Wei:
> >>>
> >>> On Sat, Oct 21, 2023 at 5:28 PM Si-Wei Liu <si-wei.liu@xxxxxxxxxx> wrote:
> >>>> In order to reduce needlessly high setup and teardown cost
> >>>> of iotlb mapping during live migration, it's crucial to
> >>>> decouple the vhost-vdpa iotlb abstraction from the virtio
> >>>> device life cycle, i.e. iotlb mappings should be left
> >>>> intact across virtio device reset [1]. For it to work, the
> >>>> on-chip IOMMU parent device could implement a separate
> >>>> .reset_map() operation callback to restore 1:1 DMA mapping
> >>>> without having to resort to the .reset() callback, the
> >>>> latter of which is mainly used to reset virtio device state.
> >>>> This new .reset_map() callback will be invoked only before
> >>>> the vhost-vdpa driver is to be removed and detached from
> >>>> the vdpa bus, such that other vdpa bus drivers, e.g.
> >>>> virtio-vdpa, can start with 1:1 DMA mapping when they
> >>>> are attached. For the context, those on-chip IOMMU parent
> >>>> devices, create the 1:1 DMA mapping at vdpa device creation,
> >>>> and they would implicitly destroy the 1:1 mapping when
> >>>> the first .set_map or .dma_map callback is invoked.
> >>>>
> >>>> This patchset is rebased on top of the latest vhost tree.
> >>>>
> >>>> [1] Reducing vdpa migration downtime because of memory pin / maps
> >>>> https://www.mail-archive.com/qemu-devel@xxxxxxxxxx/msg953755.html
> >>>>
> >>>> ---
> >>>> v4:
> >>>> - Rework compatibility using new .compat_reset driver op
> >>> I still think having a set_backend_feature()
> >> This will overload backend features with the role of carrying over
> >> compatibility quirks, which I tried to avoid from. While I think the
> >> .compat_reset from the v4 code just works with the backend features
> >> acknowledgement (and maybe others as well) to determine, but not
> >> directly tie it to backend features itself. These two have different
> >> implications in terms of requirement, scope and maintaining/deprecation,
> >> better to cope with compat quirks in explicit and driver visible way.
> >>
> >>> or reset_map(clean=true) might be better.
> >> An explicit op might be marginally better in driver writer's point of
> >> view. Compliant driver doesn't have to bother asserting clean_map never
> >> be true so their code would never bother dealing with this case, as
> >> explained in the commit log for patch 5 "vhost-vdpa: clean iotlb map
> >> during reset for older userspace":
> >>
> >> "
> >> The separation of .compat_reset from the regular .reset allows
> >> vhost-vdpa able to know which driver had broken behavior before, so it
> >> can apply the corresponding compatibility quirk to the individual
> >> driver
> >> whenever needed. Compared to overloading the existing .reset with
> >> flags, .compat_reset won't cause any extra burden to the implementation
> >> of every compliant driver.
> >> "
> >>
> >>> As it tries hard to not introduce new stuff on the bus.
> >> Honestly I don't see substantial difference between these other than the
> >> color. There's no single best solution that stands out among the 3. And
> >> I assume you already noticed it from all the above 3 approaches will
> >> have to go with backend features negotiation, that the 1st vdpa reset
> >> before backend feature negotiation will use the compliant version of
> >> .reset that doesn't clean up the map. While I don't think this nuance
> >> matters much to existing older userspace apps, as the maps should
> >> already get cleaned by previous process in vhost_vdpa_cleanup(), but if
> >> bug-for-bug behavioral compatibility is what you want, module parameter
> >> will be the single best answer.
> >>
> >> Regards,
> >> -Siwei
> >>
> >>> But we can listen to others for sure.
> >>>
> >>> Thanks
> >>>
>

Attachment: log
Description: Binary data