Re: [PATCH] vhost/vdpa: Add MSI translation tables to iommu for software-managed MSI

From: Jason Wang
Date: Mon Jan 30 2023 - 22:07:22 EST


On Tue, Jan 31, 2023 at 9:32 AM Nanyong Sun <sunnanyong@xxxxxxxxxx> wrote:
>
> On 2023/1/29 14:02, Jason Wang wrote:
> > On Sat, Jan 28, 2023 at 10:25 AM Nanyong Sun <sunnanyong@xxxxxxxxxx> wrote:
> >> From: Rong Wang <wangrong68@xxxxxxxxxx>
> >>
> >> Once an IOMMU domain is enabled for a device, the MSI
> >> translation tables must be present for software-managed MSI.
> >> Otherwise, a platform that uses software-managed MSI without an
> >> irq bypass function cannot get a correct memory write event
> >> from PCIe and will not receive irqs.
> >> The solution is to obtain the MSI physical base address from
> >> the IOMMU reserved region and set it in the IOMMU MSI cookie;
> >> the translation tables are then created when the irq is requested.
> >>
> >> Signed-off-by: Rong Wang <wangrong68@xxxxxxxxxx>
> >> Signed-off-by: Nanyong Sun <sunnanyong@xxxxxxxxxx>
> >> ---
> >> drivers/iommu/iommu.c | 1 +
> >> drivers/vhost/vdpa.c | 53 ++++++++++++++++++++++++++++++++++++++++---
> >> 2 files changed, 51 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> >> index de91dd88705b..f6c65d5d8e2b 100644
> >> --- a/drivers/iommu/iommu.c
> >> +++ b/drivers/iommu/iommu.c
> >> @@ -2623,6 +2623,7 @@ void iommu_get_resv_regions(struct device *dev, struct list_head *list)
> >>  	if (ops->get_resv_regions)
> >>  		ops->get_resv_regions(dev, list);
> >>  }
> >> +EXPORT_SYMBOL_GPL(iommu_get_resv_regions);
> >>
> >> /**
> >> * iommu_put_resv_regions - release resered regions
> >> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> >> index ec32f785dfde..31d3e9ed4cfa 100644
> >> --- a/drivers/vhost/vdpa.c
> >> +++ b/drivers/vhost/vdpa.c
> >> @@ -1103,6 +1103,48 @@ static ssize_t vhost_vdpa_chr_write_iter(struct kiocb *iocb,
> >>  	return vhost_chr_write_iter(dev, from);
> >> }
> >>
> >> +static bool vhost_vdpa_check_sw_msi(struct list_head *dev_resv_regions, phys_addr_t *base)
> >> +{
> >> +	struct iommu_resv_region *region;
> >> +	bool ret = false;
> >> +
> >> +	list_for_each_entry(region, dev_resv_regions, list) {
> >> +		/*
> >> +		 * The presence of any 'real' MSI regions should take
> >> +		 * precedence over the software-managed one if the
> >> +		 * IOMMU driver happens to advertise both types.
> >> +		 */
> >> +		if (region->type == IOMMU_RESV_MSI) {
> >> +			ret = false;
> >> +			break;
> >> +		}
> >> +
> >> +		if (region->type == IOMMU_RESV_SW_MSI) {
> >> +			*base = region->start;
> >> +			ret = true;
> >> +		}
> >> +	}
> >> +
> >> +	return ret;
> >> +}
> > Can we unify this with what VFIO had?
> Yes, these two functions are identical.
> Do you think moving this function to iommu.c and exporting it from
> there is a good choice?

Probably, we can try and see.
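
As a rough illustration of what a shared helper would have to do, here
is a minimal userspace sketch of the scan. The names resv_region,
RESV_* and resv_find_sw_msi are hypothetical stand-ins, not existing
kernel symbols, and a plain array replaces the kernel list:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel types; all names are hypothetical. */
enum resv_type { RESV_DIRECT, RESV_MSI, RESV_SW_MSI };

struct resv_region {
	uint64_t start;
	uint64_t length;
	enum resv_type type;
};

/*
 * Return true and store the window start in *base when a software-managed
 * MSI region should be used. A hardware MSI region anywhere in the array
 * takes precedence and makes the function return false. As in the patch,
 * *base may already have been written when false is returned.
 */
static bool resv_find_sw_msi(const struct resv_region *regions, size_t n,
			     uint64_t *base)
{
	bool found = false;
	size_t i;

	for (i = 0; i < n; i++) {
		if (regions[i].type == RESV_MSI)
			return false;

		if (regions[i].type == RESV_SW_MSI) {
			*base = regions[i].start;
			found = true;
		}
	}

	return found;
}
```

Both VFIO and vhost-vdpa could then call one such function instead of
carrying private copies.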

> >
> >> +
> >> +static int vhost_vdpa_get_msi_cookie(struct iommu_domain *domain, struct device *dma_dev)
> >> +{
> >> +	struct list_head dev_resv_regions;
> >> +	phys_addr_t resv_msi_base = 0;
> >> +	int ret = 0;
> >> +
> >> +	INIT_LIST_HEAD(&dev_resv_regions);
> >> +	iommu_get_resv_regions(dma_dev, &dev_resv_regions);
> >> +
> >> +	if (vhost_vdpa_check_sw_msi(&dev_resv_regions, &resv_msi_base))
> >> +		ret = iommu_get_msi_cookie(domain, resv_msi_base);
> >> +
> >> +	iommu_put_resv_regions(dma_dev, &dev_resv_regions);
> >> +
> >> +	return ret;
> >> +}
> >> +
> >> static int vhost_vdpa_alloc_domain(struct vhost_vdpa *v)
> >> {
> >>  	struct vdpa_device *vdpa = v->vdpa;
> >> @@ -1128,11 +1170,16 @@ static int vhost_vdpa_alloc_domain(struct vhost_vdpa *v)
> >>
> >>  	ret = iommu_attach_device(v->domain, dma_dev);
> >>  	if (ret)
> >> -		goto err_attach;
> >> +		goto err_alloc_domain;
> >>
> >> -	return 0;
> >> +	ret = vhost_vdpa_get_msi_cookie(v->domain, dma_dev);
> > Do we need to check for overlapping mappings and record the region
> > in the interval tree (as VFIO does)?
> >
> > Thanks
> Yes, we need to take care of this part; I will handle it soon.
> Thanks a lot.

I think we probably also need this for parents that require
vendor-specific mapping logic.

But this could be added on top (via a new config ops probably).
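
For the overlap check itself, a minimal sketch, assuming inclusive
[start, last] ranges. struct iova_range and iova_ranges_overlap are
hypothetical names used only for illustration:

```c
#include <stdbool.h>
#include <stdint.h>

/* Inclusive IOVA range [start, last]; a hypothetical illustration type. */
struct iova_range {
	uint64_t start;
	uint64_t last;
};

/*
 * Two inclusive ranges overlap when each one starts at or before the
 * point where the other one ends.
 */
static bool iova_ranges_overlap(struct iova_range a, struct iova_range b)
{
	return a.start <= b.last && b.start <= a.last;
}
```

A userspace mapping request whose range overlaps the reserved MSI
window would be rejected before it is recorded.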

Thanks

> >> +	if (ret)
> >> +		goto err_attach_device;
> >>
> >> -err_attach:
> >> +	return 0;
> >> +err_attach_device:
> >> +	iommu_detach_device(v->domain, dma_dev);
> >> +err_alloc_domain:
> >>  	iommu_domain_free(v->domain);
> >>  	return ret;
> >> }
> >> --
> >> 2.25.1
> >>
> > .
>