Re: [RFC PATCH] iommu/vt-d: Fix IOMMU field not populated on device hot re-plug

From: Janusz Krzysztofik
Date: Mon Sep 02 2019 - 04:38:04 EST


Hi Baolu,

On Thursday, August 29, 2019 11:08:18 AM CEST Lu Baolu wrote:
> Hi,
>
> On 8/29/19 3:58 PM, Janusz Krzysztofik wrote:
> > Hi Baolu,
> >
> > On Thursday, August 29, 2019 3:43:31 AM CEST Lu Baolu wrote:
> >> Hi Janusz,
> >>
> >> On 8/28/19 10:17 PM, Janusz Krzysztofik wrote:
> >>>> We should avoid a kernel panic when intel_unmap() is called against
> >>>> a non-existent domain.
> >>> Does that mean you suggest to replace
> >>> BUG_ON(!domain);
> >>> with something like
> >>> if (WARN_ON(!domain))
> >>> return;
> >>> and to not care of orphaned mappings left allocated? Is there a way
> >>> to inform users that their active DMA mappings are no longer valid and
> >>> they shouldn't call dma_unmap_*()?
> >>>
> >>>> But we shouldn't expect the IOMMU driver to skip cleaning up the
> >>>> domain info when a device remove notification comes and to wait
> >>>> until all file descriptors are closed, right?
> >>> Shouldn't then the IOMMU driver take care of cleaning up resources still
> >>> allocated on device remove before it invalidates and forgets their
> >>> pointers?
> >>>
> >>
> >> You are right. We need to wait until all allocated resources (iova and
> >> mappings) are released.
> >>
> >> How about registering a callback for BUS_NOTIFY_UNBOUND_DRIVER, and
> >> removing the domain info when the driver detachment completes?
> >
> > Device core calls BUS_NOTIFY_UNBOUND_DRIVER on each driver unbind,
> > regardless of a device being removed or not. As long as the device is not
> > unplugged and the BUS_NOTIFY_REMOVED_DEVICE notification not generated, an
> > unbound driver is not a problem here.
> > Moreover, BUS_NOTIFY_UNBOUND_DRIVER is called even before
> > BUS_NOTIFY_REMOVED_DEVICE so that wouldn't help anyway.
> > Last but not least, bus events are independent of the IOMMU driver use via
> > DMA-API it exposes.
>
> Fair enough.
>
> >
> > If keeping data for unplugged devices and reusing it on device re-plug is
> > not acceptable then maybe the IOMMU driver should perform reference
> > counting of its internal resources occupied by DMA-API users and perform
> > cleanups on last release?
>
> I am not saying that keeping data is not acceptable. I just want to
> check whether there are any other solutions.

Then reverting 458b7c8e0dde and applying this patch still resolves the issue
for me. No errors appear when mappings are unmapped on device close after the
device has been removed, and domain info preserved on device removal is
successfully reused on device re-plug.
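
For comparison, the reference-counting alternative mentioned above could be
sketched roughly as follows. This is plain userspace C with made-up names
(struct dev_domain_info, info_map(), info_unmap(), info_device_removed()), not
the intel-iommu code, and it only illustrates the lifetime rule: the device
holds one reference, each live DMA mapping holds one more, and the data is
freed on the last release. In the kernel this would presumably be built on
kref_get()/kref_put() rather than an open-coded counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for per-device domain info (illustration only). */
struct dev_domain_info {
	unsigned int refs;	/* 1 for the device + 1 per live mapping */
	bool removed;		/* set once the device has been unplugged */
	int *freed_flag;	/* demo hook: set to 1 when the info is freed */
};

/* Allocate the info with the device's own initial reference. */
static struct dev_domain_info *info_alloc(int *freed_flag)
{
	struct dev_domain_info *info = calloc(1, sizeof(*info));

	if (!info)
		return NULL;
	info->refs = 1;
	info->freed_flag = freed_flag;
	return info;
}

/* Drop one reference; free the info when the last one goes away. */
static void info_put(struct dev_domain_info *info)
{
	assert(info->refs > 0);
	if (--info->refs == 0) {
		if (info->freed_flag)
			*info->freed_flag = 1;
		free(info);
	}
}

/* A DMA-API user creates a mapping: pin the info. */
static void info_map(struct dev_domain_info *info)
{
	info->refs++;
}

/* A DMA-API user releases a mapping: unpin, possibly freeing the info. */
static void info_unmap(struct dev_domain_info *info)
{
	info_put(info);
}

/* Device removal only drops the device's reference; mappings keep it alive. */
static void info_device_removed(struct dev_domain_info *info)
{
	info->removed = true;
	info_put(info);
}
```

With this rule, an unmap that arrives after the device was removed still finds
valid data, and the cleanup happens on that last unmap instead of at removal
time.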

Is there anything else I can do to help?

Thanks,
Janusz

>
> Best regards,
> Baolu
>