Re: [PATCH v9 0/10] iommu/vt-d: Fix intel vt-d faults in kdump kernel

From: Dave Young
Date: Thu May 07 2015 - 10:00:47 EST


On 04/07/15 at 10:12am, Don Dutile wrote:
> On 04/06/2015 11:46 PM, Dave Young wrote:
> >On 04/05/15 at 09:54am, Baoquan He wrote:
> >>On 04/03/15 at 05:21pm, Dave Young wrote:
> >>>On 04/03/15 at 05:01pm, Li, ZhenHua wrote:
> >>>>Hi Dave,
> >>>>
> >>>>There may be some possibilities that the old iommu data is corrupted by
> >>>>some other modules. Currently we do not have a better solution for the
> >>>>dmar faults.
> >>>>
> >>>>But I think when this happens, we need to fix the module that corrupted
> >>>>the old iommu data. I once met a similar problem in normal kernel, the
> >>>>queue used by the qi_* functions was written again by another module.
> >>>>The fix was in that module, not in iommu module.
> >>>
> >>>It is too late, there will be no chance to save vmcore then.
> >>>
> >>>Also if it is possible to continue corrupt other area of oldmem because
> >>>of using old iommu tables then it will cause more problems.
> >>>
> >>>So I think the tables at least need some verifycation before being used.
> >>>
> >>
> >>Yes, it's a good thinking anout this and verification is also an
> >>interesting idea. kexec/kdump do a sha256 calculation on loaded kernel
> >>and then verify this again when panic happens in purgatory. This checks
> >>whether any code stomps into region reserved for kexec/kernel and corrupt
> >>the loaded kernel.
> >>
> >>If this is decided to do it should be an enhancement to current
> >>patchset but not a approach change. Since this patchset is going very
> >>close to point as maintainers expected maybe this can be merged firstly,
> >>then think about enhancement. After all without this patchset vt-d often
> >>raised error message, hung.
> >
> >It does not convince me, we should do it right at the beginning instead of
> >introduce something wrong.
> >
> >I wonder why the old dma can not be remap to a specific page in kdump kernel
> >so that it will not corrupt more memory. But I may missed something, I will
> >looking for old threads and catch up.
> >
> >Thanks
> >Dave
> >
> The (only) issue is not corruption, but once the iommu is re-configured, the old,
> not-stopped-yet, dma engines will use iova's that will generate dmar faults, which
> will be enabled when the iommu is re-configured (even to a single/simple paging scheme)
> in the kexec kernel.
>

Don, so if iommu is not reconfigured then these faults will not happen?

Baoquan and me has a confusion below today about iommu=off/intel_iommu=off:

intel_iommu_init()
{
...

dmar_table_init();

disable active iommu translations;

if (no_iommu || dmar_disabled)
goto out_free_dmar;

...
}

Any reason not move no_iommu check to the begining of intel_iommu_init function?

Thanks
Dave
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/