RE: [PATCH v2] iommu/vt-d: Allocate DMAR fault interrupts locally

From: Zhang, Tina
Date: Fri Mar 22 2024 - 00:41:30 EST


Hi Dimitri,


> -----Original Message-----
> From: Dimitri Sivanich <sivanich@xxxxxxx>
> Sent: Friday, March 22, 2024 4:51 AM
> To: Thomas Gleixner <tglx@xxxxxxxxxxxxx>; Joerg Roedel <joro@xxxxxxxxxx>;
> Suravee Suthikulpanit <suravee.suthikulpanit@xxxxxxx>; Will Deacon
> <will@xxxxxxxxxx>; Robin Murphy <robin.murphy@xxxxxxx>; David
> Woodhouse <dwmw2@xxxxxxxxxxxxx>; Lu Baolu <baolu.lu@xxxxxxxxxxxxxxx>;
> Mark Rutland <mark.rutland@xxxxxxx>; Peter Zijlstra
> <peterz@xxxxxxxxxxxxx>; Arnd Bergmann <arnd@xxxxxxxx>; YueHaibing
> <yuehaibing@xxxxxxxxxx>; iommu@xxxxxxxxxxxxxxx; Dimitri Sivanich
> <sivanich@xxxxxxx>
> Cc: linux-kernel@xxxxxxxxxxxxxxx; Steve Wahl <steve.wahl@xxxxxxx>;
> Anderson, Russ <russ.anderson@xxxxxxx>
> Subject: [PATCH v2] iommu/vt-d: Allocate DMAR fault interrupts locally
>
> The Intel IOMMU code currently tries to allocate all DMAR fault interrupt
> vectors on the boot cpu. On large systems with high DMAR counts this
> results in vector exhaustion, and most of the vectors are not initially allocated
> socket local.
>
> Instead, have a cpu on each node do the vector allocation for the DMARs on
> that node. The boot cpu still does the allocation for its node during its boot
> sequence.
>
> Signed-off-by: Dimitri Sivanich <sivanich@xxxxxxx>
> ---
>
> v2: per Thomas Gleixner, implement this from a DYN CPU hotplug state, though
>     this implementation runs in CPUHP_AP_ONLINE_DYN space rather than
>     CPUHP_BP_PREPARE_DYN space.
>
> drivers/iommu/amd/amd_iommu.h | 2 +-
> drivers/iommu/amd/init.c | 2 +-
> drivers/iommu/intel/dmar.c | 9 +++++++--
> drivers/iommu/irq_remapping.c | 5 ++++-
> drivers/iommu/irq_remapping.h | 2 +-
> include/linux/dmar.h | 2 +-
> 6 files changed, 15 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/iommu/amd/amd_iommu.h b/drivers/iommu/amd/amd_iommu.h
> index f482aab420f7..410c360e7e24 100644
> --- a/drivers/iommu/amd/amd_iommu.h
> +++ b/drivers/iommu/amd/amd_iommu.h
> @@ -33,7 +33,7 @@ int amd_iommu_prepare(void);
> int amd_iommu_enable(void);
> void amd_iommu_disable(void);
> int amd_iommu_reenable(int mode);
> -int amd_iommu_enable_faulting(void);
> +int amd_iommu_enable_faulting(unsigned int cpu);
> extern int amd_iommu_guest_ir;
> extern enum io_pgtable_fmt amd_iommu_pgtable;
> extern int amd_iommu_gpt_level;
> diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
> index e7a44929f0da..4782f690ed97 100644
> --- a/drivers/iommu/amd/init.c
> +++ b/drivers/iommu/amd/init.c
> @@ -3389,7 +3389,7 @@ int amd_iommu_reenable(int mode)
> return 0;
> }
>
> -int __init amd_iommu_enable_faulting(void)
> +int __init amd_iommu_enable_faulting(unsigned int cpu)
> {
> /* We enable MSI later when PCI is initialized */
> return 0;
> diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c
> index 36d7427b1202..7644a42f283c 100644
> --- a/drivers/iommu/intel/dmar.c
> +++ b/drivers/iommu/intel/dmar.c
> @@ -2122,7 +2122,7 @@ int dmar_set_interrupt(struct intel_iommu *iommu)
> return ret;
> }
>
> -int __init enable_drhd_fault_handling(void)
> +int enable_drhd_fault_handling(unsigned int cpu)
> {
> struct dmar_drhd_unit *drhd;
> struct intel_iommu *iommu;
> @@ -2132,7 +2132,12 @@ int __init enable_drhd_fault_handling(void)
> */
> for_each_iommu(iommu, drhd) {
> u32 fault_status;
> - int ret = dmar_set_interrupt(iommu);
> + int ret;
> +
> + if (iommu->irq || iommu->node != cpu_to_node(cpu))
> + continue;
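
If iommu->irq is already set, the current logic still clears any previous faults: dmar_set_interrupt() returns 0 early, and enable_drhd_fault_handling() then goes on to read DMAR_FSTS_REG and write the value back. With this change, the new "continue" skips that step for such IOMMUs, so the fault clearing seems to be missing.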

The current logic:
int dmar_set_interrupt(struct intel_iommu *iommu)
{
	int irq, ret;

	/*
	 * Check if the fault interrupt is already initialized.
	 */
	if (iommu->irq)
		return 0;
...

int __init enable_drhd_fault_handling(void)
{
...
	for_each_iommu(iommu, drhd) {
		u32 fault_status;
		int ret = dmar_set_interrupt(iommu);

		if (ret) {
			pr_err("DRHD %Lx: failed to enable fault, interrupt, ret %d\n",
			       (unsigned long long)drhd->reg_base_addr, ret);
			return -1;
		}

		/*
		 * Clear any previous faults.
		 */
		dmar_fault(iommu->irq, iommu);
		fault_status = readl(iommu->reg + DMAR_FSTS_REG);
		writel(fault_status, iommu->reg + DMAR_FSTS_REG);
	}
...
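
To illustrate the behaviour I have in mind, here is a small userspace model of the loop (not kernel code, and not a proposed patch). It skips only the IOMMUs that belong to another node and leaves the "irq already set" case to the set-interrupt helper, so the fault clearing still runs for those units. All names (struct model_iommu, model_set_interrupt(), model_clear_faults(), model_enable_fault_handling()) are hypothetical. The status field stands in for DMAR_FSTS_REG and is assumed to clear whichever bits are written back to it. The dmar_fault() call is left out of the model.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct intel_iommu: only the fields the loop uses. */
struct model_iommu {
	int irq;               /* 0 means no fault interrupt allocated yet */
	int node;              /* NUMA node the unit belongs to */
	uint32_t fault_status; /* models DMAR_FSTS_REG (assumed write-1-to-clear) */
};

/* Models dmar_set_interrupt(): returns 0 early if the irq is already set. */
static int model_set_interrupt(struct model_iommu *iommu, int next_irq)
{
	if (iommu->irq)
		return 0;
	iommu->irq = next_irq;
	return 0;
}

/* Models the readl()/writel() pair: writing the status back clears the set bits. */
static void model_clear_faults(struct model_iommu *iommu)
{
	uint32_t status = iommu->fault_status;

	iommu->fault_status &= ~status;
}

/*
 * Models the per-node loop: units on other nodes are skipped, but a unit on
 * this node whose irq is already set still has its previous faults cleared.
 */
static int model_enable_fault_handling(struct model_iommu *units, size_t n,
				       int cpu_node)
{
	assert(units != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		struct model_iommu *iommu = &units[i];
		int ret;

		if (iommu->node != cpu_node)
			continue;

		ret = model_set_interrupt(iommu, 100 + (int)i);
		if (ret)
			return -1;

		model_clear_faults(iommu);
	}
	return 0;
}
```

In this model, a unit on the local node that already has an irq keeps that irq and still gets its status cleared, while a unit on a remote node is left untouched for that node's CPU to handle later.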

Regards,

-Tina