Re: [RFC PATCH] genirq: Exclude managed irq during irq migration

From: Chen Yu
Date: Thu Oct 26 2023 - 03:03:18 EST


Hi Thomas,

On 2023-10-25 at 16:34:59 +0200, Thomas Gleixner wrote:
> Chen!
>
> On Fri, Oct 20 2023 at 15:25, Chen Yu wrote:
> > The managed IRQ will be shutdown and not be migrated to
>
> Please write out interrupts in change logs, this is not twitter.
>
> > other CPUs during CPU offline. Later when the CPU is online,
> > the managed IRQ will be re-enabled on this CPU. The managed
> > IRQ can be used to reduce the IRQ migration during CPU hotplug.
> >
> > Before putting the CPU offline, the number of the already allocated
> > IRQs on this offlining CPU will be compared to the total number
>
> The usage of IRQs and vectors is slightly confusing all over the
> place.
>
> > of available IRQ vectors on the remaining online CPUs. If there is
> > not enough slot for these IRQs to be migrated to, the CPU offline
> > will be terminated. However, currently the code treats the managed
> > IRQ as migratable, which is not true, and brings false negative
> > during CPU hotplug and hibernation stress test.
>
> Your assumption that managed interrupts cannot be migrated is only
> correct when the managed interrupts affinity mask has exactly one online
> target CPU. Otherwise the interrupt is migrated to one of the other
> online CPUs in the affinity mask.
>
> Though that does not affect the migrateability calculation because in
> case that a managed interrupt has an affinity mask with more than one
> target CPU set, the vectors on the currently not targeted CPUs are
> already reserved and accounted for in matrix->global_available. IOW,
> migrateability for such managed interrupts is already guaranteed.
>

Got it, the per-CPU cm->managed has already been pre-reserved and is
excluded from m->global_available. So we should still subtract the
number of allocated managed interrupts to avoid counting them twice.
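
To make sure I understand the accounting, here is a minimal user-space
sketch of the check. It is not the kernel code: the toy_* structs and
helpers, and the field names allocated and managed_allocated, are
made up for illustration. Only global_available mirrors the field
mentioned above, and I assume it already excludes the vectors reserved
for managed interrupts.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-CPU map of the CPU going offline. */
struct toy_cpumap {
	unsigned int allocated;         /* all vectors in use on this CPU */
	unsigned int managed_allocated; /* subset used by managed interrupts */
};

/* Hypothetical stand-in for the global matrix state. */
struct toy_matrix {
	/*
	 * Free vectors on the remaining online CPUs. Assumption: vectors
	 * reserved for managed interrupts are already excluded here.
	 */
	unsigned int global_available;
};

/* Number of vectors that need a new home when this CPU goes offline. */
static unsigned int toy_vectors_to_move(const struct toy_cpumap *cm,
					bool exclude_managed)
{
	assert(cm->managed_allocated <= cm->allocated);

	if (exclude_managed)
		return cm->allocated - cm->managed_allocated;

	return cm->allocated;
}

/* The offline request is allowed only if every moved vector fits. */
static bool toy_can_unplug(const struct toy_matrix *m,
			   const struct toy_cpumap *cm,
			   bool exclude_managed)
{
	return toy_vectors_to_move(cm, exclude_managed) <= m->global_available;
}
```

With allocated = 5, managed_allocated = 4 and global_available = 3,
counting the managed interrupts rejects the offline request (5 > 3),
while excluding them allows it (1 <= 3), since their target vectors
were already reserved elsewhere.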

> I'll amend the changelog to make this clear.
>

Thanks for your help with this.

thanks,
Chenyu