Re: [PATCH] ARM: irq: Add IRQ_SET_MASK_OK_DONE handling in migrate_one_irq()

From: Marc Zyngier
Date: Wed Jan 09 2019 - 11:21:52 EST


On 09/01/2019 15:47, Dietmar Eggemann wrote:
> Hi Marc,
>
> On 1/8/19 3:16 PM, Marc Zyngier wrote:
>> Hi Dietmar,
>>
>> On 08/01/2019 13:58, Dietmar Eggemann wrote:
>>> Arm TC2 (multi_v7_defconfig plus CONFIG_ARM_BIG_LITTLE_CPUFREQ=y and
>>> CONFIG_ARM_VEXPRESS_SPC_CPUFREQ=y) fails hotplug stress tests.
>>>
>>> This issue was tracked down to a missing copy of the new affinity
>>> cpumask of the vexpress-spc interrupt into struct
>>> irq_common_data.affinity when the interrupt is migrated in
>>> migrate_one_irq().
>>>
>>> Commit 0407daceedfe ("irqchip/gic: Return IRQ_SET_MASK_OK_DONE in the
>>> set_affinity method") changed the return value of the irq_set_affinity()
>>> function of the GIC from IRQ_SET_MASK_OK to IRQ_SET_MASK_OK_DONE.
>>>
>>> In migrate_one_irq() if the current irq affinity mask and the cpu
>>> online mask do not share any CPU, the affinity mask is set to the cpu
>>> online mask. In this case (ret == true) and when the irq chip
>>> function irq_set_affinity() returns successfully (IRQ_SET_MASK_OK),
>>> struct irq_common_data.affinity should also be updated.
>>>
>>> Add IRQ_SET_MASK_OK_DONE next to IRQ_SET_MASK_OK when checking that the
>>> irq chip function irq_set_affinity() returns successfully.
>>>
>>> Commit 2cb625478f8c ("genirq: Add IRQ_SET_MASK_OK_DONE to support
>>> stacked irqchip") only added IRQ_SET_MASK_OK_DONE handling to
>>> irq_do_set_affinity() in the irq core and not to the Arm32 irq code.
>>>
>>> Signed-off-by: Dietmar Eggemann <dietmar.eggemann@xxxxxxx>
>>> ---
>>>
>>> The hotplug issue on Arm TC2 happens because the vexpress-spc interrupt
>>> (irq=22) is affine to CPU0. This occurs since it is set up early when the
>>> cpu_online_mask is still 0.
>>> But the problem with the missing copy of the affinity mask should occur
>>> with every interrupt which is forced to migrate.
>>>
>>> With additional debug in irq_setup_affinity():
>>>
>>> [0.000619] irq_setup_affinity(): irq=17 mask=0 cpu_online_mask=0 set=0-4
>>> [0.007065] irq_setup_affinity(): irq=22 mask=0 cpu_online_mask=0 set=0-4
>>> [3.372907] irq_setup_affinity(): irq=47 mask=0-4 cpu_online_mask=0-4 set=0-4
>>>
>>> cat /proc/interrupts
>>> CPU0 CPU1 CPU2 CPU3 CPU4
>>> 22: 316 0 0 0 0 GIC-0 127 Level vexpress-spc
>>>
>>> cat /proc/irq/22/smp_affinity_list
>>> 0
>
> [...]
>
>>
>> On the arm64 side, we've solved the exact same issue by getting rid of
>> this code and using the generic implementation. See 217d453d473c5
>> ("arm64: fix a migrating irq bug when hotplug cpu"), which uses
>> irq_migrate_all_off_this_cpu instead.
>>
>> I'm not sure there is much value in not using the core code in this case.
>
> Thanks for the hint! Much more elegant! I tried the following on TC2 and
> it worked just fine. I'm not aware of any drawbacks of using the generic
> irq migration for Arm32 as well.

[...]

Sounds great! Can you put it in a proper patch and resend it?

Thanks,

M.
--
Jazz is not dead. It just smells funny...