Re: [RFC PATCH] irqchip/gic-v3: wait irq done to set affinity

From: Yipeng Zou
Date: Mon Jan 09 2023 - 07:26:57 EST



On 2023/1/6 19:55, Marc Zyngier wrote:
On Fri, 06 Jan 2023 08:21:36 +0000,
Yipeng Zou <zouyipeng@xxxxxxxxxx> wrote:
Recently we hit a problem with GIC affinity setting in our tests.

This patch is only meant to start a discussion about the problem.

Currently, a GIC affinity change takes effect immediately, without
checking whether the irq is still being processed.

This leads to a problem. Consider this scenario:

1. First, an irq is generated by a device.

2. While this irq is being processed (after the event is handled, before
the IRQD_IRQ_INPROGRESS flag is cleared), we modify the route and the GIC
applies it immediately; at the same time, the device generates a new irq.
How is that possible?

If it is affected by GICD_IROUTERn (as your patch suggests), then it
is a SPI. If it is a SPI, it has an active state. Which means it
cannot fire again without a deactivation (EOI if EOImode=0, EOI+DIR if
EOImode=1) having taken place.

So either something has deactivated the interrupt without masking it
beforehand, or the active state is not honoured. Either way, this is
wrong.
Yes, agreed. This cannot happen in the SPI case.
3. The new irq is processed on another CPU, different from the
old one.

4. The new irq is discarded because the IRQD_IRQ_INPROGRESS flag
is still set.
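
The four steps above can be illustrated with a small user-space toy model. This is only a sketch, not kernel code: the struct and every toy_* function are hypothetical names, and the single in_progress field merely stands in for IRQD_IRQ_INPROGRESS. It assumes the entry check simply discards an event that arrives while the flag is set, as described in step 4.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, NOT kernel code: every name here is hypothetical. */
struct toy_irq_desc {
	bool in_progress;	/* stands in for IRQD_IRQ_INPROGRESS */
	int handled;		/* events that reached the handler */
	int dropped;		/* events discarded because the flag was set */
};

/* Entry path on any CPU: refuse to run if the irq is already in progress. */
static bool toy_irq_enter(struct toy_irq_desc *d)
{
	if (d->in_progress) {
		d->dropped++;
		return false;
	}
	d->in_progress = true;
	d->handled++;
	return true;
}

/* Exit path: the handler has finished, clear the flag. */
static void toy_irq_exit(struct toy_irq_desc *d)
{
	assert(d->in_progress);
	d->in_progress = false;
}

/* Replays steps 1-4 in order and returns the number of dropped events. */
static int toy_replay_scenario(void)
{
	struct toy_irq_desc d = { false, 0, 0 };

	/* Steps 1-2: the first event is being handled on the old CPU. */
	bool first = toy_irq_enter(&d);

	/* Steps 2-3: the route changes and a new event lands on another CPU. */
	bool second = toy_irq_enter(&d);

	/* Step 4: the second event was refused; only the first one exits. */
	if (first)
		toy_irq_exit(&d);
	(void)second;

	return d.dropped;
}
```

Calling toy_replay_scenario() reports one dropped event, because the second entry happens before the first handler has cleared the flag. The model ignores locking and real hardware behaviour; it only shows the ordering that loses the event.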

I noticed that if we set IRQF_ONESHOT when registering the irq, this problem
goes away.

But I'm also thinking about changing the gic_set_affinity function to wait
for the current irq to finish on all CPUs before gic_write_irouter.
I'm not sure if that's appropriate.
The base architecture should guarantee that this is not a problem,
thanks to the active state. If that was an LPI (which does not have an
active state), that'd be a different problem. But this doesn't seem to
be the case here.
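
The difference between an interrupt with an active state (SPI) and one without (LPI) can be sketched with a simplified user-space model. This is an illustration under stated assumptions, not a description of the GIC implementation: the struct and the toy_* functions are hypothetical, and the model reduces the architecture to two bits, pending and active.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified model, NOT the GIC: every name here is hypothetical. */
struct toy_intr {
	bool has_active_state;	/* true for an SPI-like irq, false for an LPI-like one */
	bool pending;
	bool active;
};

/* The device signals the interrupt. */
static void toy_signal(struct toy_intr *i)
{
	i->pending = true;
}

/* Try to deliver to a CPU; an active interrupt cannot be delivered again. */
static bool toy_try_deliver(struct toy_intr *i)
{
	if (!i->pending)
		return false;
	if (i->has_active_state && i->active)
		return false;	/* stays pending until deactivation */
	i->pending = false;
	if (i->has_active_state)
		i->active = true;
	return true;
}

/* Deactivation (EOI, or EOI+DIR, in the terms used above). */
static void toy_deactivate(struct toy_intr *i)
{
	i->active = false;
}

/* Signal twice with no deactivation in between; count the deliveries. */
static int toy_deliveries_without_deactivate(bool has_active_state)
{
	struct toy_intr i = { has_active_state, false, false };
	int n = 0;

	toy_signal(&i);
	n += toy_try_deliver(&i);
	toy_signal(&i);
	n += toy_try_deliver(&i);
	return n;
}
```

In this model, toy_deliveries_without_deactivate(true) yields a single delivery, while toy_deliveries_without_deactivate(false) yields two, which is the window in which a second LPI can reach another CPU while the first is still being handled.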

Hi, thanks very much for the reply.

I have rechecked our test. Actually, it was an LPI in our test case.

The problem is caused by the its_send_movi command.

I made a mistake when I modified the code. It should be as follows. Sorry for misleading you.


diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 973ede0197e3..fad08ccb7fd9 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -1667,6 +1667,9 @@ static int its_set_affinity(struct irq_data *d, const struct cpumask *mask_val,

        /* don't set the affinity when the target cpu is same as current one */
        if (cpu != prev_cpu) {
+
+               // wait irq done on all cpus
+
                target_col = &its_dev->its->collections[cpu];
                its_send_movi(its_dev, target_col, id);
                its_dev->event_map.col_map[id] = cpu;

I'm afraid to say that what you describe seems like a bug of some sort,
either HW or SW.

Thanks,

M.

--
Regards,
Yipeng Zou