Re: [PATCH] irqchip/sifive-plic: add irq_mask and irq_unmask

From: Palmer Dabbelt
Date: Mon Sep 16 2019 - 18:18:04 EST


On Mon, 16 Sep 2019 14:33:23 PDT (-0700), maz@xxxxxxxxxx wrote:
On Mon, 16 Sep 2019 13:51:58 -0700 (PDT)
Palmer Dabbelt <palmer@xxxxxxxxxx> wrote:

On Mon, 16 Sep 2019 12:04:56 PDT (-0700), Darius Rad wrote:
> On 9/15/19 2:20 PM, Marc Zyngier wrote:
>> On Sun, 15 Sep 2019 18:31:33 +0100,
>> Palmer Dabbelt <palmer@xxxxxxxxxx> wrote:
>>
>> Hi Palmer,
>>
>>>
>>> On Sun, 15 Sep 2019 07:24:20 PDT (-0700), maz@xxxxxxxxxx wrote:
>>>> On Thu, 12 Sep 2019 22:40:34 +0100,
>>>> Darius Rad <darius@xxxxxxxxxxxx> wrote:
>>>>
>>>> Hi Darius,
>>>>
>>>>>
>>>>> As per the existing comment, irq_mask and irq_unmask do not need
>>>>> to do anything for the PLIC. However, the functions must exist
>>>>> (the pointers cannot be NULL) as they are not optional, based on
>>>>> the documentation (Documentation/core-api/genericirq.rst) as well
>>>>> as existing usage (e.g., include/linux/irqchip/chained_irq.h).
>>>>>
>>>>> Signed-off-by: Darius Rad <darius@xxxxxxxxxxxx>
>>>>> ---
>>>>> drivers/irqchip/irq-sifive-plic.c | 13 +++++++++----
>>>>> 1 file changed, 9 insertions(+), 4 deletions(-)
>>>>>
>>>>> diff --git a/drivers/irqchip/irq-sifive-plic.c b/drivers/irqchip/irq-sifive-plic.c
>>>>> index cf755964f2f8..52d5169f924f 100644
>>>>> --- a/drivers/irqchip/irq-sifive-plic.c
>>>>> +++ b/drivers/irqchip/irq-sifive-plic.c
>>>>> @@ -111,6 +111,13 @@ static void plic_irq_disable(struct irq_data *d)
>>>>> plic_irq_toggle(cpu_possible_mask, d->hwirq, 0);
>>>>> }
>>>>> +/*
>>>>> + * There is no need to mask/unmask PLIC interrupts. They are "masked"
>>>>> + * by reading claim and "unmasked" when writing it back.
>>>>> + */
>>>>> +static void plic_irq_mask(struct irq_data *d) { }
>>>>> +static void plic_irq_unmask(struct irq_data *d) { }
>>>>
>>>> This outlines a bigger issue. If your irqchip doesn't require
>>>> mask/unmask, you're probably not using the right interrupt
>>>> flow. Looking at the code, I see you're using handle_simple_irq, which
>>>> is almost universally wrong.
>>>>
>>>> As per the description above, these interrupts should be using the
>>>> fasteoi flow, which is designed for this exact behaviour (the
>>>> interrupt controller knows which interrupt is in flight and doesn't
>>>> require SW to do anything bar signalling the EOI).
>>>>
>>>> Another thing is that mask/unmask tends to be a requirement, while
>>>> enable/disable tends to be optional. There is no hard line here, but
>>>> the expectations are that:
>>>>
>>>> (a) A disabled line can drop interrupts
>>>> (b) A masked line cannot drop interrupts
>>>>
>>>> Depending what the PLIC architecture mandates, you'll need to
>>>> implement one and/or the other. Having just (a) is indicative of a HW
>>>> bug, and I'm not assuming that this is the case. (b) only is pretty
>>>> common, and (a)+(b) has a few adepts. My bet is that it requires (b)
>>>> only.
>>>>
>>>>> +
>>>>> #ifdef CONFIG_SMP
>>>>> static int plic_set_affinity(struct irq_data *d,
>>>>> const struct cpumask *mask_val, bool force)
>>>>> @@ -138,12 +145,10 @@ static int plic_set_affinity(struct irq_data *d,
>>>>> static struct irq_chip plic_chip = {
>>>>> .name = "SiFive PLIC",
>>>>> - /*
>>>>> - * There is no need to mask/unmask PLIC interrupts. They are "masked"
>>>>> - * by reading claim and "unmasked" when writing it back.
>>>>> - */
>>>>> .irq_enable = plic_irq_enable,
>>>>> .irq_disable = plic_irq_disable,
>>>>> + .irq_mask = plic_irq_mask,
>>>>> + .irq_unmask = plic_irq_unmask,
>>>>> #ifdef CONFIG_SMP
>>>>> .irq_set_affinity = plic_set_affinity,
>>>>> #endif
>>>>
>>>> Can you give the following patch a go? It brings the irq flow in line
>>>> with what the HW can do. It is of course fully untested (not even
>>>> compile tested...).
>>>>
>>>> Thanks,
>>>>
>>>> M.
>>>>
>>>> From c0ce33a992ec18f5d3bac7f70de62b1ba2b42090 Mon Sep 17 00:00:00 2001
>>>> From: Marc Zyngier <maz@xxxxxxxxxx>
>>>> Date: Sun, 15 Sep 2019 15:17:45 +0100
>>>> Subject: [PATCH] irqchip/sifive-plic: Switch to fasteoi flow
>>>>
>>>> The SiFive PLIC interrupt controller seems to have all the HW
>>>> features to support the fasteoi flow, but the driver seems to be
>>>> stuck in a distant past. Bring it into the 21st century.
>>>
>>> Thanks. We'd gotten these comments during the review process but
>>> nobody had gotten the time to actually fix the issues.
>>
>> No worries. The IRQ subsystem is an acquired taste... ;-)
>>
>>>>
>>>> Signed-off-by: Marc Zyngier <maz@xxxxxxxxxx>
>>>> ---
>>>> drivers/irqchip/irq-sifive-plic.c | 29 +++++++++++++++--------------
>>>> 1 file changed, 15 insertions(+), 14 deletions(-)
>>>>
>>>> diff --git a/drivers/irqchip/irq-sifive-plic.c b/drivers/irqchip/irq-sifive-plic.c
>>>> index cf755964f2f8..8fea384d392b 100644
>>>> --- a/drivers/irqchip/irq-sifive-plic.c
>>>> +++ b/drivers/irqchip/irq-sifive-plic.c
>>>> @@ -97,7 +97,7 @@ static inline void plic_irq_toggle(const struct cpumask *mask,
>>>> }
>>>> }
>>>> -static void plic_irq_enable(struct irq_data *d)
>>>> +static void plic_irq_mask(struct irq_data *d)
>>
>> Of course, this is wrong. The perks of trying to do something at the
>> last minute while boarding an airplane. Don't do that.
>>
>> This should of course read "plic_irq_unmask"...
>>
>>>> {
>>>> unsigned int cpu = cpumask_any_and(irq_data_get_affinity_mask(d),
>>>> cpu_online_mask);
>>>> @@ -106,7 +106,7 @@ static void plic_irq_enable(struct irq_data *d)
>>>> plic_irq_toggle(cpumask_of(cpu), d->hwirq, 1);
>>>> }
>>>> -static void plic_irq_disable(struct irq_data *d)
>>>> +static void plic_irq_unmask(struct irq_data *d)
>>
>> ... and this should be "plic_irq_mask".
>>
>> [...]
>>
>>> Reviewed-by: Palmer Dabbelt <palmer@xxxxxxxxxx>
>>> Tested-by: Palmer Dabbelt <palmer@xxxxxxxxxx> (QEMU Boot)
>>
>> Huhuh... It may be that QEMU doesn't implement the full-fat PLIC, as
>> the above bug should have kept the IRQ lines masked.
>>
>>> We should test them on the hardware, but I don't have any with me
>>> right now. David's probably in the best spot to do this, as he's got
>>> a setup that does all the weird interrupt sources (ie, PCIe).
>>>
>>> David: do you mind testing this? I've put the patch here:
>>>
>>> ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/palmer/linux.git
>>> -b plic-fasteoi
>>
>> I've pushed out a branch with the fixed patch:
>>
>> git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git irq/plic-fasteoi
>>
>
> That patch works for me on real-ish hardware. I tried on two FPGA
> systems that have different PLIC implementations. Both include
> a PCIe root port (and associated interrupt source). So for
> whatever it's worth:
>
> Tested-by: Darius Rad <darius@xxxxxxxxxxxx>

Awesome, thanks. Would it be OK to put a "(on two hardware PLIC
implementations)" after that, just so we're clear as to who tested
what?

Sure, no problem.

Assuming one of yours wasn't a SiFive PLIC then it'd be great if
David could still give this a whack, but I don't think it strictly
needs to block merging the patch. I've included the right David this
time, with any luck that will be more fruitful :)

Well, we still have time before -rc1. Once David gets a chance to test
it, I'll apply it. Additional question: do you want this backported to
-stable? If so, how far?

Generally I've just been CCing stable on bug fixes and letting the automation handle it. 4.19 was the first version with a PLIC driver, and that was pretty broken so this should be deep down in the noise for anyone who's trying to use that -- and I doubt anyone is trying to do so.


Thanks,

M.