RE: [PATCH v3 1/2] gpio: mlxbf2: Introduce IRQ support

From: Asmaa Mnebhi
Date: Wed Sep 29 2021 - 15:14:55 EST



> Asmaa>> Thank you very much for the detailed and clear explanation!
> we only enable/support link up/down interrupts. QA has tested bringing
> up/down the network interface +200 times in a loop.

The micrel driver currently only uses two interrupts of the available 8.
So it will be hard to trigger the problem with the current driver. Your

best way to trigger it is going to bring the link down as soon as it goes up.

So you get first a link up, and then a link down very shortly afterwards.

There is however nothing stopping developers making use of the other interrupts.

That will then increase the likelihood of problems.

What does help you is that the interrupt register is clear on read. So the race condition
window is small.

Asmaa>> Hi Andrew,

I had a meeting today with the HW folks to explain the problem at stake.
The flow for this issue is like this:
1) PHY issues INT_N signal (active low level interrupt)
2) falling edge detected on the GPIO and transmitted to software
3) the first thing mlxbf2_gpio_irq_handler does is to clear the GPIO interrupt.
However even if we clear the GPIO interrupt, the GPIO value itself
will be low as long as the INT_N signal is low. The GPIO HW triggers
the interrupt by detecting the falling edge of the GPIO pin.
4) mlxbf2_gpio_irq_handler triggers phy_interrupt which
calls drv->handler_interrupt.
handle_interrupt in our case = kszphy_handle_interrupt, which reads
MII_KSZPHY_INTCS regs and hence clears all interrupts at once.

- if no other interrupt happens within this time frame, INT_N goes
back to 1 and the next interrupt will trigger another GPIO falling edge

- if the interrupt happens after the MDIO read, then it is not a problem. The
read would have already cleared the register and INT_N would go back to 1.
So the new interrupt will trigger a new GPIO falling edge interrupt.

Problem:
- however, if there is a second interrupt right before or during the MDIO read of
MII_KSZPHY_INTCS, it might not be detected by our GPIO HW.

Anyways, the HW folks agreed that this is a problem since indeed they do not
support LEVEL interrupts on the GPIOs at the moment.
They suggested to read the GPIO pin value to check if it has returned to high
in mlxbf2_gpio_irq_handler, then trigger the phy_interrupt handler.
But I don't think it is a good workaround because there could be a chain
of interrupts which hold the LEVEL low for a long time, and we don't want to
be waiting too long in an interrupt handler routine.
I would greatly appreciate some more feedback on what is the best way to deal
With this in the upstreamed version of the driver.
HW folks said they will fix this in future BlueField generations.


> The software interrupt and handler is not registered based on the GPIO
> interrupt but rather a HW interrupt which is common to all GPIO pins
> (irrelevant here, but this is edge triggered):
> ret = devm_request_irq(dev, irq, mlxbf2_gpio_irq_handler,
> IRQF_SHARED, name, gs);

IRQF_SHARED implied level. You cannot have a shared interrupt which is using edges.

Andrew