Re: [PATCH] leds: trigger: fix potential deadlock with libata

From: Marc Kleine-Budde
Date: Sat Mar 06 2021 - 15:40:49 EST


Hello *,

On 02.11.2020 11:41:52, Andrea Righi wrote:
> We have the following potential deadlock condition:
>
> ========================================================
> WARNING: possible irq lock inversion dependency detected
> 5.10.0-rc2+ #25 Not tainted
> --------------------------------------------------------
> swapper/3/0 just changed the state of lock:
> ffff8880063bd618 (&host->lock){-...}-{2:2}, at: ata_bmdma_interrupt+0x27/0x200
> but this lock took another, HARDIRQ-READ-unsafe lock in the past:
> (&trig->leddev_list_lock){.+.?}-{2:2}
>
> and interrupts could create inverse lock ordering between them.

[...]

> ---
> drivers/leds/led-triggers.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/leds/led-triggers.c b/drivers/leds/led-triggers.c
> index 91da90cfb11d..16d1a93a10a8 100644
> --- a/drivers/leds/led-triggers.c
> +++ b/drivers/leds/led-triggers.c
> @@ -378,14 +378,15 @@ void led_trigger_event(struct led_trigger *trig,
> enum led_brightness brightness)
> {
> struct led_classdev *led_cdev;
> + unsigned long flags;
>
> if (!trig)
> return;
>
> - read_lock(&trig->leddev_list_lock);
> + read_lock_irqsave(&trig->leddev_list_lock, flags);
> list_for_each_entry(led_cdev, &trig->led_cdevs, trig_list)
> led_set_brightness(led_cdev, brightness);
> - read_unlock(&trig->leddev_list_lock);
> + read_unlock_irqrestore(&trig->leddev_list_lock, flags);
> }
> EXPORT_SYMBOL_GPL(led_trigger_event);

meanwhile this patch hit v5.10.x stable and caused a performance
degradation on our use case:

It's an embedded ARM system, 4x Cortex A53, with an SPI attached CAN
controller. CAN stands for Controller Area Network and here used to
connect to some automotive equipment. Over CAN an ISOTP (a CAN-specific
Transport Protocol) transfer is running. With this patch, we see CAN
frames delayed for ~6ms, the usual gap between CAN frames is 240µs.

Reverting this patch, restores the old performance.

What is the best way to solve this dilemma? Identify the critical path
in our use case? Is there a way we can get around the irqsave in
led_trigger_event()?

regards,
Marc

--
Pengutronix e.K. | Marc Kleine-Budde |
Embedded Linux | https://www.pengutronix.de |
Vertretung West/Dortmund | Phone: +49-231-2826-924 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |

Attachment: signature.asc
Description: PGP signature