Re: [Bug 202453] TRACE irq/18-i801_smb Tainted when enabled threadirqs in kernel commandline.

From: Oleksandr Natalenko
Date: Fri Dec 04 2020 - 15:20:15 EST


On Thu, Dec 03, 2020 at 07:04:00PM +0000, bugzilla-daemon@xxxxxxxxxxxxxxxxxxx wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=202453
>
> --- Comment #7 from Thomas Gleixner (tglx@xxxxxxxxxxxxx) ---
> On Tue, Aug 06 2019 at 14:07, bugzilla-daemon wrote:
> > Jean Delvare (jdelvare@xxxxxxx) changed:
> >
> > Is this happening with vanilla kernels or gentoo kernels?
> >
> > Thomas, I'm not sure if this is a bug in the i2c-i801 driver or something
> > more
> > general in how we handle the interrupts under threadirqs. Any suggestion how
> > to
> > investigate this further?
>
> Bah. What happens is that the i2c-i801 driver interrupt handler does:
>
> i801_isr()
>
> ...
> return i801_host_notify_isr(priv);
>
> which invokes:
>
> i2c_handle_smbus_host_notify()
>
> which in turn invokes
>
> generic_handle_irq()
>
> and that explodes with forced interrupt threading because it's called
> with interrupts enabled.
>
> The root of all evil is the usage of generic_handle_irq() under the
> assumption that this is always called from hard interrupt context. That
> assumption is not true since 8d32a307e4fa ("genirq: Provide forced
> interrupt threading") which went into 2.6.39 almost 10 years ago.
>
> Seems you got lucky that since 10 years someone no user of this uses a
> threaded interrupt handler, like some of the i2c drivers actually do. :)
>
> So there are a couple of options to fix this:
>
> 1) Set the IRQF_NO_THREAD flag and replace the waitqueue as that
> won't work on RT.
>
> Looking at the usage it's a single waiter wakeup and a single
> waiter at a time because the xfer is fully serialized by the
> core. So we can switch it to rcuwait, if there would be
> rcu_wait_event_timeout(), but that's fixable.
>
> 2) Have a wrapper around handle_generic_irq() which ensures that
> interrupts are disabled before invoking it.
>
> 3) Make the interrupt which is dispatched there to be requested with
> [devm_]request_any_context_irq(). That also requires that the
> NESTED_THREAD flag is set on the irq descriptor.
>
> That's exactly made for the use case where the dispatching
> interrupt can be either threaded or in hard interrupt context.
>
> But that's lots of churn.
>
> And because we have so many options, here is the question:
>
> Is i2c_handle_smbus_host_notify() guaranteed to be called from hard
> interrupt handlers (assumed that we use #1 above)?
>
> I can't tell because there is also i2c_slave_host_notify_cb() which
> invokes it and my i2c foo is not good enough to figure that out.
>
> If that's the case the #1 would be the straight forward solution. If
> not, then you want #2 because then the problem will just pop up via the
> slave thing and that might be not as trivial to fix as this one.
>
> If you can answer that, I can send you the proper patch :)

tglx suggested moving this to the appropriate mailing lists, so I'mm
Cc'ing those.

Jean, Wolfram, Benjamin, or someone else, could you please check Thomas'
questions above and let us know what you think?

I'll copy-paste my attempt to answer this in bugzilla below:

```
As far as I can grep through bus drivers, yes, it is called from hard
interrupt handlers only.

i2c_handle_smbus_host_notify() is indeed called from
i2c_slave_host_notify_cb(), which, in its turn, is set to be called as
->slave_cb() via i2c_slave_event() wrapper only.

Also, check [1], slide #9. I'm not sure about that "usually" word
though since I couldn't find any examples of "unusual" usage.

/* not an i2c guru here either, just looking around the code */

[1] https://elinux.org/images/f/f6/ELCE15-WolframSang-ShinyNewI2CSlaveFramework.pdf
```

and also tglx' follow-up question:

```
The question is whether it's guaranteed under all circumstances
including forced irq threading. The i801 driver has assumptions about
this, so I wouldn't be surprised if there are more.
```

Thanks.

--
Oleksandr Natalenko (post-factum)