Re: [PATCH v3 1/2] rust: irq: add support for request_irq()
From: Alice Ryhl
Date: Tue Jun 03 2025 - 04:28:56 EST
On Mon, Jun 02, 2025 at 07:31:00PM +0200, Danilo Krummrich wrote:
> On Mon, Jun 02, 2025 at 04:19:21PM +0000, Alice Ryhl wrote:
> > On Wed, May 14, 2025 at 11:53:21PM +0200, Danilo Krummrich wrote:
> > > On Wed, May 14, 2025 at 04:20:51PM -0300, Daniel Almeida wrote:
> > > > +/// // This is running in process context.
> > > > +/// fn register_irq(irq: u32, handler: Handler) -> Result<Arc<Registration<Handler>>> {
> > > > +/// let registration = Registration::register(irq, flags::SHARED, c_str!("my-device"), handler);
> > > > +///
> > > > +/// // You can have as many references to the registration as you want, so
> > > > +/// // multiple parts of the driver can access it.
> > > > +/// let registration = Arc::pin_init(registration, GFP_KERNEL)?;
> > >
> > > This makes it possible to arbitrarily extend the lifetime of an IRQ
> > > registration. However, we must guarantee that the IRQ is unregistered when the
> > > corresponding device is unbound. We can't allow drivers to hold on to device
> > > resources after the corresponding device has been unbound.
> > >
> > > Why does the data need to be part of the IRQ registration itself? Why can't we
> > > pass in an Arc<T> instance already when we register the IRQ?
> > >
> > > This way we'd never have a reason to ever access the Registration instance
> > > itself ever again and we can easily wrap it as Devres<irq::Registration> -
> > > analogously to devm_request_irq() on the C side - without any penalties.
> >
> > If we step away from the various Rust abstractions for a moment, then it
> > sounds like the request_irq API must follow these rules:
> >
> > 1. Ensure that free_irq is called before the device is unbound.
> > 2. Ensure that associated data remains valid until after free_irq is
> > called.
> >
> > We don't necessarily need to ensure that the Registration object itself
> > is dropped before the device is unbound - as long as free_irq is called
> > in time, it's okay.
> >
> > Now, if we use Devres, the way this is enforced is that during cleanup
> > of a device, we call free_irq *and* we destroy the associated data right
> > afterwards. By also destroying the associated data at that moment, it
> > becomes necessary to use rcu_read_lock() to access the associated data.
> > But if we just don't destroy the associated data during device cleanup,
> > then that requirement goes away.
> >
> > Based on this, we could imagine something along these lines:
> >
> > struct RegistrationInner(i32);
> > impl Drop for RegistrationInner {
> > fn drop(&mut self) {
> > free_irq(...);
> > }
> > }
> >
> > struct Registration<T> {
> > reg: Devres<RegistrationInner>,
> > data: T,
> > }
> >
> > Here you can access the `data` on the registration at any time without
> > synchronization.
>
> I was just about to reply to Daniel proposing exactly this alternative, it's
> equivalent with what I went with in the MiscDeviceRegistration patches for
> supporting the device driver use-case [1].
>
> So, I'm perfectly fine with this approach.
>
> [1] https://lore.kernel.org/lkml/20250530142447.166524-6-dakr@xxxxxxxxxx/
Ah, ok.
> >
> > Note that with this, I don't think the rcu-based devres is really the
> > right choice. It would make more sense to have a mutex along these
> > lines:
> >
> > // drop Registration
> > if devm_remove_callback() {
> > free_irq();
> > } else {
> > mutex_lock();
> > free_irq();
> > mutex_unlock();
> > }
> >
> > // devm callback
> > mutex_lock();
> > free_irq();
> > mutex_unlock();
> >
> > This avoids that really expensive call to synchronize_rcu() in the devm
> > callback.
>
> This would indeed be an optimization for the special case that we never care
> about actually accessing the Revocable, i.e. where we have no need to make the
> "read" have minimal overhead.
>
> However, I think we can do even better and, at the same time, avoid this special
> case optimization and have everything be the common case with what I already
> plan on implementing:
>
> I want that regardless of how many devres callbacks a driver registers by having
> Devres wrapped objects around, there's only a *single* synchronize_rcu() call for
> all of them.
>
> We can achieve this by having the devres callbacks on driver unbind in two
> stages, the first callback flips the Revocable's atomic for for all Devres
> objects, then there is a single synchronize_rcu() and then we drop_in_place()
> all the Revocable's data for all Devres objects.
>
> As mentioned above I plan on implementing this, but given that it's "just" a
> performance optimization for the driver unbind path and given that we're not
> very close to a production driver upstream, it haven't had the highest priority
> for me to implement so far.
That optimization sounds like something we definitely want, but I have
one question: is free_irq() safe to use in atomic context / inside
rcu_read_lock()? What about the threaded-irq variant? After all, don't
they sleep until existing callbacks finish running?
Alice