Re: [BUG] hard-to-hit mm_struct UAF due to insufficiently careful vma_refcount_put() wrt SLAB_TYPESAFE_BY_RCU

From: Jann Horn
Date: Thu Jul 24 2025 - 10:46:10 EST


On Thu, Jul 24, 2025 at 10:38 AM Vlastimil Babka <vbabka@xxxxxxx> wrote:
> On 7/24/25 04:30, Suren Baghdasaryan wrote:
> > So, I think vma_refcount_put() can mmgrab(vma->mm) before calling
> > __refcount_dec_and_test(), to stabilize that mm and then mmdrop()
> > after it calls rcuwait_wake_up(). What do you think about this
> > approach, folks?
>
> Yeah except it would be wasteful to do for all vma_refcount_put(). Should be
> enough to have this version (as Jann suggested) for inval_end_read: part of
> lock_vma_under_rcu. I think we need it also for the vma_refcount_put() done
> in vma_start_read() when we fail the seqcount check? I think in that case
> the same thing can be happening too, just with different race windows?
>
> Also as Jann suggested, maybe it's not great (or even safe) to perform
> __mmdrop() under rcu? And maybe some vma_start_read() users are even more
> restricted? Maybe then we'd need to make __mmdrop_delayed() not RT-only, and
> use that.

FWIW, I think I have been mixing things up in my head. mmdrop_async()
does exist, but the comment in free_signal_struct() explains that it
is there because __mmdrop() is not softirq-safe: x86's pgd_lock
spinlock does not disable IRQs.

/*
 * __mmdrop is not safe to call from softirq context on x86 due to
 * pgd_dtor so postpone it to the async context
 */

So I guess using mmdrop() here might actually be fine, since we're
just in atomic context, not in softirq.
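
To make the ordering concrete, here is a rough userspace model of the
"mmgrab() before the decrement, mmdrop() after the wakeup" idea. It is
only a sketch: struct mm_model, struct vma_model, mm_grab(), mm_drop()
and vma_refcount_put_pinned() are made-up stand-ins using C11 atomics,
not kernel code, and the wakeup condition is simplified to "the
refcount did not hit zero".

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins, not the kernel's definitions. */
struct mm_model {
	atomic_int mm_count;	/* models mm->mm_count (mmgrab/mmdrop) */
	int wakeups;		/* counts modelled rcuwait_wake_up() calls */
};

struct vma_model {
	atomic_int refcnt;	/* models vma->vm_refcnt */
	struct mm_model *mm;	/* models vma->vm_mm */
};

static int mm_freed;		/* number of mm_model objects freed so far */

static struct mm_model *mm_model_new(void)
{
	struct mm_model *mm = malloc(sizeof(*mm));

	assert(mm);
	atomic_init(&mm->mm_count, 1);	/* the owner's reference */
	mm->wakeups = 0;
	return mm;
}

static void vma_model_init(struct vma_model *vma, struct mm_model *mm, int refs)
{
	atomic_init(&vma->refcnt, refs);
	vma->mm = mm;
}

static void mm_grab(struct mm_model *mm)
{
	atomic_fetch_add(&mm->mm_count, 1);
}

static void mm_drop(struct mm_model *mm)
{
	/* fetch_sub returns the old value; 1 means we held the last ref */
	if (atomic_fetch_sub(&mm->mm_count, 1) == 1) {
		mm_freed++;
		free(mm);
	}
}

static void vma_refcount_put_pinned(struct vma_model *vma)
{
	/* Read the mm pointer while our vma reference is still held. */
	struct mm_model *mm = vma->mm;
	int old;

	mm_grab(mm);		/* pin the mm before giving up the vma */
	old = atomic_fetch_sub(&vma->refcnt, 1);
	assert(old > 0);	/* catch refcount underflow in the model */

	/* From here on the vma may be reused; only the pinned mm is touched. */
	if (old != 1)
		mm->wakeups++;	/* stand-in for the writer wakeup */

	mm_drop(mm);		/* may free the mm if the owner already let go */
}
```

A single-threaded run obviously can't reproduce the race; the point is
just that the wakeup only touches an mm on which this path still holds
its own reference, and that the final mm_drop() is where the mm may
actually get freed.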