Re: [RFC PATCH 0/7] Reviving the slab destructor to tackle the percpu allocator scalability problem

From: Harry Yoo
Date: Fri Apr 25 2025 - 03:41:29 EST


On Thu, Apr 24, 2025 at 05:20:59PM +0200, Mateusz Guzik wrote:
> On Thu, Apr 24, 2025 at 1:28 PM Pedro Falcato <pfalcato@xxxxxxx> wrote:
> > > How to do this with slab constructors and destructors: the constructor
> > > allocates percpu memory, and the destructor frees it when the slab pages
> > > are reclaimed; this slightly alters the constructor’s semantics,
> > > as it can now fail.
> >
> > I really really really really don't like this. We're opening a pandora's box
> > of locking issues for slab deadlocks and other subtle issues. IMO the best
> > solution there would be, what, failing dtors? which says a lot about the whole
> > situation...
> >
>
> I noted the need to use leaf spin locks in my IRC conversations with
> Harry and later in this very thread, it is a bummer this bit did not
> make into the cover letter -- hopefully it would have avoided this
> exchange.

My bad. Yes, I should have included it in the cover letter.
Will include in the future series.

> I'm going to summarize this again here:
>
> By API contract the dtor can only take a leaf spinlock, in this case one which:
> 1. disables irqs

Alright, as interrupt handlers can also introduce lock dependency.

> 2. is the last lock in the dependency chain, as in no locks are taken
> while holding it

So if the destructor takes a lock, no users of the lock, in any
dependency chain, can hold any locks while holding it.

> That way there is no possibility of a deadlock.
>
> This poses a question on how to enforce it and this bit is easy: for
> example one can add leaf-spinlock notion to lockdep. Then a misuse on
> allocation side is going to complain immediately even without
> triggering reclaim. Further, if one would feel so inclined, a test
> module can walk the list of all slab caches and do a populate/reclaim
> cycle on those with the ctor/dtor pair.
>
> Then there is the matter of particular consumers being ready to do
> what they need to on the dtor side only with the spinlock. Does not
> sound like a fundamental problem.
>
> > Case in point:
> > What happens if you allocate a slab and start ->ctor()-ing objects, and then
> > one of the ctors fails? We need to free the ctor, but not without ->dtor()-ing
> > everything back (AIUI this is not handled in this series, yet).

It is handled in __free_slab().

>> Besides this
> > complication, if failing dtors were added into the mix, we'd be left with a
> > half-initialized slab(!!) in the middle of the cache waiting to get freed,
> > without being able to.
> >
>
> Per my previous paragraph failing dtors would be a self-induced problem.
>
> I can agree one has to roll things back if ctors don't work out, but I
> don't think this poses a significant problem.

Agreed. it'd better to write destructors carefully avoiding deadlocks
rather than allowing it to fail.

--
Cheers,
Harry / Hyeonggon