Re: [RFC PATCH 0/7] Reviving the slab destructor to tackle the percpu allocator scalability problem
From: Christoph Lameter (Ampere)
Date: Thu Apr 24 2025 - 11:55:51 EST
On Thu, 24 Apr 2025, Harry Yoo wrote:
> Consider mm_struct: it allocates two percpu regions (mm_cid and rss_stat),
> so each allocate–free cycle requires two expensive acquire/release cycles on
> that mutex.
> We can mitigate this contention by retaining the percpu regions after
> the object is freed and releasing them only when the backing slab pages
> are freed.
Could you keep a cache of recently used per cpu regions so that you can
avoid frequent percpu allocation operations?
You could allocate larger percpu areas for a batch of them and
then assign as needed.
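
A rough userspace sketch of the batching idea, not kernel code and not
taken from the patch series. One slow-path allocation (standing in for
a trip through the percpu allocator and its mutex) yields a batch of
regions, which are then handed out from a free list. The names
region_cache, cache_refill, cache_get, cache_put, BATCH_SLOTS and
slow_path_calls are all made up for illustration, and malloc() merely
stands in for the real percpu allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define BATCH_SLOTS 8	/* hypothetical batch size */

/* Counts slow-path trips; models taking the global allocator mutex. */
static unsigned long slow_path_calls;

struct region_cache {
	void *free_list;	/* list threaded through the free slots */
	size_t slot_size;	/* bytes per region, pointer-aligned */
};

static void cache_init(struct region_cache *c, size_t size)
{
	size_t align = sizeof(void *);

	/* Round up so every slot can hold the free-list link. */
	c->slot_size = (size + align - 1) / align * align;
	if (c->slot_size < align)
		c->slot_size = align;
	c->free_list = NULL;
}

/* Slow path: one expensive allocation provides BATCH_SLOTS regions. */
static int cache_refill(struct region_cache *c)
{
	char *chunk = malloc(c->slot_size * BATCH_SLOTS);

	if (!chunk)
		return -1;
	slow_path_calls++;
	for (int i = 0; i < BATCH_SLOTS; i++) {
		void *slot = chunk + (size_t)i * c->slot_size;

		*(void **)slot = c->free_list;
		c->free_list = slot;
	}
	return 0;
}

/* Fast path: pop a cached region, refilling only when the cache is empty. */
static void *cache_get(struct region_cache *c)
{
	void *slot;

	if (!c->free_list && cache_refill(c) != 0)
		return NULL;
	slot = c->free_list;
	c->free_list = *(void **)slot;
	return slot;
}

/* Return a region to the cache; the chunk itself is never freed here. */
static void cache_put(struct region_cache *c, void *slot)
{
	assert(slot != NULL);
	*(void **)slot = c->free_list;
	c->free_list = slot;
}
```

With this model, a cache_get()/cache_put() pair on a warm cache never
touches the slow path, and a cold cache pays for it once per
BATCH_SLOTS regions. The sketch ignores locking of the cache itself and
never gives chunks back, both of which a real version would have to
handle.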