Re: [PATCH] alloc_tag: add per-NUMA node stats

From: Kent Overstreet
Date: Tue Jun 10 2025 - 23:50:12 EST


On Tue, Jun 10, 2025 at 06:33:58PM -0700, Casey Chen wrote:
> On Tue, Jun 10, 2025 at 6:21 PM Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
> >
> > On Tue, 10 Jun 2025 17:30:53 -0600 Casey Chen <cachen@xxxxxxxxxxxxxxx> wrote:
> >
> > > Add support for tracking per-NUMA node statistics in /proc/allocinfo.
> > > Previously, each alloc_tag had a single set of counters (bytes and
> > > calls), aggregated across all CPUs. With this change, each CPU can
> > > maintain separate counters for each NUMA node, allowing finer-grained
> > > memory allocation profiling.
> > >
> > > This feature is controlled by the new
> > > CONFIG_MEM_ALLOC_PROFILING_PER_NUMA_STATS option:
> > >
> > > * When enabled (=y), the output includes per-node statistics following
> > > the total bytes/calls:
> > >
> > > <size> <calls> <tag info>
> > > ...
> > > 315456 9858 mm/dmapool.c:338 func:pool_alloc_page
> > >     nid0 94912 2966
> > >     nid1 220544 6892
> > > 7680 60 mm/dmapool.c:254 func:dma_pool_create
> > >     nid0 4224 33
> > >     nid1 3456 27
> > >
> > > * When disabled (=n), the output remains unchanged:
> > > <size> <calls> <tag info>
> > > ...
> > > 315456 9858 mm/dmapool.c:338 func:pool_alloc_page
> > > 7680 60 mm/dmapool.c:254 func:dma_pool_create
> > >
> > > To minimize memory overhead, per-NUMA stats counters are dynamically
> > > allocated using the percpu allocator. PERCPU_DYNAMIC_RESERVE has been
> > > increased to ensure sufficient space for in-kernel alloc_tag counters.
> > >
> > > For in-kernel alloc_tag instances, pcpu_alloc_noprof() is used to
> > > allocate counters. These allocations are excluded from the profiling
> > > statistics themselves.
> >
> > What is glaringly missing here is "why".
> >
> > What is the use case? Why does Linux want this? What benefit does
> > this bring to our users? This is the most important part of the
> > changelog because it tells Andrew why he is even looking at this patch.
> >
> >
> > Probably related to the above omission: why per-nid? It would be more
> > flexible to present the per-cpu counts and let userspace aggregate that
> > into per-node info if that is desirable.
> >
>
> Hi Andrew,
>
> Thanks for taking the time to review my patch. Sorry I didn't include you
> in the previous conversation. See
> https://lore.kernel.org/all/CAJuCfpHhSUhxer-6MP3503w6520YLfgBTGp7Q9Qm9kgN4TNsfw@xxxxxxxxxxxxxx/T/#u

It's good practice to add lore links to any and all previous discussion
to the commit message for the latest patch, like so:

Link: https://lore.kernel.org/all/CAJuCfpHhSUhxer-6MP3503w6520YLfgBTGp7Q9Qm9kgN4TNsfw@xxxxxxxxxxxxxx/T/#u

Make sure to give as much context as possible - and your commit
message should always include _rationale_ - none of us can keep up with
everything :)
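
For anyone who wants to play with the proposed output while the rationale
gets written up, here is a rough userspace sketch that parses the per-node
format quoted above and sums bytes/calls per node. It assumes the layout is
exactly as shown in the patch description (a tag line followed by "nidN
<bytes> <calls>" lines); the function names and the SAMPLE text are
hypothetical, and on a real system the input would come from
/proc/allocinfo instead.

```python
import re
from collections import defaultdict

# Assumed line shapes, taken from the example in the patch description.
NODE_RE = re.compile(r"^nid(\d+)\s+(\d+)\s+(\d+)$")   # nid<N> <bytes> <calls>
TAG_RE = re.compile(r"^(\d+)\s+(\d+)\s+(.+)$")        # <bytes> <calls> <tag info>

# Sample copied from the quoted changelog (hypothetical stand-in for the file).
SAMPLE = """\
<size> <calls> <tag info>
...
315456 9858 mm/dmapool.c:338 func:pool_alloc_page
    nid0 94912 2966
    nid1 220544 6892
7680 60 mm/dmapool.c:254 func:dma_pool_create
    nid0 4224 33
    nid1 3456 27
"""


def parse_allocinfo(text):
    """Return a list of tags, each with totals and a per-node dict."""
    tags = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        m = NODE_RE.match(line)
        if m and current is not None:
            nid, size, calls = map(int, m.groups())
            current["nodes"][nid] = (size, calls)
            continue
        m = TAG_RE.match(line)
        if m:
            current = {
                "bytes": int(m.group(1)),
                "calls": int(m.group(2)),
                "tag": m.group(3),
                "nodes": {},
            }
            tags.append(current)
        # Header and "..." lines match neither pattern and are skipped.
    return tags


def per_node_totals(tags):
    """Sum (bytes, calls) across all tags for each node id."""
    totals = defaultdict(lambda: [0, 0])
    for tag in tags:
        for nid, (size, calls) in tag["nodes"].items():
            totals[nid][0] += size
            totals[nid][1] += calls
    return {nid: tuple(v) for nid, v in totals.items()}


if __name__ == "__main__":
    for nid, (size, calls) in sorted(per_node_totals(parse_allocinfo(SAMPLE)).items()):
        print(f"nid{nid}: {size} bytes, {calls} calls")
```

With CONFIG_MEM_ALLOC_PROFILING_PER_NUMA_STATS=n there are no nidN lines,
so each tag simply ends up with an empty per-node dict.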