Re: [PATCH] mm: make allocation counters per-order

From: Mel Gorman
Date: Thu Jul 06 2017 - 09:19:47 EST


On Thu, Jul 06, 2017 at 02:04:31PM +0100, Roman Gushchin wrote:
> High-order allocations are obviously more costly, and it's very useful
> to know how many of them happen, if there are any issues
> (or suspicions) with memory fragmentation.
>
> This commit changes existing per-zone allocation counters to be
> per-zone per-order. These counters are displayed using a new
> procfs interface (similar to /proc/buddyinfo):
>
> $ cat /proc/allocinfo
> DMA 0 0 0 0 0 \
> 0 0 0 0 0 0
> DMA32 3 0 1 0 0 \
> 0 0 0 0 0 0
> Normal 4997056 23594 10902 23686 931 \
> 23 122 786 17 1 0
> Movable 0 0 0 0 0 \
> 0 0 0 0 0 0
> Device 0 0 0 0 0 \
> 0 0 0 0 0 0
>
> The existing vmstat interface remains untouched*, and still shows
> the total number of single page allocations, so high-order allocations
> are represented as a corresponding number of order-0 allocations.
>
> $ cat /proc/vmstat | grep alloc
> pgalloc_dma 0
> pgalloc_dma32 7
> pgalloc_normal 5461660
> pgalloc_movable 0
> pgalloc_device 0
>
> * I've added device zone for consistency with other zones,
> and to avoid messy exclusion of this zone in the code.
>

The alloc counter updates are themselves a surprisingly heavy cost in
the allocation path, and this patch makes that cost worse for the sake of
a debugging case that is relatively rare. I'm extremely reluctant to see
such a patch merged, given that the existing tracepoints can be used to
assemble such a monitor, even if it means running a userspace daemon to
keep track of the counts. Would such a solution be suitable? Failing that,
if this is a severe issue, would it be possible to at least make this a
compile-time or static tracepoint option? That way, only people who
really need it have to pay the penalty.
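
To illustrate the tracepoint approach, here is a rough sketch of the
counting half of such a userspace monitor. It assumes the
kmem:mm_page_alloc event is enabled and that each trace line carries an
"order=N" field in the usual ftrace text format; the function names are
made up for this example, and the input is taken to be a finite set of
lines (for instance a saved trace), not an endless pipe.

```python
import re
from collections import Counter

# Assumed ftrace text format: "... mm_page_alloc: page=... pfn=... order=N ..."
# The trailing ':' keeps related events such as mm_page_alloc_zone_locked out.
ORDER_RE = re.compile(r"\bmm_page_alloc:.*?\border=(\d+)")


def count_orders(lines):
    """Tally mm_page_alloc events per allocation order (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = ORDER_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


def pages_allocated(counts):
    """Express the per-order counts as a number of single pages,
    the way the pgalloc_* vmstat counters do: count * 2**order."""
    return sum(n << order for order, n in counts.items())
```

With three order-0 events and one order-2 event, as in the DMA32 row
quoted above, pages_allocated() gives 3 + 1*4 = 7, which matches
pgalloc_dma32. A real daemon would read the trace pipe continuously and
report periodically; this sketch leaves that part out.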

--
Mel Gorman
SUSE Labs