Re: [PATCH v3 02/19] mm: memcg: prepare for byte-sized vmstat items

From: Vlastimil Babka
Date: Wed May 20 2020 - 07:31:32 EST


On 4/22/20 10:46 PM, Roman Gushchin wrote:
> To implement per-object slab memory accounting, we need to
> convert slab vmstat counters to bytes. Actually, out of the
> 4 levels of counters (global, per-node, per-memcg and per-lruvec),
> only the last two levels will require byte-sized counters.
> It's because global and per-node counters will be counting the
> number of slab pages, and per-memcg and per-lruvec will be
> counting the amount of memory taken by charged slab objects.
>
> Converting all vmstat counters to bytes or even all slab
> counters to bytes would introduce an additional overhead.
> So instead let's store global and per-node counters
> in pages, and memcg and lruvec counters in bytes.
>
> To keep the API clean, all access helpers (on both the read
> and write sides) deal with bytes.
>
> To avoid back-and-forth conversions a new flavor of helpers
> is introduced, which always returns values in pages:
> node_page_state_pages() and global_node_page_state_pages().
>
> Actually, the new helpers just read raw values. The old helpers are
> simple wrappers, which perform a conversion if the vmstat items are
> in bytes. Because at the moment no one actually needs bytes,
> there are WARN_ON_ONCE() macros inside to warn about inappropriate
> use cases.
>
> Thanks to Johannes Weiner for the idea of having the byte-sized API
> on top of the page-sized internal storage.
>
> Signed-off-by: Roman Gushchin <guro@xxxxxx>

Reviewed-by: Vlastimil Babka <vbabka@xxxxxxx>

But it's somewhat complicated, so it would be great to document in comments,
e.g. in include/linux/vmstat.h, that the unsigned long the API returns can be
either bytes or pages, depending on vmstat_item_in_bytes().

> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -204,6 +204,11 @@ enum node_stat_item {
> NR_VM_NODE_STAT_ITEMS
> };
>
> +static __always_inline bool vmstat_item_in_bytes(enum node_stat_item item)

This should also have a comment explaining whether it's talking about the API
or the storage, as it's not immediately obvious.

> +{
> + return false;
> +}
> +
> /*
> * We do arithmetic on the LRU lists in various places in the code,
> * so it is important to keep the active lists LRU_ACTIVE higher in