Re: [PATCH v2] mm/page_alloc: count CMA pages per zone and print them in /proc/zoneinfo

From: David Rientjes
Date: Thu Jan 28 2021 - 17:29:24 EST


On Thu, 28 Jan 2021, David Hildenbrand wrote:

> > On Thu, 28 Jan 2021, David Hildenbrand wrote:
> >
> >> diff --git a/mm/vmstat.c b/mm/vmstat.c
> >> index 7758486097f9..957680db41fa 100644
> >> --- a/mm/vmstat.c
> >> +++ b/mm/vmstat.c
> >> @@ -1650,6 +1650,11 @@ static void zoneinfo_show_print(struct seq_file *m, pg_data_t *pgdat,
> >> zone->spanned_pages,
> >> zone->present_pages,
> >> zone_managed_pages(zone));
> >> +#ifdef CONFIG_CMA
> >> + seq_printf(m,
> >> + "\n cma %lu",
> >> + zone->cma_pages);
> >> +#endif
> >>
> >> seq_printf(m,
> >> "\n protection: (%ld",
> >
> > Hmm, not sure about this. If cma is only printed for CONFIG_CMA, we can't
> > distinguish, without some version checking, between (1) a kernel without
> > your patch and (2) a kernel without CONFIG_CMA enabled. IOW, "cma 0"
> > carries value: we know immediately that we do not have any CMA pages on
> > this zone, period.
> >
> > /proc/zoneinfo is also not known for its conciseness so I think printing
> > "cma 0" even for !CONFIG_CMA is helpful :)
> >
> > I think this #ifdef should be removed and it should call into a
> > zone_cma_pages(struct zone *zone) which returns 0UL if disabled.
> >
>
> Yeah, that's also what I proposed in a sub-thread here.
>

Ah, I certainly think your original intuition was correct.

> The last option would be going the full mile and not printing nr_free_cma.
> The code might get a bit uglier, but we could also remove that stats
> counter ;)
>
> I don't particularly care: printing "0" might be easier, but removing
> nr_free_cma might be cleaner.
>
> But then, maybe there are tools that expect that value to be around on any kernel?
>

Yeah, that's probably undue risk: the ship has sailed and there's no
significant upside.

I still think "cma 0" in /proc/zoneinfo carries value, though, especially
for NUMA, and it looks like this is how it's done in linux-next. With a
single read of the file, userspace can determine what CMA pages exist on
each node.
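
To illustrate the single-read point, here is a rough userspace sketch, not
taken from any existing tool. It assumes the "Node N, zone NAME" headers that
/proc/zoneinfo already prints and the "cma" line proposed in this patch;
struct zone_cma and parse_zoneinfo_cma() are names made up for this example.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical record type for this sketch; not a kernel structure. */
struct zone_cma {
	int node;
	char zone[32];
	int has_cma;			/* 1 if a "cma" line was seen for this zone */
	unsigned long cma_pages;
};

/*
 * Read zoneinfo-style text from f and fill out[] with one entry per
 * "Node N, zone NAME" header. Returns the number of entries written.
 */
static size_t parse_zoneinfo_cma(FILE *f, struct zone_cma *out, size_t max)
{
	char line[256];
	size_t n = 0;

	while (fgets(line, sizeof(line), f)) {
		int node;
		char name[32];
		unsigned long pages;

		if (sscanf(line, "Node %d, zone %31s", &node, name) == 2) {
			if (n == max)
				break;
			out[n].node = node;
			strcpy(out[n].zone, name);
			out[n].has_cma = 0;
			out[n].cma_pages = 0;
			n++;
		} else if (n > 0 && sscanf(line, " cma %lu", &pages) == 1) {
			/* "nr_free_cma" does not match: the token must start with "cma". */
			out[n - 1].has_cma = 1;
			out[n - 1].cma_pages = pages;
		}
	}
	return n;
}
```

If the "cma" line is always printed, has_cma only tells old kernels apart
from new ones; a zero count needs no knowledge of the kernel config.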

In general, I think the rule of thumb is that the fewer ifdefs in
/proc/zoneinfo, the easier it is for userspace to parse it.
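
As a minimal sketch of keeping the ifdef out of the print site, here is the
zone_cma_pages() idea as a user-space mock. The struct zone below is a
stand-in with only the fields this example needs, print_zone_cma() is a
hypothetical helper using snprintf() in place of seq_printf(), and the format
string is the one from the quoted hunk.

```c
#include <stddef.h>
#include <stdio.h>

/* Stand-in for the kernel's struct zone; only what the sketch needs. */
struct zone {
#ifdef CONFIG_CMA
	unsigned long cma_pages;
#endif
	unsigned long present_pages;
};

/* The only place that knows about CONFIG_CMA: returns 0UL if disabled. */
static inline unsigned long zone_cma_pages(const struct zone *zone)
{
#ifdef CONFIG_CMA
	return zone->cma_pages;
#else
	(void)zone;
	return 0UL;
#endif
}

/* Print site without #ifdef, so the "cma" line is always emitted. */
static int print_zone_cma(char *buf, size_t len, const struct zone *zone)
{
	return snprintf(buf, len, "\n cma %lu", zone_cma_pages(zone));
}
```

With this shape the output format is the same for every config, and only the
accessor differs.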

(I made that change to /proc/zoneinfo to print even non-existent zones for
each node because otherwise you cannot determine what the indices of
things like vm.lowmem_reserve_ratio represent.)