Re: [PATCH v2] mm: fix the inaccurate memory statistics issue for users

From: Andrew Morton
Date: Mon Jun 09 2025 - 20:18:10 EST


On Mon, 9 Jun 2025 10:56:46 +0200 Vlastimil Babka <vbabka@xxxxxxx> wrote:

> On 6/9/25 10:52 AM, Vlastimil Babka wrote:
> > On 6/9/25 10:31 AM, Ritesh Harjani (IBM) wrote:
> >> Baolin Wang <baolin.wang@xxxxxxxxxxxxxxxxx> writes:
> >>
> >>> On 2025/6/9 15:35, Michal Hocko wrote:
> >>>> On Mon 09-06-25 10:57:41, Ritesh Harjani wrote:
> >>>>>
> >>>>> Any reason why we dropped the Fixes tag? I see there was a series of
> >>>>> discussions on v1 and it was concluded that the fix was correct, so why
> >>>>> drop the Fixes tag?
> >>>>
> >>>> This seems more like an improvement than a bug fix.
> >>>
> >>> Yes. I don't have a strong opinion on this, but we (Alibaba) will
> >>> backport it manually, because some user-space monitoring tools depend
> >>> on these statistics.
> >>
> >> That sounds like a regression then, isn't it?
> >
> > Hm if counters were accurate before f1a7941243c1 and not afterwards, and
> > this is making them accurate again, and some userspace depends on it,
> > then Fixes: and stable is probably warranted then. If this was just a
> > perf improvement, then not. But AFAIU f1a7941243c1 was the perf
> > improvement...
>
> Dang, should have re-read the commit log of f1a7941243c1 first. It seems
> like the error margin due to batching existed also before f1a7941243c1.
>
> " This patch converts the rss_stats into percpu_counter to convert the
> error margin from (nr_threads * 64) to approximately (nr_cpus ^ 2)."
>
> so if on some systems this means worse margin than before, the above
> "if" chain of thought might still hold.
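
For reference, the two margins quoted above can be compared with a few lines of C. This is only a sketch of the arithmetic in the f1a7941243c1 commit log, not kernel code: the helper names are hypothetical, and it assumes both margins are expressed in the same counter units.

```c
#include <stdbool.h>

/*
 * Hypothetical helpers, not kernel APIs. They only restate the two
 * error margins quoted from the f1a7941243c1 commit log.
 */

/* Margin before f1a7941243c1: nr_threads * 64. */
static unsigned long margin_per_thread(unsigned long nr_threads)
{
	return nr_threads * 64UL;
}

/* Approximate margin after f1a7941243c1: nr_cpus ^ 2. */
static unsigned long margin_percpu(unsigned long nr_cpus)
{
	return nr_cpus * nr_cpus;
}

/* True when the percpu_counter margin exceeds the old per-thread margin. */
static bool percpu_margin_is_worse(unsigned long nr_threads,
				   unsigned long nr_cpus)
{
	return margin_percpu(nr_cpus) > margin_per_thread(nr_threads);
}
```

With these formulas, a process with 8 threads on a 128-CPU machine goes from a margin of 512 to roughly 16384, while a process with 1000 threads on a 16-CPU machine goes from 64000 to roughly 256. So the "worse margin than before" case is the many-CPUs, few-threads one.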

f1a7941243c1 seems like a good enough place to tell -stable
maintainers where to insert the patch (why does this sound rude).

The patch is simple enough. I'll add Fixes: f1a7941243c1 and Cc: stable
and, as the problem has been there for years, I'll leave the patch in
mm-unstable so it will eventually get into LTS in a well-tested state.