Re: [PATCH v2 0/3] mm/page_alloc: Remote per-cpu page list drain support

From: Nicolas Saenz Julienne
Date: Tue Nov 30 2021 - 13:09:35 EST


Hi Vlastimil, sorry for the late reply and thanks for your feedback. :)

On Tue, 2021-11-23 at 15:58 +0100, Vlastimil Babka wrote:
> > [1] Other approaches can be found here:
> >
> > - Static branch conditional on nohz_full, no performance loss, the extra
> > config option makes is painful to maintain (v1):
> > https://lore.kernel.org/linux-mm/20210921161323.607817-5-nsaenzju@xxxxxxxxxx/
> >
> > - RCU based approach, complex, yet a bit less taxing performance wise
> > (RFC):
> > https://lore.kernel.org/linux-mm/20211008161922.942459-4-nsaenzju@xxxxxxxxxx/
>
> Hm I wonder if there might still be another alternative possible. IIRC I did
> propose at some point a local drain on the NOHZ cpu before returning to
> userspace, and then avoiding that cpu in remote drains, but tglx didn't like
> the idea of making entering the NOHZ full mode more expensive [1].
>
> But what if we instead set pcp->high = 0 for these cpus so they would avoid
> populating the pcplists in the first place? Then there wouldn't have to be a
> drain at all. On the other hand page allocator operations would not benefit
> from zone lock batching on those cpus. But perhaps that would be acceptable
> tradeoff, as a nohz cpu is expected to run in userspace most of the time,
> and page allocator operations are rare except maybe some initial page
> faults? (I assume those kind of workloads pre-populate and/or mlock their
> address space anyway).

I've looked a bit into this and it seems straightforward. Our workloads
pre-populate everything, and a slight statup performance hit is not that tragic
(I'll measure it nonetheless). The per-cpu nohz_full state at some point will
be dynamic, but the feature seems simple to disable/enable. I'll have to teach
__drain_all_pages(zone, force_all_cpus=true) to bypass this special case
but that's all. I might have a go at this.

Thanks!

--
Nicolás Sáenz