Re: [RFC PATCH v4 11/13] mm: parallelize deferred struct page initialization within each node

From: Daniel Jordan
Date: Mon Nov 19 2018 - 11:30:25 EST


On Mon, Nov 12, 2018 at 08:54:12AM -0800, Daniel Jordan wrote:
> On Sat, Nov 10, 2018 at 03:48:14AM +0000, Elliott, Robert (Persistent Memory) wrote:
> > > -----Original Message-----
> > > From: linux-kernel-owner@xxxxxxxxxxxxxxx <linux-kernel-
> > > owner@xxxxxxxxxxxxxxx> On Behalf Of Daniel Jordan
> > > Sent: Monday, November 05, 2018 10:56 AM
> > > Subject: [RFC PATCH v4 11/13] mm: parallelize deferred struct page
> > > initialization within each node
> > >
> > > ... The kernel doesn't
> > > know the memory bandwidth of a given system to get the most efficient
> > > number of threads, so there's some guesswork involved.
> >
> > The ACPI HMAT (Heterogeneous Memory Attribute Table) is designed to report
> > that kind of information, and could facilitate automatic tuning.
> >
> > There was discussion last year about kernel support for it:
> > https://lore.kernel.org/lkml/20171214021019.13579-1-ross.zwisler@xxxxxxxxxxxxxxx/
>
> Thanks for bringing this up. I'm traveling but will take a closer look when I
> get back.

So the HMAT series would give the total bandwidth for a memory target, but
there's no way to map that to a CPU count. In other words, it seems we couldn't
determine how many CPUs it takes to reach the max bandwidth. Unless I've
missed something, I'm going to remove that comment.
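
To illustrate the gap, here is a minimal sketch of the calculation a thread
count would need. The function name and both parameters are hypothetical and
come from neither series. It also assumes bandwidth scales linearly with the
number of CPUs, which real systems generally don't guarantee. The point is only
that the second input, per-CPU bandwidth, is not something a per-target total
can supply.

```c
/*
 * Hypothetical helper, for illustration only.
 *
 * target_bw:  total bandwidth of a memory target, in MB/s (the kind of
 *             figure a per-target bandwidth report could provide).
 * per_cpu_bw: bandwidth one CPU can drive to that target, in MB/s
 *             (assumed input; no source for it is identified here).
 *
 * Returns the number of CPUs needed to reach target_bw under a linear
 * scaling assumption, or 0 when per_cpu_bw is unknown.
 */
unsigned int cpus_to_saturate(unsigned long target_bw,
                              unsigned long per_cpu_bw)
{
	if (per_cpu_bw == 0)
		return 0;	/* unknown: cannot derive a CPU count */

	/* Integer division rounded up. */
	return (unsigned int)((target_bw + per_cpu_bw - 1) / per_cpu_bw);
}
```

With only the total known, the caller has nothing to pass as per_cpu_bw and
the helper returns 0, which is the guesswork case described above.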