Re: [Patch] mm: Increase pagevec size on large system

From: Andrew Morton
Date: Fri Jun 26 2020 - 23:47:07 EST


On Sat, 27 Jun 2020 04:13:04 +0100 Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:

> On Fri, Jun 26, 2020 at 02:23:03PM -0700, Tim Chen wrote:
> > Enlarge the pagevec size to 31 to reduce LRU lock contention for
> > large systems.
> >
> > The LRU lock contention is reduced from 8.9% of total CPU cycles
> > to 2.2% of CPU cycles. And the pmbench throughput increases
> > from 88.8 Mpages/sec to 95.1 Mpages/sec.
>
> The downside here is that pagevecs are often stored on the stack (eg
> truncate_inode_pages_range()) as well as being used for the LRU list.
> On a 64-bit system, this increases the stack usage from 128 to 256 bytes
> for this array.
>
> I wonder if we could do something where we transform the ones on the
> stack to DECLARE_STACK_PAGEVEC(pvec), and similarly DECLARE_LRU_PAGEVEC
> the ones used for the LRUs. There's plenty of space in the header to
> add an unsigned char sz, delete PAGEVEC_SIZE and make it a variable
> length struct.
>
> Or maybe our stacks are now big enough that we just don't care.
> What do you think?

And I wonder how useful CONFIG_NR_CPUS is for making this decision.
Presumably a lot of general-purpose kernel builds have CONFIG_NR_CPUS
much larger than the actual number of CPUs.

I can't think of much of a fix for this, apart from making it larger on
all kernels.  Is there a downside to this?
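
For reference, a rough userspace sketch of where the 128 -> 256 byte
figure quoted above comes from. It assumes the struct pagevec layout in
include/linux/pagevec.h at the time (an unsigned char count, a bool
flag, then an array of page pointers) and a current PAGEVEC_SIZE of 15.
The names DEFINE_PAGEVEC_TYPE, pagevec15, pagevec31 and pagevec_bytes()
are made up for illustration and are not kernel identifiers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct page;	/* opaque here; only pointers to it are stored */

/*
 * Assumed layout, mirroring struct pagevec: two one-byte header fields
 * followed by an array of n page pointers.
 */
#define DEFINE_PAGEVEC_TYPE(name, n)			\
	struct name {					\
		unsigned char nr;			\
		bool percpu_pvec_drained;		\
		struct page *pages[n];			\
	}

DEFINE_PAGEVEC_TYPE(pagevec15, 15);	/* assumed current size */
DEFINE_PAGEVEC_TYPE(pagevec31, 31);	/* size proposed in the patch */

/*
 * The two header bytes are padded up to pointer alignment, so the whole
 * struct occupies (nr_slots + 1) pointer-sized words. This assumes the
 * alignment of a pointer equals its size, as on common ABIs.
 */
static inline size_t pagevec_bytes(size_t nr_slots)
{
	return (nr_slots + 1) * sizeof(struct page *);
}
```

With 8-byte pointers, sizeof(struct pagevec15) should come to 128 and
sizeof(struct pagevec31) to 256, so each on-stack pagevec grows by 16
pointer slots.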