Re: [PATCH v13 000/137] Memory folios

From: Matthew Wilcox
Date: Mon Jul 12 2021 - 07:37:02 EST


On Mon, Jul 12, 2021 at 06:46:05AM +0100, Christoph Hellwig wrote:
> On Mon, Jul 12, 2021 at 04:04:44AM +0100, Matthew Wilcox (Oracle) wrote:
> > Managing memory in 4KiB pages is a serious overhead. Many benchmarks
> > benefit from a larger "page size". As an example, an earlier iteration
> > of this idea which used compound pages (and wasn't particularly tuned)
> > got a 7% performance boost when compiling the kernel.
> >
> > Using compound pages or THPs exposes a weakness of our type system.
> > Functions are often unprepared for compound pages to be passed to them,
> > and may only act on PAGE_SIZE chunks. Even functions which are aware of
> > compound pages may expect a head page, and do the wrong thing if passed
> > a tail page.
> >
> > We also waste a lot of instructions ensuring that we're not looking at
> > a tail page. Almost every call to PageFoo() contains one or more hidden
> > calls to compound_head(). This also happens for get_page(), put_page()
> > and many more functions.
> >
> > This patch series uses a new type, the struct folio, to manage memory.
> > It converts enough of the page cache, iomap and XFS to use folios instead
> > of pages, and then adds support for multi-page folios. It passes xfstests
> > (running on XFS) with no regressions compared to v5.14-rc1.
>
> This seems to miss a changelog vs the previous version. It also
> includes a lot of the follow ups. I think reviewing a series gets
> rather hard at more than 30-ish patches, so chunking it up a little
> more would be useful.

I'm not seriously expecting anybody to review 137 patches. It's more
for the bots to chew on (which they have done and I'm about to look
at their output). I'll be sending mergeable subsets (three rounds; the
base code, the memcg series and the pagecache series) later this week,
once I've addressed the build bot complaints. You've seen all those
patches individually by now.

My plan is that once those are merged, the rest can proceed in parallel.
The block + iomap series is independent, then there's the second pagecache
series. The last dozen or so patches still need a bit of work as they
were pulled across from the THP tree and at least need better changelogs.

Since this works for me, I'm hoping some people will also test and
confirm it works for them, and maybe post their own performance numbers
to justify all this.