Re: [PATCH v4 00/36] Large pages in the page cache

From: Dave Chinner
Date: Mon May 25 2020 - 19:08:47 EST


On Thu, May 21, 2020 at 08:05:53PM -0700, Matthew Wilcox wrote:
> On Fri, May 22, 2020 at 12:57:51PM +1000, Dave Chinner wrote:
> > On Thu, May 21, 2020 at 05:04:11PM -0700, Matthew Wilcox wrote:
> > > On Fri, May 22, 2020 at 08:49:06AM +1000, Dave Chinner wrote:
> > > > Ok, so the main issue I have with the filesystem/iomap side of
> > > > things is that it appears to be adding "transparent huge page"
> > > > awareness to the filesystem code, not "large page support".
> > > >
> > > > For people that aren't aware of the difference between the
> > > > transparent huge page and a normal compound page (e.g. I have no idea
> > > > what the difference is), this is likely to cause problems,
> > > > especially as you haven't explained at all in this description why
> > > > transparent huge pages are being used rather than bog standard
> > > > compound pages.
> > >
> > > The primary reason to use a different name from compound_*
> > > is so that it can be compiled out for systems that don't enable
> > > CONFIG_TRANSPARENT_HUGEPAGE. So THPs are compound pages, as they always
> > > have been, but for a filesystem, using thp_size() will compile to either
> > > page_size() or PAGE_SIZE depending on CONFIG_TRANSPARENT_HUGEPAGE.
> >
> > Again, why is this dependent on THP? We can allocate compound pages
> > without using THP, so why only allow the page cache to use larger
> > pages when THP is configured?
>
> We have too many CONFIG options. My brain can't cope with adding
> CONFIG_LARGE_PAGES because then we might have neither THP nor LP, LP and
> not THP, THP and not LP or both THP and LP. And of course HUGETLBFS,
> which has its own special set of issues that one has to think about when
> dealing with the page cache.

That sounds like something that should be fixed. :/

Really, I don't care about the historical mechanisms that people can
configure large pages with. If the mm subsystem does not have a
unified abstraction and API for working with large pages, then that
is the first problem that needs to be addressed before other
subsystems start trying to use large pages.

i.e. filesystem developers don't care how the mm subsystem is
allocating/managing large pages; we just want to be able to treat
large pages exactly the same way as we treat single pages. There
should be exactly zero difference between them at the API level.

> So, either large pages become part of the base kernel and you
> always get them, or there's a CONFIG option to enable them and it's
> CONFIG_TRANSPARENT_HUGEPAGE. I chose the latter.

Please make the API part of the base kernel. Then you can hide all
these whacky mm level config options behind it so that code that
interacts with large pages just doesn't have to care about what type
of large page infrastructure the user has configured.
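
To make that concrete, here is a rough userspace sketch of the shape I
mean. It is illustrative only: large_page_size(), the cut-down struct
page, CONFIG_LARGE_PAGE_BACKEND and bytes_to_end_of_page() are all
made-up names for this example, not anything in the patch set or in
the mm code, and the 4096 byte PAGE_SIZE is an assumption.

```c
#include <stddef.h>

#define PAGE_SIZE 4096UL	/* assumed base page size for this sketch */

/*
 * Hypothetical mm-internal switch. Comment it out to model a kernel
 * built without any large page backend; callers do not change.
 */
#define CONFIG_LARGE_PAGE_BACKEND 1

/* Hypothetical stand-in for struct page: order 0 is a single page. */
struct page {
	unsigned int order;
};

/*
 * Hypothetical unified helper. It always exists, so callers never need
 * to know which large page mechanism (if any) was configured.
 */
static inline size_t large_page_size(const struct page *page)
{
#ifdef CONFIG_LARGE_PAGE_BACKEND
	return PAGE_SIZE << page->order;
#else
	(void)page;
	return PAGE_SIZE;
#endif
}

/*
 * Filesystem-side caller: no #ifdef and no THP-specific naming. The
 * same code handles single pages and large pages.
 */
static inline size_t bytes_to_end_of_page(const struct page *page,
					  size_t offset)
{
	return large_page_size(page) - offset;
}
```

The only point of the sketch is that the #ifdef lives in one place
inside mm, and bytes_to_end_of_page() is identical whether or not the
backend is compiled in.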

Cheers,

Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx