Re: [PATCH 1/2] mm/khugepaged: set THP as uptodate earlier for shmem

From: Peter Xu
Date: Wed Feb 15 2023 - 17:06:43 EST


On Wed, Feb 15, 2023 at 10:33:15AM +0900, David Stevens wrote:
> On Wed, Feb 15, 2023 at 12:44 AM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
> >
> > On Tue, Feb 14, 2023 at 04:57:09PM +0900, David Stevens wrote:
> > > /*
> > > - * At this point the hpage is locked and not up-to-date.
> > > - * It's safe to insert it into the page cache, because nobody would
> > > - * be able to map it or use it in another way until we unlock it.
> > > + * Mark hpage as up-to-date before inserting it into the page cache to
> > > + * prevent it from being mistaken for an fallocated but unwritten page.
> > > + * Inserting the unfinished hpage into the page cache is safe because
> > > + * it is locked, so nobody can map it or use it in another way until we
> > > + * unlock it.
> >
> > No, that's not true. The data has to be there before we mark it
> > uptodate. See filemap_get_pages() for example, used as part of
> > read(). We don't lock the page unless we need to bring it uptodate
> > ourselves.
>
> I've been focusing on the shmem case for collapse_file and forgot to
> think about the !is_shmem case. As far as I could tell, shmem doesn't
> use filemap_get_pages() and everything else in filemap.c/shmem.c that
> checks folio_test_uptodate also locks the folio. But yeah, this would
> break the !is_shmem case and is kind of sketchy anyway. I'll put
> together a better patch.

AFAIU we lock the page iff !uptodate and we want to wait for it to become
uptodate, or, as Matthew said, when we want to make the
!uptodate->uptodate transition ourselves.

Take the same example of folio_seek_hole_data() that you mentioned:

	if (xa_is_value(folio) || folio_test_uptodate(folio))
		return seek_data ? start : end;
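
To spell out why the ordering matters for a lockless check like that one,
here is a minimal userspace sketch. It is not kernel code: toy_folio and
both helpers are made-up names, and C11 release/acquire atomics only stand
in for the barriers that folio_mark_uptodate()/folio_test_uptodate() pair
up with. The point is just that the reader never takes a lock, so the data
has to be in place before the flag is set:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for a folio; NOT the kernel structure. */
struct toy_folio {
	atomic_bool uptodate;
	char data[16];
};

/* Writer: fill in the data first, and only then publish the flag. */
static void toy_fill_and_mark_uptodate(struct toy_folio *f, const char *src)
{
	strncpy(f->data, src, sizeof(f->data) - 1);
	f->data[sizeof(f->data) - 1] = '\0';
	/* Release: the data stores above are visible before the flag is. */
	atomic_store_explicit(&f->uptodate, true, memory_order_release);
}

/*
 * Lockless reader: trusts the data as soon as the flag is observed.
 * Returns false when the folio is not uptodate; a real reader would
 * lock the page at that point and wait or read it in.
 */
static bool toy_read_lockless(struct toy_folio *f, char *dst, size_t len)
{
	if (!atomic_load_explicit(&f->uptodate, memory_order_acquire))
		return false;
	assert(len >= sizeof(f->data));
	memcpy(dst, f->data, sizeof(f->data));
	return true;
}
```

If the writer set the flag before filling in data[], a reader going
through toy_read_lockless() could copy out stale contents without ever
touching a lock, which is the hazard Matthew pointed at.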

--
Peter Xu