Re: [RFC 0/1] memfd: Support mapping to zero page on reading

From: Matthew Wilcox
Date: Tue Jan 11 2022 - 23:32:52 EST


On Tue, Jan 11, 2022 at 06:30:31PM -0800, Hugh Dickins wrote:
> But I have to say that use of ZERO_PAGE for shmem/memfd/tmpfs read-fault
> might (potentially) be very welcome. Not as some MFD_ZEROPAGE special
> case, but as how it would always work. Deleting the shmem_recalc_inode()
> cruft, which is there to correct accounting for the unmodified read-only
> pages, after page reclaim has got around to freeing them later.
>
> It does require more work than you gave it in 1/1: mainly, as you call
> out above, there's a need to note in the mapping's XArray when ZERO_PAGE
> has been used at an offset, and do an rmap walk to unmap those ptes when
> a writable page is substituted - see __xip_unmap() in Linux 3.19's
> mm/filemap_xip.c for such an rmap walk.

I think putting a pointer to the zero page in the XArray would introduce
some unwelcome complexity, but the XArray has a special XA_ZERO_ENTRY
which might be usable for such a thing. It would need some careful
analysis and testing, of course, but it might also let us remove
the special cases in the DAX code for DAX_ZERO_PAGE.
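
To make the idea concrete, here is a rough userspace sketch in C of what
"a marker entry instead of a page pointer" could look like. It is a toy
model, not kernel code and not a proposed patch: the fixed slot array stands
in for the mapping's XArray, MODEL_ZERO_ENTRY stands in for XA_ZERO_ENTRY,
and every model_* name is hypothetical. It skips locking and the rmap walk
Hugh mentions, which are the hard parts.

```c
/* Userspace model only; not kernel code. All names here are hypothetical. */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MODEL_PAGE_SIZE 4096
#define MODEL_NSLOTS    8

/* One shared, read-only page of zeroes, standing in for ZERO_PAGE. */
static const unsigned char model_zero_page[MODEL_PAGE_SIZE];

/*
 * Sentinel slot value, standing in for XA_ZERO_ENTRY. It is not a page
 * pointer; it only records "the zero page was mapped at this offset".
 */
static char model_zero_marker;
#define MODEL_ZERO_ENTRY ((void *)&model_zero_marker)

struct model_mapping {
	void *slots[MODEL_NSLOTS];  /* NULL, MODEL_ZERO_ENTRY, or a real page */
	size_t pages_allocated;     /* real pages charged to this mapping */
};

/* Read fault: a hole is served from the shared zero page, no allocation. */
static const unsigned char *model_read_fault(struct model_mapping *m,
					     size_t idx)
{
	assert(idx < MODEL_NSLOTS);
	if (m->slots[idx] == NULL)
		m->slots[idx] = MODEL_ZERO_ENTRY;
	if (m->slots[idx] == MODEL_ZERO_ENTRY)
		return model_zero_page;
	return m->slots[idx];
}

/*
 * Write fault: replace a hole or a zero marker with a real zeroed page.
 * Returns NULL if allocation fails. A real implementation would also have
 * to unmap any ptes still pointing at the zero page before substituting.
 */
static unsigned char *model_write_fault(struct model_mapping *m, size_t idx)
{
	assert(idx < MODEL_NSLOTS);
	if (m->slots[idx] == NULL || m->slots[idx] == MODEL_ZERO_ENTRY) {
		void *page = calloc(1, MODEL_PAGE_SIZE);

		if (!page)
			return NULL;
		m->slots[idx] = page;
		m->pages_allocated++;
	}
	return m->slots[idx];
}
```

In this model a read fault on a hole never bumps pages_allocated, so there
is nothing to correct after the fact; only a write fault charges a page.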

I agree with you that temporarily allocating pages has worked "well
enough", but maybe some workloads would benefit; even for files on block
device filesystems, reading a hole and never writing to it may be common
enough that this is an optimisation we've been missing for many years.