Re: [PATCH] mm: extend max struct page size for kmsan

From: Michal Hocko
Date: Tue Jan 31 2023 - 10:16:41 EST


On Mon 30-01-23 18:59:45, Alexander Potapenko wrote:
> On Mon, Jan 30, 2023 at 2:38 PM Michal Hocko <mhocko@xxxxxxxx> wrote:
> >
> > On Mon 30-01-23 14:07:26, Arnd Bergmann wrote:
> > > From: Arnd Bergmann <arnd@xxxxxxxx>
> > >
> > > After x86 has enabled support for KMSAN, it has become possible
> > > to have larger 'struct page' than was expected when commit
> > > 5470dea49f53 ("mm: use mm_zero_struct_page from SPARC on all 64b
> > > architectures") was merged:
> > >
> > > include/linux/mm.h:156:10: warning: no case matching constant switch condition '96'
> > > switch (sizeof(struct page)) {
> > >
> > > Extend the maximum accordingly.
> > >
> > > Fixes: 5470dea49f53 ("mm: use mm_zero_struct_page from SPARC on all 64b architectures")
> > > Fixes: 4ca8cc8d1bbe ("x86: kmsan: enable KMSAN builds for x86")
> > > Signed-off-by: Arnd Bergmann <arnd@xxxxxxxx>
> >
> > Acked-by: Michal Hocko <mhocko@xxxxxxxx>
> >
> > I haven't really followed KMSAN development but I would have expected
> > that it would, like other debugging tools, add its metadata to page_ext
> > rather than page directly.
>
> Thanks for the comment!
> I was considering page_ext at some point, but managed to convince
> myself it didn't suit the purpose well enough.
>
> Right now KMSAN allocates its metadata at boot time, when tearing down memblock.
> At that point only a handful of memory ranges exist, and it is pretty
> easy to carve out some unused pages for the metadata for those ranges,
> then divide the rest evenly and return 1/3 to the system, spending 2/3
> to keep the metadata for the returned pages.
> I tried allocating the memory lazily (at page_alloc(), for example),
> and it turned out to be very tricky because of fragmentation: for an
> allocation of a given order, one needs shadow and origin allocations
> of the same order [1], and alloc_pages() simply started with ripping
> apart the biggest chunk of memory available.

page_ext allocation happens quite early as well. There shouldn't be any
real fragmentation that early during the boot.

> IIRC if we choose to allocate metadata via page_ext, the memory will
> be already too fragmented to easily handle it, because it will only
> happen once alloc_pages() is available.
> We also can't get rid of the shadow/origin pointers in struct page_ext
> (storing two 4K-sized arrays in that struct would defeat all the
> possible alignments), so we won't save any memory by switching to
> page_ext.

With page_ext you would allow to compile the feature in disabled by
default and allow to boot time enable it.

> [1] - I can go into more details, but the TLDR is that contiguous
> pages within the same allocations better have contiguous shadow/origin
> pages, otherwise unaligned accesses will corrupt other pages.

--
Michal Hocko
SUSE Labs