Re: [PATCH] mm: extend max struct page size for kmsan

From: Alexander Potapenko
Date: Mon Jan 30 2023 - 13:00:27 EST


On Mon, Jan 30, 2023 at 2:38 PM Michal Hocko <mhocko@xxxxxxxx> wrote:
>
> On Mon 30-01-23 14:07:26, Arnd Bergmann wrote:
> > From: Arnd Bergmann <arnd@xxxxxxxx>
> >
> > After x86 has enabled support for KMSAN, it has become possible
> > to have larger 'struct page' than was expected when commit
> > 5470dea49f53 ("mm: use mm_zero_struct_page from SPARC on all 64b
> > architectures") was merged:
> >
> > include/linux/mm.h:156:10: warning: no case matching constant switch condition '96'
> > switch (sizeof(struct page)) {
> >
> > Extend the maximum accordingly.
> >
> > Fixes: 5470dea49f53 ("mm: use mm_zero_struct_page from SPARC on all 64b architectures")
> > Fixes: 4ca8cc8d1bbe ("x86: kmsan: enable KMSAN builds for x86")
> > Signed-off-by: Arnd Bergmann <arnd@xxxxxxxx>
>
> Acked-by: Michal Hocko <mhocko@xxxxxxxx>
>
> I haven't really followed KMSAN development but I would have expected
> that it would, like other debugging tools, add its metadata to page_ext
> rather than page directly.

Thanks for the comment!
I was considering page_ext at some point, but managed to convince
myself it didn't suit the purpose well enough.

Right now KMSAN allocates its metadata at boot time, when tearing down memblock.
At that point only a handful of memory ranges exist, and it is pretty
easy to carve out some unused pages for the metadata of those ranges,
then divide the rest evenly, returning 1/3 to the system and spending
the remaining 2/3 on metadata for the returned pages.
I tried allocating the memory lazily (at page_alloc() time, for
example), and it turned out to be very tricky because of fragmentation:
for an allocation of a given order, one needs shadow and origin
allocations of the same order [1], and alloc_pages() simply starts by
ripping apart the biggest chunk of memory available.

IIRC, if we chose to allocate metadata via page_ext, the memory would
already be too fragmented to handle this easily, because page_ext only
becomes available once alloc_pages() does.
We also can't get rid of the shadow/origin pointers in struct page_ext
(storing the two 4K-sized arrays inline in that struct would defeat
any possible alignment), so we wouldn't save any memory by switching
to page_ext.



[1] - I can go into more detail, but the TL;DR is that contiguous
pages within the same allocation had better have contiguous
shadow/origin pages, otherwise unaligned accesses will corrupt other
pages.