Re: [PATCH] mm, kasan: don't poison boot memory

From: Andrey Konovalov
Date: Thu Feb 18 2021 - 14:58:20 EST


On Thu, Feb 18, 2021 at 9:55 AM David Hildenbrand <david@xxxxxxxxxx> wrote:
>
> On 17.02.21 21:56, Andrey Konovalov wrote:
> > During boot, all non-reserved memblock memory is exposed to the buddy
> > allocator. Poisoning all that memory with KASAN lengthens boot time,
> > especially on systems with large amounts of RAM. This patch makes
> > page_alloc not call kasan_free_pages() on all new memory.
> >
> > __free_pages_core() is used when exposing fresh memory during system
> > boot and when onlining memory during hotplug. This patch adds a new
> > FPI_SKIP_KASAN_POISON flag and passes it to __free_pages_ok() through
> > free_pages_prepare() from __free_pages_core().
> >
> > This has little impact on KASAN memory tracking.
> >
> > Assuming that there are no references to newly exposed pages before they
> > are ever allocated, there won't be any intended (but buggy) accesses to
> > that memory that KASAN would normally detect.
> >
> > However, with this patch, KASAN stops detecting wild and large
> > out-of-bounds accesses that happen to land on a fresh memory page that
> > was never allocated. This is taken as an acceptable trade-off.
> >
> > All memory allocated normally when the boot is over keeps getting
> > poisoned as usual.
> >
> > Signed-off-by: Andrey Konovalov <andreyknvl@xxxxxxxxxx>
> > Change-Id: Iae6b1e4bb8216955ffc14af255a7eaaa6f35324d
>
> Not sure this is the right thing to do, see
>
> https://lkml.kernel.org/r/bcf8925d-0949-3fe1-baa8-cc536c529860@xxxxxxxxxx
>
> Reversing the order in which memory gets allocated + used during boot
> (in a patch by me) might have revealed an invalid memory access during boot.
>
> I suspect that that issue would no longer get detected with your patch,
> as the invalid memory access would simply not get detected. Now, I
> cannot prove that :)

This looks like a good example.

Ok, what we can do is:

1. For KASAN_GENERIC: leave everything as is to be able to detect
these boot-time bugs.

2. For KASAN_SW_TAGS: remove boot-time poisoning via
kasan_free_pages(), but use the "invalid" tag as the default shadow
value. The end result should be the same: bad accesses will still be
detected, for unallocated memory because it carries the default
"invalid" tag, and for allocated memory because it's poisoned properly
when allocated/freed.

3. For KASAN_HW_TAGS: just remove boot-time poisoning via
kasan_free_pages(). As the memory tags of never-allocated pages have a
random unspecified value, we'll still have a 15/16 chance to detect a
memory corruption (with 4-bit tags, only 1 of the 16 possible values
matches by accident).
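
To spell out where the 15/16 figure comes from, here is a minimal
sketch that enumerates every (pointer tag, memory tag) pair and counts
the ones that differ. It assumes 4-bit tags and treats both tags as
uniform and independent; it ignores any reserved or match-all tag
values, so take it as an idealized model rather than what the kernel
does. The function names are made up for illustration.

```c
/* Assumption: 4-bit tags, i.e. 16 possible tag values. */
#define TAG_BITS   4
#define TAG_VALUES (1u << TAG_BITS)

/* Total number of (pointer tag, memory tag) combinations. */
static unsigned int count_all_pairs(void)
{
	return TAG_VALUES * TAG_VALUES;
}

/*
 * Count the combinations where the pointer tag differs from the memory
 * tag. In this model, each of them is an access that would be caught.
 */
static unsigned int count_mismatching_pairs(void)
{
	unsigned int ptr_tag, mem_tag, mismatches = 0;

	for (ptr_tag = 0; ptr_tag < TAG_VALUES; ptr_tag++)
		for (mem_tag = 0; mem_tag < TAG_VALUES; mem_tag++)
			if (ptr_tag != mem_tag)
				mismatches++;

	return mismatches;
}
```

In this model 240 of the 256 combinations mismatch, which is the 15/16
above.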

This also makes sense from a performance perspective: KASAN_GENERIC
isn't meant to run in production, so a larger perf impact is
acceptable there. The other two modes will boot faster.
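
The three-way split proposed above could be summarized roughly as
follows. This is only a sketch of the policy, not kernel code: the enum,
the struct and the function are hypothetical names, and in the kernel
the mode would be selected through CONFIG_KASAN_GENERIC,
CONFIG_KASAN_SW_TAGS or CONFIG_KASAN_HW_TAGS rather than at runtime.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the compile-time KASAN mode. */
enum kasan_mode_sketch {
	MODE_GENERIC,
	MODE_SW_TAGS,
	MODE_HW_TAGS,
};

/* Hypothetical description of what happens to fresh boot memory. */
struct boot_poison_policy {
	/* Poison via kasan_free_pages() when memory is first exposed. */
	bool poison_on_boot_free;
	/* Rely on the "invalid" tag being the default shadow value. */
	bool default_shadow_invalid;
};

static struct boot_poison_policy boot_poison_policy_for(enum kasan_mode_sketch mode)
{
	struct boot_poison_policy p = { false, false };

	switch (mode) {
	case MODE_GENERIC:
		/* 1. Leave everything as is to keep catching boot-time bugs. */
		p.poison_on_boot_free = true;
		break;
	case MODE_SW_TAGS:
		/* 2. Skip poisoning, default shadow is already "invalid". */
		p.default_shadow_invalid = true;
		break;
	case MODE_HW_TAGS:
		/* 3. Skip poisoning, rely on tags being random. */
		break;
	}

	return p;
}
```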