Re: [PATCH] mm, kasan: don't poison boot memory

From: David Hildenbrand
Date: Thu Feb 18 2021 - 15:02:23 EST


On 18.02.21 20:40, Andrey Konovalov wrote:
On Thu, Feb 18, 2021 at 9:55 AM David Hildenbrand <david@xxxxxxxxxx> wrote:

On 17.02.21 21:56, Andrey Konovalov wrote:
During boot, all non-reserved memblock memory is exposed to the buddy
allocator. Poisoning all that memory with KASAN lengthens boot time,
especially on systems with large amount of RAM. This patch makes
page_alloc to not call kasan_free_pages() on all new memory.

__free_pages_core() is used when exposing fresh memory during system
boot and when onlining memory during hotplug. This patch adds a new
FPI_SKIP_KASAN_POISON flag and passes it to __free_pages_ok() through
free_pages_prepare() from __free_pages_core().

This has little impact on KASAN memory tracking.

Assuming that there are no references to newly exposed pages before they
are ever allocated, there won't be any intended (but buggy) accesses to
that memory that KASAN would normally detect.

However, with this patch, KASAN stops detecting wild and large
out-of-bounds accesses that happen to land on a fresh memory page that
was never allocated. This is taken as an acceptable trade-off.

All memory allocated normally when the boot is over keeps getting
poisoned as usual.

Signed-off-by: Andrey Konovalov <andreyknvl@xxxxxxxxxx>
Change-Id: Iae6b1e4bb8216955ffc14af255a7eaaa6f35324d

Not sure this is the right thing to do, see

https://lkml.kernel.org/r/bcf8925d-0949-3fe1-baa8-cc536c529860@xxxxxxxxxx

Reversing the order in which memory gets allocated + used during boot
(in a patch by me) might have revealed an invalid memory access during boot.

I suspect that that issue would no longer get detected with your patch,
as the invalid memory access would simply not get detected. Now, I
cannot prove that :)

This looks like a good example.

Ok, what we can do is:

1. For KASAN_GENERIC: leave everything as is to be able to detect
these boot-time bugs.

2. For KASAN_SW_TAGS: remove boot-time poisoning via
kasan_free_pages(), but use the "invalid" tag as the default shadow
value. The end result should be the same: bad accesses will be
detected. For unallocated memory as it has the default "invalid" tag,
and for allocated memory as it's poisoned properly when
allocated/freed.

3. For KASAN_HW_TAGS: just remove boot-time poisoning via
kasan_free_pages(). As the memory tags have a random unspecified
value, we'll still have a 15/16 chance to detect a memory corruption.

This also makes sense from the performance perspective: KASAN_GENERIC
isn't meant to be running in production, so having a larger perf
impact is acceptable. The other two modes will be faster.

Sounds in principle sane to me.

Side note: I am not sure if anybody runs KASAN in production. Memory is expensive. Feel free to prove me wrong, I'd be very interest in actual users.

--
Thanks,

David / dhildenb