Re: [PATCH] mm: extend max struct page size for kmsan

From: Alexander Potapenko
Date: Mon Jan 30 2023 - 13:22:16 EST


> > I haven't really followed KMSAN development but I would have expected
> > that it would, like other debugging tools, add its metadata to page_ext
> > rather than page directly.
>
> Yes, that would have been preferable. Also, I don't understand why we
> need an entire page to store whether each "bit" of a page is initialised.
> There are no CPUs which have bit-granularity stores; either you initialise
> an entire byte or not. So that metadata can shrink from 4096 bytes
> to 512.

It's not about bit-granularity stores, it's about bits being
uninitialized or not.

Consider the following struct:

struct foo {
	char a:4;
	char b:4;
} f;

If the user initializes f.a and then tries to use f.b, this is still
undefined behavior that KMSAN is able to catch thanks to its bit-to-bit
shadow, but that it would not be able to detect if we only stored one
shadow bit per byte.
Another example is bit flags or bit masks, where you can set a single
bit in an int32, but that wouldn't necessarily mean the rest of that
variable is initialized.
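
To make the difference concrete, here is a toy userspace model (not KMSAN
code; all names are made up for illustration). It assumes that a set
shadow bit means "uninitialized", that f.a occupies the low nibble (the
actual bitfield layout is implementation-defined), and that a
hypothetical byte-granular scheme would mark the whole byte initialized
on any store to it:

```c
#include <stdint.h>

/*
 * Toy model, not KMSAN code. A set shadow bit means "uninitialized".
 */
struct tracked_byte {
	uint8_t value;
	uint8_t bit_shadow;	/* one shadow bit per data bit */
	uint8_t byte_shadow;	/* hypothetical one-flag-per-byte scheme */
};

/* A freshly allocated byte: every bit is uninitialized. */
void tracked_poison(struct tracked_byte *b)
{
	b->value = 0;
	b->bit_shadow = 0xff;
	b->byte_shadow = 1;
}

/* Store `bits` into the bit positions selected by `mask`. */
void tracked_store(struct tracked_byte *b, uint8_t mask, uint8_t bits)
{
	b->value = (uint8_t)((b->value & ~mask) | (bits & mask));
	/* Only the bits actually written become initialized. */
	b->bit_shadow &= (uint8_t)~mask;
	/* Assumption: the coarse scheme unpoisons the whole byte. */
	b->byte_shadow = 0;
}

/* Nonzero if reading the bits in `mask` touches uninitialized data. */
int uninit_per_bit(const struct tracked_byte *b, uint8_t mask)
{
	return (b->bit_shadow & mask) != 0;
}

/* Nonzero if the coarse scheme still considers the byte uninitialized. */
int uninit_per_byte(const struct tracked_byte *b)
{
	return b->byte_shadow != 0;
}
```

After tracked_poison() and tracked_store(&f, 0x0f, 0x5) (i.e. "f.a = 5"),
uninit_per_bit(&f, 0xf0) still flags a read of f.b, while
uninit_per_byte(&f) reports the byte as clean and the bug goes unnoticed.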

It's worth mentioning that even if we choose to shrink the shadows
from 4096 to 512 bytes, there'd still be four-byte origin IDs, which
are allocated for every four bytes of program memory.
So a whole page of origins will still be required in addition to those
512 bytes of shadow.
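
The arithmetic behind these numbers can be sketched as follows (a
back-of-the-envelope helper with hypothetical names, assuming 4096-byte
pages as in the discussion above):

```c
#include <stddef.h>

enum {
	PAGE_BYTES = 4096,		/* page size assumed in this thread */
	ORIGIN_ID_BYTES = 4,		/* one four-byte origin ID ... */
	ORIGIN_GRANULE_BYTES = 4,	/* ... per four bytes of memory */
};

/* Shadow bytes per page, given the number of shadow bits per data byte. */
size_t shadow_bytes_per_page(unsigned int shadow_bits_per_byte)
{
	return (size_t)PAGE_BYTES * shadow_bits_per_byte / 8;
}

/* Origin bytes per page: one ID for every ORIGIN_GRANULE_BYTES of data. */
size_t origin_bytes_per_page(void)
{
	return (size_t)PAGE_BYTES / ORIGIN_GRANULE_BYTES * ORIGIN_ID_BYTES;
}
```

With 8 shadow bits per byte this gives 4096 + 4096 bytes of metadata per
page; with 1 shadow bit per byte it gives 512 + 4096, so the origins
still take a full page either way.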

(Origins are handy when debugging KMSAN reports, because a single
uninit value can be copied or modified multiple times before it is
used in a branch or passed to userspace.
Shrinking origins further would render them useless for e.g. 32-bit
local variables, which is quite a common use case.)
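
One way to picture the granularity problem is a toy origin table (again
not KMSAN code; the struct and helpers are hypothetical). It assumes one
32-bit origin ID is kept per `granule` bytes, so with a granule larger
than four bytes two adjacent 32-bit variables end up sharing a slot:

```c
#include <stddef.h>
#include <stdint.h>

#define REGION_BYTES 16

/* Hypothetical origin table covering REGION_BYTES of memory. */
struct origin_map {
	size_t granule;			/* data bytes covered by one ID, >= 1 */
	uint32_t ids[REGION_BYTES];	/* large enough for any granule >= 1 */
};

/* Record `id` as the origin of the data at byte `offset`. */
void set_origin(struct origin_map *m, size_t offset, uint32_t id)
{
	m->ids[offset / m->granule] = id;
}

/* Look up the origin recorded for the data at byte `offset`. */
uint32_t get_origin(const struct origin_map *m, size_t offset)
{
	return m->ids[offset / m->granule];
}
```

With granule = 4, 32-bit variables at offsets 0 and 4 keep separate
origins. With granule = 8 they share one slot, so setting the origin of
the second overwrites the origin of the first, and a report about the
first variable would point at the wrong place.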