Re: [PATCH RFCv2] mm/madvise: introduce MADV_POPULATE_(READ|WRITE) to prefault/prealloc memory

From: David Hildenbrand
Date: Mon Mar 15 2021 - 09:28:22 EST


On 15.03.21 14:03, Kirill A. Shutemov wrote:
> On Mon, Mar 15, 2021 at 01:25:40PM +0100, David Hildenbrand wrote:
>> On 15.03.21 13:22, Kirill A. Shutemov wrote:
>>> On Mon, Mar 08, 2021 at 05:45:20PM +0100, David Hildenbrand wrote:
>>>> + case -EHWPOISON: /* Skip over any poisoned pages. */
>>>> + start += PAGE_SIZE;
>>>> + continue;
>>>
>>> Why is it a good approach? It's not obvious to me.
>>
>> My main motivation was to simplify return code handling. I don't want to
>> return -EHWPOISON to user space
>
> Why? Hiding the problem under the rug doesn't help anybody. SIGBUS later
> is not better than an error upfront.

Well, if you think about "prefaulting page tables", the first intuition is certainly not to check for poisoned pages, right? After all, you are not actually accessing memory; you are allocating memory if required and filling page tables. OTOH, mlock() will also choke on poisoned pages.

With the current semantics, you can start and run a VM just fine. Preallocation/prefaulting succeeded after all. On access you will get a SIGBUS, from which e.g., QEMU can recover by injecting an MCE into the guest, just as if you had hit a poisoned page later.
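
To make the user space side of this concrete, here is a rough sketch of how a caller might preallocate a writable range. It is only an illustration under assumptions: prefault_writable() is a hypothetical helper name, the MADV_POPULATE_WRITE value of 23 is the one from the uapi header of kernels that later merged the feature (it may not match this RFC), and the manual page-touching fallback is my own addition for kernels that reject the advice with EINVAL.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Assumption: value taken from the uapi header of kernels that later
 * merged the feature; the RFC under discussion may use another value. */
#ifndef MADV_POPULATE_WRITE
#define MADV_POPULATE_WRITE 23
#endif

/*
 * Hypothetical helper: prefault [addr, addr + len) writable.
 * Returns 0 on success, -1 with errno set on failure.
 */
int prefault_writable(void *addr, size_t len)
{
    assert(addr != NULL && len > 0);

    /* Ask the kernel to allocate memory and fill page tables. */
    if (madvise(addr, len, MADV_POPULATE_WRITE) == 0)
        return 0;

    /* Anything but EINVAL (e.g. ENOMEM) is a real failure. */
    if (errno != EINVAL)
        return -1;

    /* The kernel does not know the advice: touch each page by hand. */
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;

    volatile unsigned char *p = addr;
    for (size_t off = 0; off < len; off += (size_t)page)
        p[off] = p[off]; /* read + write back, contents stay unchanged */

    return 0;
}
```

With the skip-over-poison behaviour quoted above, a successful return from the madvise() call would not imply that the range is free of poisoned pages; the caller would still need a SIGBUS handler for later accesses. Note that the manual fallback touches memory directly, so it can itself raise SIGBUS.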

The problem we are talking about is most probably very rare, especially when using MADV_POPULATE_ for actual preallocation.

I don't have a strong opinion; not bailing out on poisoned pages felt like the right thing to do.

--
Thanks,

David / dhildenb