Re: [PATCH v1] mm/hwpoison: convert TTU_IGNORE_HWPOISON to TTU_HWPOISON

From: Matthew Wilcox
Date: Tue Feb 21 2023 - 08:36:06 EST


On Tue, Feb 21, 2023 at 05:59:05PM +0900, Naoya Horiguchi wrote:
> After a memory error happens on a clean folio, a process unexpectedly
> receives SIGBUS when it accesses the error page. This SIGBUS killing
> is pointless and simply degrades the level of RAS of the system, because
> the clean folio can be dropped without any data loss in memory error
> handling, as we do for clean pagecache.
>
> When memory_failure() is called on a clean folio, try_to_unmap() is called
> twice (once from split_huge_page() and once from hwpoison_user_mappings()).
> The root cause of the issue is that the pte conversion to a hwpoison entry
> is now done in the first call of try_to_unmap() because PageHWPoison is
> already set at this point, while it's actually expected to be done in the
> second call. This behavior disturbs error handling operations such as
> removing the pagecache, which results in the malfunction described above.
>
> So convert TTU_IGNORE_HWPOISON into TTU_HWPOISON and set TTU_HWPOISON only
> when we really intend to convert the pte to a hwpoison entry. This can
> prevent other callers of try_to_unmap() from accidentally converting to
> hwpoison entries.
>
> Fixes: a42634a6c07d ("readahead: Use a folio in read_pages()")

How did you choose this Fixes tag?